# La Perf

## What is La Perf?
La Perf is an open-source benchmark suite designed to help you make informed hardware decisions for local AI workloads.
Whether you're an AI/ML engineer running workloads locally or an AI enthusiast looking to understand real-world device performance, La Perf provides:
- Reproducible benchmarks across different hardware (M4 Max, RTX 4060, A100, etc.)
- Real-world workloads (embeddings, LLM inference, VLM tasks, power monitoring)
- Transparent metrics with detailed methodology documentation
- Community-driven results to help you compare before you buy
## Why La Perf?
The goal of this project is to create an all-in-one source of information you need before buying your next laptop or PC for local AI tasks.
**Philosophy:** We believe in honest, reproducible benchmarks that reflect real-world performance, not synthetic marketing numbers.
## Features

### Supported Benchmarks
**Text embeddings** via sentence-transformers
- Models: modernbert-embed-base
- Dataset: IMDB (3000 samples)
- Metrics: RPS (Rows Per Second), E2E Latency
**LLM inference** via LM Studio and Ollama
- Models: gpt-oss-20b
- Dataset: Awesome ChatGPT Prompts
- Metrics: TPS (Tokens Per Second), TTFT (Time To First Token), Token Generation Time, E2E Latency
**Vision-Language Model inference** via LM Studio and Ollama
- Models: Qwen3-VL-8B
- Dataset: Hallucination_COCO
- Metrics: TPS (Tokens Per Second), TTFT (Time To First Token), Token Generation Time, E2E Latency
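As a rough illustration of how these streaming metrics relate to each other, the sketch below derives TTFT, token generation time, TPS, and E2E latency from a request start time and per-token arrival timestamps. The `streaming_metrics` helper and its input shape are illustrative assumptions, not La Perf's actual implementation:

```python
def streaming_metrics(start: float, token_times: list[float]) -> dict:
    """Derive latency metrics from a request start time and a list of
    per-token arrival timestamps (hypothetical data shape)."""
    ttft = token_times[0] - start                # Time To First Token
    e2e = token_times[-1] - start                # end-to-end latency
    gen_time = token_times[-1] - token_times[0]  # token generation time
    # TPS counts inter-token intervals, so the first token is excluded
    tps = (len(token_times) - 1) / gen_time if gen_time > 0 else 0.0
    return {"ttft": ttft, "tps": tps, "gen_time": gen_time, "e2e": e2e}

# Synthetic example: 5 tokens, first after 0.5 s, then one every 0.1 s
m = streaming_metrics(100.0, [100.5, 100.6, 100.7, 100.8, 100.9])
# → TTFT 0.5 s, generation time 0.4 s, TPS 10.0, E2E 0.9 s
```

The same arithmetic applies to both the LLM and VLM benchmarks, since both stream tokens.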
### On-device Metrics
Real-time power and resource monitoring
- CPU/GPU usage
- Memory consumption (RAM, VRAM)
- GPU power draw
- Battery drain (laptops)
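Real-time monitoring of this kind boils down to polling sensor readers at a fixed interval and summarizing the samples. The sketch below shows that pattern with a pluggable reader callable; the reader names and summary shape are assumptions for illustration, not La Perf's actual monitor:

```python
import time
from statistics import mean

def sample(reader, interval_s: float = 0.05, n: int = 5) -> dict:
    """Poll a metric reader at a fixed interval and summarize the readings.
    `reader` is any zero-arg callable returning a float (e.g. GPU watts,
    RAM in GB); real sensor readers would wrap vendor tools or OS APIs."""
    readings = []
    for _ in range(n):
        readings.append(reader())
        time.sleep(interval_s)
    return {"min": min(readings), "max": max(readings), "avg": mean(readings)}

# Usage with a stand-in power reader that always reports 12.5 W
stats = sample(lambda: 12.5, interval_s=0.0, n=3)
```

Keeping the reader pluggable lets the same loop cover CPU usage, VRAM, power draw, and battery level.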
## Quick Links

- **Quick Start**: Get up and running in minutes
- **View Results**: Compare benchmark results across devices
- **Metrics**: Understand how we measure performance
- **Contribute**: Help improve La Perf or submit your results
## Supported Hardware
La Perf automatically detects and optimizes for:
- NVIDIA GPUs (CUDA)
- AMD GPUs (ROCm)
- Apple Silicon (MPS/MLX)
- Intel GPUs
- CPU fallback (all platforms)
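One way such detection can work is to probe for well-known vendor tools and platform traits, falling back to CPU when nothing matches. This is a stdlib-only sketch of that idea, not La Perf's actual detection logic (an Intel GPU probe would slot in similarly):

```python
import platform
import shutil
import sys

def detect_backend() -> str:
    """Heuristic backend selection: check for vendor management tools
    and platform traits, else fall back to the portable CPU path."""
    if shutil.which("nvidia-smi"):
        return "cuda"   # NVIDIA driver stack present
    if shutil.which("rocm-smi"):
        return "rocm"   # AMD ROCm stack present
    if sys.platform == "darwin" and platform.machine() == "arm64":
        return "mps"    # Apple Silicon
    return "cpu"        # works everywhere
```

In practice a framework-level check (e.g. asking the ML runtime which accelerators it can see) is more reliable than probing CLIs, but the fallback ordering is the same.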
## Platform Support
Compatible with Linux, macOS, and Windows.
**Recommended Setup**
- RAM: 8 GB for embeddings, 18+ GB for LLM/VLM benchmarks
- GPU: Highly recommended for optimal performance
- Tools: Enable full GPU offload in LM Studio/Ollama
## Community

Join the discussion, share your results, and help improve La Perf.
## Citation

If you use La Perf in your research or reports, please cite it as follows:

Minko B. (2025). LaPerf: Local AI Performance Benchmark Suite. GitHub repository. Available at: https://github.com/bogdan01m/laperf. Licensed under the Apache License, Version 2.0.
BibTeX:
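An entry matching the citation above might look like this (the entry key and field layout are our choice; the bibliographic details come from the citation itself):

```bibtex
@misc{minko2025laperf,
  author       = {Minko, B.},
  title        = {LaPerf: Local AI Performance Benchmark Suite},
  year         = {2025},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/bogdan01m/laperf}},
  note         = {Apache License, Version 2.0}
}
```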