
Quick Start#

Get started with La Perf in just a few minutes!


Prerequisites#

Before running La Perf, ensure you have:

  • uv package manager (install command below)
  • Python 3.12+ - uv will install it automatically
  • Ollama - for LLM and VLM inference (optional)
  • LM Studio - for LLM and VLM inference (optional)
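
If you don't have uv yet, the official standalone installer from the uv documentation works on macOS and Linux (Windows users can use the PowerShell installer instead):

curl -LsSf https://astral.sh/uv/install.sh | sh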

Why uv?

La Perf uses uv for fast, reliable dependency management. It's significantly faster than pip and handles environment isolation automatically.


Installation#

1. Clone the repository#

git clone https://github.com/bogdanminko/laperf.git
cd laperf

2. (Optional) Configure environment variables#

La Perf works out of the box with default settings, but you can customize it:

cp .env.example .env
# Edit .env to customize settings

Common customizations (a combined example follows this list):

  • Change provider URLs - Use different OpenAI-compatible providers (vLLM, TGI, LocalAI)
  • Adjust dataset sizes - Change LLM_DATA_SIZE, VLM_DATA_SIZE, EMBEDDING_DATA_SIZE
  • Select backends - Use LM_STUDIO, OLLAMA, or BOTH for benchmarking
  • Customize models - Set different model names for your provider
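
For example, a .env that benchmarks both backends with explicit dataset sizes might look like this (variable names come from the list above; the values are illustrative, not recommendations):

LLM_BACKEND=BOTH
LLM_DATA_SIZE=10
VLM_DATA_SIZE=10
EMBEDDING_DATA_SIZE=3000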

Using a custom provider

To use vLLM or another OpenAI-compatible provider, point the LM Studio backend at its URL:

# In your .env file:
LLM_BACKEND=LM_STUDIO
LMS_LLM_BASE_URL=http://localhost:8000/v1
LMS_LLM_MODEL_NAME=Qwen/Qwen3-30B-Instruct
LLM_API_KEY=your-api-key-if-needed
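
The base URL above matches vLLM's default OpenAI-compatible server. As a sketch, assuming vLLM is installed, a matching server could be started with:

vllm serve Qwen/Qwen3-30B-Instruct --port 8000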

3. Install dependencies (optional)#

uv sync

This step is optional because uv run syncs the environment automatically on first use. Running uv sync explicitly will:

  • Create a virtual environment
  • Install all required dependencies
  • Set up the project for immediate use

Running Your First Benchmark#

Run all benchmarks#

Using make

make bench

Using uv

uv run python main.py

This will:

  1. Auto-detect your hardware (CUDA / MPS / CPU)
  2. Run all available benchmarks (all are pre-selected — you can toggle individual ones in the TUI using Space)
  3. Save the results to results/report_{your_device}.json

Hardware Detection

La Perf automatically detects your GPU and optimizes accordingly. No manual configuration needed!
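
If you want to check which accelerator will be picked up, a quick manual probe (assuming PyTorch is used for detection, as the CUDA / MPS / CPU options suggest) is:

uv run python -c "import torch; print('cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu')"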

Understanding Results#

After running benchmarks, you'll find:

  • JSON results in results/report_{device}.json
  • Plots in results/plots/
  • Summary tables in the terminal
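
For example, you can list these artifacts from the shell after a run:

ls results/            # report_{device}.json
ls results/plots/      # generated plots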

Generate Markdown Tables#

Run

make

or

make generate

This processes JSON results and generates markdown tables for the README.


Troubleshooting#

Out of memory#

If you encounter out-of-memory errors, create a .env file and adjust these settings:

cp .env.example .env

Then edit .env to reduce resource usage (a combined example follows this list):

  • Reduce batch size: EMBEDDING_BATCH_SIZE=16 (default: 32)
  • Reduce dataset size: EMBEDDING_DATA_SIZE=1000 (default: 3000)
  • Reduce LLM/VLM samples: LLM_DATA_SIZE=5 or VLM_DATA_SIZE=5 (default: 10)
  • Close other GPU-intensive applications
  • Use CPU mode for testing (slower but works)
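
Put together, a memory-friendly .env might contain (values taken from the suggestions above):

EMBEDDING_BATCH_SIZE=16
EMBEDDING_DATA_SIZE=1000
LLM_DATA_SIZE=5
VLM_DATA_SIZE=5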

Get Help#

Need help? Open an issue on the La Perf GitHub repository.