
Installation#

Detailed installation instructions for La Perf across different platforms.


System Requirements#

Minimum Requirements#

  • Python: 3.12 or higher
  • RAM: 8 GB (embeddings), 16 GB (LLM), 18 GB (VLM)
  • Disk Space: ~100 GB free for models and datasets
  • OS: Linux, macOS, or Windows

Recommended#

  • GPU: NVIDIA (CUDA), AMD (ROCm), or Apple Silicon (MPS)
  • RAM: 24 GB+ for comfortable multitasking
  • SSD: Fast storage for quicker dataset loading
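
To sanity-check these basics before installing, standard shell commands are enough (output format varies by OS):

# Check the Python version on your PATH
python3 --version

# Check free disk space on the current filesystem
df -h .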

Installing uv#

La Perf uses uv as its package manager.

macOS / Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows (PowerShell):

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Any platform (via pip):

pip install uv

Verify installation:

uv --version

Why uv?

La Perf uses uv for fast, reliable dependency management. It's significantly faster than pip and handles environment isolation automatically.
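
For reference, these are the uv subcommands used throughout this guide (all part of uv's standard CLI):

# Install the project's locked dependencies into a managed virtual environment
uv sync

# Run a command inside that environment without activating it manually
uv run python main.py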


Installing La Perf#

1. Clone the repository#

git clone https://github.com/bogdanminko/laperf.git
cd laperf

2. Install dependencies#

For benchmarking only#

uv sync

For development#

uv sync --group quality --group dev

This installs additional tools:

  • ruff - Fast Python linter
  • mypy - Type checker
  • bandit - Security scanner
  • pre-commit - Git hooks
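
If you want the Git hooks active in your clone, enable them with pre-commit's standard setup command (the exact hooks run depend on the repository's .pre-commit-config.yaml):

uv run pre-commit install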

3. Verify installation#

uv run python -c "import torch; print(torch.__version__)"

LM Studio Setup#

For LLM/VLM benchmarks, install LM Studio:

1. Download LM Studio#

Visit lmstudio.ai and download for your platform.

2. Load a model#

The easiest way to find and download models is through the LM Studio UI.

Load an LLM

Search for gpt-oss-20b in the available models and pick the build for your platform:

  • MLX (Apple Silicon): mlx-community/gpt-oss-20b-MXFP4-Q8
  • GGUF (all other platforms): lmstudio-community/gpt-oss-20b-GGUF

Load a VLM

Search for Qwen3-VL-8B-Instruct in the available models and pick the build for your platform:

  • MLX (Apple Silicon): lmstudio-community/Qwen3-VL-8B-Instruct-MLX-4bit
  • GGUF (all other platforms): lmstudio-community/Qwen3-VL-8B-Instruct-GGUF-Q4_K_M

3. Start the server#

  1. Click "Developer" tab
  2. Click "Start Server"
  3. Verify it's running on http://localhost:1234
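
To confirm the server is reachable, you can query LM Studio's OpenAI-compatible API (default port shown; adjust it if you changed the server settings):

curl http://localhost:1234/v1/models

This should return a JSON list of the models you have loaded.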

Ollama Setup#

For LLM/VLM benchmarks, you can also use Ollama:

1. Install Ollama#

macOS (Homebrew):

brew install ollama

Linux:

curl -fsSL https://ollama.com/install.sh | sh

Windows:

Download the installer from ollama.com.

2. Pull a model#

Pull the LLM:

ollama pull gpt-oss:20b

Pull the VLM:

ollama pull qwen3-vl:8b
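
To verify the pulls succeeded, list the models Ollama has locally:

ollama list

Both models should appear in the output. By default, Ollama serves its API on http://localhost:11434.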


Verifying Your Setup#

Run a quick test to ensure everything works:

Using make:

make bench

Using uv:

uv run python main.py

This will:

  1. Auto-detect your hardware (CUDA / MPS / CPU)
  2. Run all available benchmarks (all are pre-selected — you can toggle individual ones in the TUI using Space)
  3. Save the results to results/report_{your_device}.json
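
Once a run finishes, you can inspect the report with standard tools (the exact file name depends on your detected device):

# List generated reports
ls results/

# Pretty-print a report (substitute your actual file name)
python -m json.tool results/report_<your_device>.json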

Hardware Detection

La Perf automatically detects your GPU and optimizes accordingly. No manual configuration needed!
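
As an illustration of the idea (not La Perf's actual code), device detection with PyTorch's public API looks roughly like this:

uv run python - <<'PY'
import torch

# Prefer CUDA, then Apple's Metal backend (MPS), then fall back to CPU
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"Detected device: {device}")
PY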


Troubleshooting#

uv command not found#

After installing uv, restart your terminal or run:

source ~/.bashrc  # or ~/.zshrc on macOS
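
If the command is still missing, the installer typically places uv in ~/.local/bin (the location can vary by installer version); add it to PATH manually:

export PATH="$HOME/.local/bin:$PATH"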

Python version mismatch#

Ensure you're using Python 3.12+:

uv run python --version
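
If the version is too old, uv can install and pin a suitable interpreter for the project (uv python install and uv python pin are standard uv subcommands):

uv python install 3.12
uv python pin 3.12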

CUDA not detected#
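
If the benchmark falls back to CPU on an NVIDIA machine, first confirm the driver sees the GPU, then check whether your PyTorch build has CUDA support:

# Driver-level check
nvidia-smi

# PyTorch-level check
uv run python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"

If this prints False, you likely have a CPU-only PyTorch build or a driver/toolkit mismatch; reinstalling PyTorch with CUDA support usually resolves it.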


Next Steps#