Select your hardware to see which models run locally. No dedicated GPU? No problem—Ollie also works flawlessly with your favorite cloud APIs.
VRAM estimates are based on Q4_K_M quantization, the most common format for running LLMs locally via Ollama. Actual usage varies with context length, system overhead, and concurrent applications. Apple Silicon uses unified memory, so models load into the shared system RAM pool, though macOS reserves a portion of it for the OS and other processes. Models marked "Tight fit" will run but may be slow with long conversations.
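For the curious, the sketch below shows one way such an estimate can be derived. The constants (roughly 4.85 bits per weight for Q4_K_M, an fp16 KV cache, a fixed overhead) and the estimate_vram_gb helper are illustrative assumptions, not Ollie's actual implementation.

```python
# A rough sketch of a Q4_K_M VRAM estimate. All constants here are
# assumptions for illustration, not values taken from Ollie.

def estimate_vram_gb(params_billion: float,
                     context_tokens: int = 4096,
                     layers: int = 32,
                     kv_heads: int = 8,
                     head_dim: int = 128) -> float:
    """Approximate VRAM needed to run a Q4_K_M-quantized model."""
    BITS_PER_WEIGHT = 4.85  # Q4_K_M averages ~4.85 bits/weight
    weights_gb = params_billion * 1e9 * BITS_PER_WEIGHT / 8 / 1024**3

    # KV cache: 2 tensors (K and V) x layers x kv_heads x head_dim,
    # 2 bytes per element (fp16), per token of context.
    kv_gb = 2 * layers * kv_heads * head_dim * 2 * context_tokens / 1024**3

    overhead_gb = 0.75  # runtime buffers and scratch space (assumed)
    return weights_gb + kv_gb + overhead_gb

# e.g. an 8B model with a 4k-token context lands near 6 GB
print(f"{estimate_vram_gb(8):.1f} GB")
```

Longer contexts grow the KV-cache term linearly, which is why a model that fits comfortably at a short context can become a "Tight fit" in long conversations.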
Connect Ollama, Gemini, OpenAI, and more — all from one sovereign, private AI suite.
Download Ollie