Local AI Calculator

Select your hardware to see which models run locally. No dedicated GPU? No problem—Ollie also works flawlessly with your favorite cloud APIs.

Select Your GPU
🖥️

Select a GPU above to see which models you can run.

☁️

Prefer Cloud Models?

Ollie isn't just for local AI. You can connect your API keys for GPT-4o, Claude 3.5 Sonnet, and Gemini Pro to write code and edit media without using a single drop of local VRAM.

📝 How This Works

VRAM estimates are based on Q4_K_M quantization, the most common format for running LLMs locally via Ollama. Actual usage may vary with context length, system overhead, and concurrent applications. Apple Silicon uses unified memory, so most of the system RAM can be used for model loading (macOS reserves a portion for the system itself). Models marked "Tight fit" will run but may be slow with long conversations.
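
For the curious, here is a rough sketch of how an estimate like this can be computed. It is a generic rule of thumb, not Ollie's exact formula: Q4_K_M stores weights at roughly 4.85 bits each, so the weight footprint in GB is about the parameter count in billions times 4.85 divided by 8, plus some headroom for the KV cache and runtime buffers.

```python
# Rough VRAM estimator for Q4_K_M-quantized models (illustrative sketch only,
# not Ollie's internal calculation).

def estimate_vram_gb(params_billions: float, overhead_gb: float = 1.5) -> float:
    bits_per_weight = 4.85                              # approximate average for Q4_K_M GGUF
    weight_gb = params_billions * bits_per_weight / 8   # billions of params -> GB of weights
    return weight_gb + overhead_gb                      # add KV cache / runtime headroom

# Example: an 8B model is roughly 5 GB of weights plus overhead,
# so it fits on an 8 GB GPU at short-to-medium context lengths.
for size in (7, 8, 13, 34, 70):
    print(f"{size}B model: ~{estimate_vram_gb(size):.1f} GB")
```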

Run AI Locally with Ollie

Connect Ollama, Gemini, OpenAI, and more — all from one sovereign, private AI suite.

Download Ollie