## Providers
| Provider | Free Tier (per key) | Models | Get a Key |
|---|---|---|---|
| Groq | ~30 req/min | Llama 3.3 70B, Llama 3.1 8B, Llama 4 Scout, Qwen3 32B | console.groq.com |
| Gemini | ~15 req/min | Gemini 2.5 Flash, 2.5 Pro, 2.0 Flash, 2.0 Flash Lite | aistudio.google.com |
| Mistral | ~5 req/min | Mistral Small, Medium, Nemo | console.mistral.ai |
| Cerebras | ~30 req/min | Llama 3.1 8B, Qwen3 235B, GPT-OSS 120B | cloud.cerebras.ai |
| NVIDIA NIM | ~40 req/min | Llama 3.3 70B, Llama 3.1 405B, Nemotron 70B, DeepSeek R1 | build.nvidia.com |
| Ollama | Unlimited (local) | Any local model | ollama.com |
Combined free capacity: ~120 req/min with one key per provider, or ~360 req/min with three keys per provider via Multi-Key Rotation. All at $0.
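The multiplication works because rate limits are enforced per key: cycling requests across several keys for the same provider multiplies your effective throughput. A minimal sketch of round-robin key rotation, assuming keys arrive as a comma-separated env var (the helper names `load_keys` and `make_rotation` are illustrative, not part of any real API):

```python
import itertools
import os

def load_keys(env_var: str) -> list[str]:
    """Parse a comma-separated list of API keys from an env var."""
    raw = os.environ.get(env_var, "")
    return [k.strip() for k in raw.split(",") if k.strip()]

def make_rotation(keys: list[str]):
    """Round-robin iterator: each call to next() yields the next key,
    spreading requests evenly so each key stays under its own limit."""
    return itertools.cycle(keys)

# Example: three Groq keys -> ~3x the per-key request budget
os.environ["GROQ_API_KEY"] = "gsk_a,gsk_b,gsk_c"
rotation = make_rotation(load_keys("GROQ_API_KEY"))
```

Each outgoing request would then grab `next(rotation)` as its bearer token.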
## Adding API keys
Set the corresponding environment variable. Each accepts a single key or a comma-separated list:
```
GROQ_API_KEY=gsk_...,gsk_another,gsk_third   # 3× the Groq capacity
GEMINI_API_KEY=AI...
MISTRAL_API_KEY=...
CEREBRAS_API_KEY=csk_...
NVIDIA_NIM_API_KEY=nvapi-...
OLLAMA_BASE_URL=http://localhost:11434
```

Only the providers with valid keys are enabled at runtime. You don't need to provide all six.
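The runtime check that enables only configured providers can be sketched as follows. This is a hypothetical illustration, assuming each provider maps to exactly one env var (the `PROVIDER_ENV_VARS` table and `enabled_providers` helper are assumptions, not names from the actual codebase):

```python
import os

# Each provider is gated by a single env var; an unset or blank
# variable means the provider is skipped at startup.
PROVIDER_ENV_VARS = {
    "groq": "GROQ_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "cerebras": "CEREBRAS_API_KEY",
    "nvidia_nim": "NVIDIA_NIM_API_KEY",
    "ollama": "OLLAMA_BASE_URL",
}

def enabled_providers(env=os.environ) -> list[str]:
    """Return the providers whose env var is set to a non-blank value."""
    return [name for name, var in PROVIDER_ENV_VARS.items()
            if env.get(var, "").strip()]
```

With only a Groq key and a local Ollama URL set, `enabled_providers` would report just those two; the other four are silently disabled.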