Question 1

What is FreeLLM?

Accepted Answer

FreeLLM is an open-source, self-hosted API gateway that routes LLM requests across 8 free-tier providers including Groq, Gemini, Mistral, Cerebras, NVIDIA NIM, Cloudflare Workers AI, GitHub Models, and Ollama. It exposes a single OpenAI-compatible endpoint so you can use any OpenAI SDK without changing your code.

Question 2

Is FreeLLM really free?

Accepted Answer

Yes. FreeLLM uses the free tier of each provider. All 8 providers offer free API access. With 3 keys per provider you get roughly 450 free requests per minute. You pay nothing.

Question 3

Can I use FreeLLM with the OpenAI Python or Node.js SDK?

Accepted Answer

Yes. Change your base URL to your FreeLLM instance and keep your existing code. FreeLLM is fully OpenAI API-compatible.

Question 4

What happens when a provider rate-limits?

Accepted Answer

FreeLLM automatically routes to the next available provider. If Groq hits its rate limit, the next request goes to Gemini, then Mistral, then others. You never see a 429 error.

Question 5

Can I self-host FreeLLM?

Accepted Answer

Yes. FreeLLM is MIT-licensed and designed for self-hosting. Deploy to Railway or Render in under 2 minutes using the one-click deploy buttons.

Question 6

What LLM models does FreeLLM support?

Accepted Answer

FreeLLM supports 32+ models including Llama 3.3 70B via Groq, Gemini 2.5 Pro and Flash via Google, Mistral Small and Medium, Cerebras Llama and Qwen3, NVIDIA NIM models, and local models via Ollama.

Feature	FreeLLM	LiteLLM	OpenRouter	Portkey
Truly $0 (no markup, no subscription)	✓	Self-host	—	—
Multi-key rotation per provider	✓	—	—	—
OpenAI-compatible	✓	✓	✓	✓
Automatic failover	✓	✓	✓	✓
Built-in real-time dashboard	✓	—	✓	✓
Response caching (zero quota burn)	✓	Plugin	—	✓
Per-provider token tracking	✓	✓	✓	✓
Circuit breakers	✓	Partial	✓	✓
Self-hosted	✓	✓	—	Both
TypeScript codebase (auditable)	✓	—	?	—
One-click cloud deploy	✓	—	n/a	—

You shouldn't need a credit card
to call an LLM.

Everything you need to run LLMs for free

Drop-in OpenAI SDK

Automatic failover

Multi-key rotation

Token tracking

Circuit breakers

Three meta-models

Real-time dashboard

Truly $0

Response caching

Stitched into one endpoint

Groq

Gemini

Mistral

Cerebras

NVIDIA NIM

GitHub Models

Cloudflare AI

Ollama

Change one line. Keep your code.

The free-tier-first LLM gateway

Worst case: you delete it.

You shouldn't need a credit cardto call an LLM.