Quickstart

Three ways to get FreeLLM running. Pick whichever fits your workflow.

Option A: One-click cloud deploy

The fastest path. Click a button, sign in, paste your provider API keys, done.

Your gateway is live in under 2 minutes. No clone, no Docker, no terminal.

Option B: Docker

Pull the prebuilt multi-arch image from GHCR (no clone needed):

```shell
docker run -d -p 3000:3000 \
  -e GROQ_API_KEY=gsk_... \
  -e GEMINI_API_KEY=AI... \
  -e MISTRAL_API_KEY=... \
  -e CEREBRAS_API_KEY=... \
  -e NVIDIA_NIM_API_KEY=nvapi-... \
  --name freellm \
  ghcr.io/devansh-365/freellm:latest
```

Or clone and use docker-compose:

```shell
git clone https://github.com/Devansh-365/freellm.git
cd freellm
cp .env.example .env  # add your API keys
docker compose up
```

Option C: Local development

```shell
git clone https://github.com/Devansh-365/freellm.git
cd freellm
pnpm install
cp .env.example .env
pnpm dev
```

The API runs on http://localhost:3000 and the dashboard on http://localhost:5173.
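To sanity-check the gateway without an SDK, you can build the request by hand. A minimal sketch, assuming the standard OpenAI-compatible `/v1/chat/completions` path (pair the resulting URL and body with curl or urllib once the server is up):

```python
import json

API_BASE = "http://localhost:3000/v1"  # the gateway URL from this guide

def chat_request(model: str, prompt: str) -> tuple[str, str]:
    """Build the URL and JSON body for an OpenAI-compatible chat call."""
    url = f"{API_BASE}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body)

url, body = chat_request("free-fast", "Hello!")
print(url)  # http://localhost:3000/v1/chat/completions
```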

Get free API keys

Sign up for any combination of these (more keys = more capacity):

| Provider | Where to get a key |
| --- | --- |
| Groq | console.groq.com |
| Gemini | aistudio.google.com |
| Mistral | console.mistral.ai |
| Cerebras | cloud.cerebras.ai |
| NVIDIA NIM | build.nvidia.com |
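Each key is passed to the gateway as an environment variable, using the names shown in the Docker command above. A small sketch, assuming those variable names, to check which providers your current environment has configured:

```python
import os

# Env var names taken from the docker run example above
PROVIDER_KEYS = {
    "Groq": "GROQ_API_KEY",
    "Gemini": "GEMINI_API_KEY",
    "Mistral": "MISTRAL_API_KEY",
    "Cerebras": "CEREBRAS_API_KEY",
    "NVIDIA NIM": "NVIDIA_NIM_API_KEY",
}

def configured_providers(env=os.environ):
    """Return the providers whose API key is set and non-empty."""
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

print("Configured:", configured_providers())
```

Any subset works; the gateway simply has more upstream capacity with each additional key.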

Make your first request

Once your gateway is running, point any OpenAI-compatible SDK at it:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="unused",
)
response = client.chat.completions.create(
    model="free-fast",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
print("Provider used:", response.x_freellm_provider)
```

The x_freellm_provider field on the response tells you which upstream provider handled the request. Useful for debugging routing decisions.
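If you call the API over raw HTTP instead of the SDK, the same information appears as a top-level field in the JSON body. A sketch of reading it, assuming the standard OpenAI response shape plus the `x_freellm_provider` field this guide describes (the sample body here is illustrative, not real output):

```python
import json

# A trimmed example response body: standard OpenAI chat-completion shape
# plus the x_freellm_provider field described above.
raw = json.dumps({
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
    "x_freellm_provider": "groq",
})

data = json.loads(raw)
print(data["choices"][0]["message"]["content"])   # Hi!
print("Provider used:", data.get("x_freellm_provider"))  # Provider used: groq
```

Using `.get()` keeps the lookup safe if you ever point the same code at a plain OpenAI endpoint, where the field is absent.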