🌐 litegen.ai · 📖 Docs, SDKs & API reference → visgotti.github.io/litegen
The universal proxy for AI image and video generation. Like LiteLLM, but for multimedia.
LiteGen provides a unified API gateway for all major image and video generation providers. Route requests across OpenAI DALL-E, Stability AI, Replicate, Google Imagen, Fal.ai, Runway, Luma, and more — with automatic fallback, weighted load balancing, caching, cost tracking, and a real-time dashboard.
- Unified API — OpenAI-compatible REST endpoints for image & video generation
- 10+ Providers — OpenAI, Stability, Replicate, Google, Fal, Runway, Luma (more coming)
- Smart Routing — Fallback chains, weighted round-robin, lowest-cost, lowest-latency
- Caching — In-memory cache with configurable TTL to avoid duplicate generations
- Cost Tracking — Per-request cost estimation and aggregate spend analytics
- API Key Management — Create/revoke keys, weighted key pools for rate limit distribution
- Observability — Request logging, Prometheus metrics, structured tracing
- React Dashboard — Real-time monitoring, provider health, request logs, cost charts
- Easy Config — YAML config + env vars, auto-discovers provider keys
- Self-Hosted — Single binary + Docker, SQLite or Postgres
- Capability registry with per-model schemas — YAML-driven model declarations define allowed parameters, sizes, aspect ratios, and reference-image roles; unknown params are rejected (strict mode) or silently dropped (lax mode).
reference_imagestagged-union — Send reference images as URLs, base64 blobs, or multipart file uploads using{type: "url"|"base64"|"blob", value: "..."}.- Strict validation with per-request lax opt-out — Default
strict: truerejects unsupported parameters; setstrict: falseto drop unknowns and get anX-Litegen-Dropped-Paramsresponse header. - DB-backed multi-key auth: scopes, quotas, RPM rate limits — Create API keys with per-key USD budget caps, requests-per-minute throttles, and CSV scope sets (
generate,read,admin). - DB-backed video generations + polling endpoint — Video jobs are persisted to the
generationstable; poll status viaGET /v1/generations/{id}. A background poller updates status every 5 s. - Per-key webhook delivery on video completion — When a generation reaches a terminal status (
completed,failed,cancelled), the owning key'swebhook_urlreceives a signedPOSTwith the generation payload (X-Litegen-Signature: sha256=<hex>). Retries 3 times with exponential back-off. - OpenTelemetry OTLP gRPC export — Set
OTEL_EXPORTER_OTLP_ENDPOINTto ship traces to any OTel collector. - Prometheus
/metrics— Expose Prometheus-format metrics atGET /metrics. - Dashboard CRUD UI for keys + model schema panel — Create, update, and revoke API keys from the React dashboard; inspect full model capability schemas in the Models panel.
The minimum required env vars:
# Port the proxy listens on (default: 4000)
export LITEGEN__SERVER__PORT=4000
# Master API key — if set, all requests need Authorization: Bearer <key>
# Omit to run in dev mode (no auth)
export LITEGEN__MASTER_KEY=your-secret-master-key
# Database URL (SQLite for dev, Postgres for prod)
export LITEGEN__DATABASE_URL=sqlite://litegen.db
# Directory containing provider YAML model definitions (default: ./models)
export LITEGEN_MODELS_DIR=./models
# Provider API keys (as needed)
export OPENAI_API_KEY=sk-...
export RUNWAY_API_KEY=...Then start the backend and dashboard:
# Backend
cd litegen-core
cargo build --release
./target/release/litegen
# Dashboard (separate terminal)
cd dashboard
npm install
npm run dev# Clone and configure
git clone https://github.com/litegen/litegen.git
cd litegen
cp litegen.example.yaml litegen.yaml
# Edit litegen.yaml with your API keys
# Run
docker compose up -dOpen http://localhost:4000/health to verify, and http://localhost:5173 for the dashboard.
# Backend
cd litegen-core
cargo build --release
./target/release/litegen
# Dashboard (separate terminal)
cd dashboard
npm install
npm run devSet provider keys via env vars (no config file needed):
export OPENAI_API_KEY=sk-...
export STABILITY_API_KEY=sk-...
export REPLICATE_API_TOKEN=r8_...
export GOOGLE_API_KEY=...
export FAL_KEY=...
export RUNWAY_API_KEY=...
export LUMA_API_KEY=...curl -X POST http://localhost:4000/v1/images/generations \
-H "Content-Type: application/json" \
-d '{
"prompt": "a serene mountain landscape at sunset",
"model": "openai/dall-e-3",
"size": "1024x1024",
"quality": "hd"
}'curl -X POST http://localhost:4000/v1/videos/generations \
-H "Content-Type: application/json" \
-d '{
"prompt": "a timelapse of clouds over a city",
"model": "runway/gen-3",
"duration_seconds": 5
}'curl http://localhost:4000/v1/modelscurl http://localhost:4000/healthcurl http://localhost:4000/v1/statscurl "http://localhost:4000/v1/logs?page=1&per_page=50"Configure sophisticated routing in litegen.yaml:
model_routes:
# Fallback: try OpenAI first, then Stability
- model: "dall-e-3"
strategy: fallback
deployments:
- provider: openai
max_retries: 2
timeout_seconds: 120
- provider: stability
# Weighted: 75% Fal, 25% Replicate
- model: "fal/*"
strategy: weighted_round_robin
deployments:
- provider: fal
weight: 3
- provider: replicate
weight: 1
# Cheapest provider first
- model: "*"
strategy: lowest_cost
deployments:
- provider: openai
- provider: stability
- provider: replicate┌──────────────┐ ┌──────────────────────────────────────────┐
│ Clients │────▶│ LiteGen Proxy │
│ (REST API) │ │ │
└──────────────┘ │ ┌────────┐ ┌────────┐ ┌──────────┐ │
│ │ Router │─▶│ Cache │─▶│ Provider │ │
│ │ │ │ (moka) │ │ Registry │ │
│ └────────┘ └────────┘ └──────────┘ │
│ │ │
│ ┌──────┐ ┌──────────┐ ▼ │
│ │ DB │ │ Metrics │ ┌──────────┐ │
│ │(SQLx)│ │(Prom/OTel│ │ OpenAI │ │
│ └──────┘ └──────────┘ │ Stability│ │
│ │ Replicate│ │
│ ┌────────────────────┐ │ Google │ │
│ │ React Dashboard │ │ Fal │ │
│ │ (Vite + Recharts) │ │ Runway │ │
│ └────────────────────┘ │ Luma │ │
│ └──────────┘ │
└──────────────────────────────────────────┘
litegen/
├── litegen-core/ # Rust backend
│ ├── src/
│ │ ├── main.rs # Server entrypoint
│ │ ├── lib.rs # Library exports
│ │ ├── api/ # REST API handlers + auth middleware
│ │ ├── config/ # YAML + env config loading
│ │ ├── db/ # SQLite/Postgres persistence
│ │ ├── providers/ # Provider implementations
│ │ │ ├── image/ # OpenAI, Stability, Replicate, Google, Fal, Mock
│ │ │ └── video/ # OpenAI (Sora), Fal, Replicate, Runway, Luma, Mock
│ │ ├── proxy/ # Router, registry, cache
│ │ └── types/ # Shared types + OpenAPI schemas
│ ├── migrations/ # SQL migrations
│ └── tests/ # Integration tests
├── dashboard/ # React + Vite dashboard
│ └── src/pages/ # Overview, Logs, Models, Health, Keys
├── Dockerfile # Multi-stage build
├── docker-compose.yml # One-command deployment
└── litegen.example.yaml # Example configuration
Several vendors serve both image and video under one provider name — e.g. a single
openai provider routes openai/dall-e-3 (image) and openai/sora (video), and the same
holds for fal and replicate. The model ID prefix (<provider>/<model>) selects the
modality automatically.
| Provider | Image | Video | Models |
|---|---|---|---|
| OpenAI | ✅ | ✅ | DALL-E 2/3 · Sora 2 / Sora 2 Pro |
| Fal.ai | ✅ | ✅ | Flux, SDXL, Recraft · Kling, MiniMax, SVD, LTX |
| Replicate | ✅ | ✅ | Flux, SDXL, SD3 · AnimateDiff, SVD, Zeroscope |
| ✅ | 🔜 | Imagen 3, Gemini 2.5/3 image · Veo 3 video — public API, not yet wired | |
| Stability AI | ✅ | — | SD3.5, Core, Ultra, SDXL · (Stable Video hosted API discontinued) |
| Runway | 🔜 | ✅ | Gen-4 Image — public API, not yet wired · Gen-3 / Gen-4 video |
| Luma | 🔜 | ✅ | Photon image — public API, not yet wired · Dream Machine / Ray video |
Legend: ✅ supported in litegen · 🔜 vendor offers a public API for this modality but litegen hasn't wired it yet (see Roadmap: missing providers) · — no public API.
Gaps below are catalogued from a May 2026 audit of official vendor APIs. "Endpoint" is the upstream create/generate call a litegen provider would target.
| Gap | Modality | Upstream endpoint |
|---|---|---|
| Google Veo (biggest gap — Google is image-only today) | video | POST generativelanguage.googleapis.com/v1beta/models/{veo-3.x}:predictLongRunning |
| Luma Photon (Luma is video-only today) | image | POST api.lumalabs.ai/dream-machine/v1/generations/image (photon-1, photon-flash-1) |
| Runway image (Runway is video-only today) | image | POST api.dev.runwayml.com/v1/text_to_image (gen4_image, gen4_image_turbo) |
Stability Stable Video is intentionally omitted — Stability discontinued the hosted video API (self-host only), so there is no public endpoint to wire.
Both image + video:
- Amazon Bedrock — Nova Canvas / Titan (image) + Nova Reel (video).
bedrock-runtime.{region}.amazonaws.com(SigV4 auth). - MiniMax / Hailuo —
image-01(image) + Hailuo (video).api.minimax.io/v1/{image_generation,video_generation}. - Kling AI (Kuaishou) — kling image + video.
api.klingai.com/v1/{images/generations,videos/text2video}. - Leonardo.Ai — Phoenix (image) + Motion (video).
cloud.leonardo.ai/api/rest/v1. - ByteDance Seedance (BytePlus ModelArk / Volcengine Ark) — Seedream (image) + Seedance (video).
- Tencent Hunyuan — image + video.
aiart.tencentcloudapi.com.
Image only:
- Black Forest Labs (FLUX) —
api.bfl.ai/v1/{model}(flux-2-pro, flux-pro-1.1, flux-kontext). Direct FLUX, no Replicate/Fal hop. - Ideogram —
api.ideogram.ai/v1/ideogram-v3/generate(strong text rendering). - Recraft —
external.api.recraft.ai/v1/images/generations(vector/brand styles).
Video only:
- Vidu —
api.vidu.com/ent/v2/{text2video,img2video}. - Pixverse — public API (also reachable via Fal).
No first-party public API (reachable only indirectly / not integrable directly):
- Midjourney — image + video, but UI/Discord only; no REST API.
- Pika — video only; no first-party API (exposed via the Fal aggregator).
- Adobe Firefly — image + video via Firefly Services (enterprise/Adobe IMS auth), not a simple key.
LITEGEN_MASTER_KEY=your-secret-key docker compose up -dThe service starts on port 4000. The Postgres database is provisioned automatically; no manual migration step is needed.
To allow browser clients through CORS:
LITEGEN_MASTER_KEY=your-secret-key LITEGEN_CORS_ORIGINS=https://your-domain docker compose up -dIn single-tenant mode (the default), a request with no credentials is granted full admin scope (the local-dev convenience bypass). To prevent an unauthenticated admin API from being silently exposed, LiteGen runs a startup self-check: if it is bound to a non-loopback address (e.g. the default 0.0.0.0) in single-tenant mode with no real LITEGEN__MASTER_KEY (unset, empty, or whitespace-only), it refuses to start with a fatal error.
Resolve it by either:
- setting
LITEGEN__MASTER_KEYto a strong random secret (recommended), or - binding loopback-only (
LITEGEN__SERVER__HOST=127.0.0.1) behind a reverse proxy, or - explicitly opting into no-auth with
LITEGEN__ALLOW_NO_AUTH=true(logs a loud warning; only for trusted networks / fronting auth proxies).
Hosted (multi-tenant) mode is unaffected — the no-credential bypass is disabled there, so unauthenticated requests get 401.
| Variable | Required | Default | Description |
|---|---|---|---|
LITEGEN__MASTER_KEY |
Yes* | — | Bearer token that grants full admin access. All API requests must include Authorization: Bearer <value>. *In single-tenant mode the proxy refuses to start if this is unset (or blank/whitespace) while bound to a non-loopback address — see the security note below. |
LITEGEN__DATABASE_URL |
No | sqlite://litegen.db |
SQLite (sqlite://path/to/file.db) or Postgres (postgres://user:pass@host/db) connection URL. |
LITEGEN__SERVER__HOST |
No | 0.0.0.0 |
Bind address. Defaults to all interfaces; set to 127.0.0.1 to bind loopback-only behind a reverse proxy. |
LITEGEN__SERVER__PORT |
No | 4000 |
TCP port the HTTP server listens on. |
LITEGEN__ALLOW_NO_AUTH |
No | false |
Escape hatch for the startup security check. Set to true to run single-tenant with no authentication on a non-loopback bind (e.g. behind a trusted auth proxy); the server logs a loud warning instead of refusing to start. Leave unset in normal deployments. |
LITEGEN_MODELS_DIR |
No | ./models |
Directory containing provider YAML model capability definitions. |
LITEGEN_CORS_ORIGINS |
No | (deny all) | Comma-separated list of allowed CORS origins, e.g. https://app.example.com. Leave empty to deny browser cross-origin requests. |
OTEL_EXPORTER_OTLP_ENDPOINT |
No | — | gRPC endpoint for OpenTelemetry trace export, e.g. http://collector:4317. |
RUST_LOG |
No | info |
Log level filter. Use litegen=debug for verbose logs. |
OPENAI_API_KEY |
No | — | OpenAI provider key. |
STABILITY_API_KEY |
No | — | Stability AI provider key. |
REPLICATE_API_TOKEN |
No | — | Replicate provider token. |
GOOGLE_API_KEY |
No | — | Google Imagen provider key. |
FAL_KEY |
No | — | Fal.ai provider key. |
RUNWAY_API_KEY |
No | — | Runway provider key. |
LUMA_API_KEY |
No | — | Luma Dream Machine provider key. |
| Endpoint | Purpose |
|---|---|
GET /health/live |
Liveness — returns 200 OK {"status":"alive"} as soon as the process is up. Use as a container liveness probe. |
GET /health/ready |
Readiness — returns 200 OK {"status":"ready"} when the database is reachable and at least one provider is configured. Returns 503 otherwise. Use as a container readiness probe before routing traffic. |
LiteGen uses a default-deny CORS policy. Browser JavaScript clients will be blocked unless you explicitly set LITEGEN_CORS_ORIGINS. In production, set it to your front-end domain:
LITEGEN_CORS_ORIGINS=https://app.example.comMultiple origins are supported as a comma-separated list.
A helper script lives at scripts/backup-db.sh. It reads LITEGEN__DATABASE_URL from the environment
and writes a timestamped snapshot to a target directory (default: ./backups).
# One-off backup
LITEGEN__DATABASE_URL=sqlite://litegen.db ./scripts/backup-db.sh /var/backups/litegen
# Postgres
LITEGEN__DATABASE_URL=postgres://user:pass@localhost/litegen ./scripts/backup-db.sh /var/backups/litegenHourly cron example:
# crontab: hourly backup
0 * * * * LITEGEN__DATABASE_URL=... /opt/litegen/scripts/backup-db.sh /var/backups/litegen >> /var/log/litegen-backup.log 2>&1
The SQLite path uses .backup via the sqlite3 CLI (safe under concurrent writes). The Postgres path
pipes pg_dump through gzip. Ship the resulting files to S3, GCS, or any other durable storage.
Run LiteGen behind nginx or Caddy for HTTPS. Example Caddyfile:
api.example.com {
reverse_proxy litegen:4000
}
LiteGen itself does not terminate TLS. Expose only port 443 publicly; keep port 4000 internal.
A single LiteGen instance handles hundreds of concurrent requests. When scaling to multiple instances, be aware:
- Rate limiting and circuit breakers are currently tracked in-process memory. A request hitting instance A does not share state with instance B. For multi-instance deployments, route a given API key to the same instance (sticky sessions by
Authorizationheader) until a Redis-backed shared state layer is added in a future release. - Postgres is recommended over SQLite for multi-instance deployments (SQLite allows only one writer at a time).
When the users table is empty, POST /v1/auth/signup creates the first account and grants it the Owner role automatically.
If LITEGEN__OWNER_EMAIL is set, only that email address can claim ownership on first signup — any other email is rejected until an Owner exists to invite them.
Master-key Bearer auth keeps working as a superuser bypass. Set LITEGEN__MASTER_KEY and include Authorization: Bearer <value> on any request to bypass user auth entirely. Useful for bootstrap scripts, CI pipelines, and recovery scenarios.
| Role | Description |
|---|---|
| Owner | Full access including system:transfer_owner. Only one Owner at a time. |
| Admin | Full access to all resources except transferring ownership. |
| Member | Manage own API keys, create generations, view own logs. |
| Viewer | Read-only — view own keys and logs; cannot create or mutate. |
Owners and Admins can invite new users from the dashboard:
- Go to
/usersin the dashboard. - Click Invite, enter the recipient's email and desired role.
- An invite link is generated:
https://your-host/invite/<token>. Copy and send it manually until SMTP is configured. - The recipient clicks the link and sets a password, or signs in via OAuth if a provider is pre-configured.
Set the following environment variables to enable GitHub and/or Google OAuth:
LITEGEN__OAUTH__GITHUB__CLIENT_ID=...
LITEGEN__OAUTH__GITHUB__CLIENT_SECRET=...
LITEGEN__OAUTH__GOOGLE__CLIENT_ID=...
LITEGEN__OAUTH__GOOGLE__CLIENT_SECRET=...
LITEGEN__OAUTH__CALLBACK_BASE=https://your-litegen.example.comRegister these callback URLs with your OAuth provider:
- GitHub:
${CALLBACK_BASE}/v1/auth/oauth/github/callback - Google:
${CALLBACK_BASE}/v1/auth/oauth/google/callback
Important: OAuth does not auto-create accounts. A user must already be invited (their email must exist in the users table) before OAuth sign-in will succeed. Attempting to OAuth with an uninvited email returns account_not_invited.
| Cookie | Description |
|---|---|
litegen_session |
HttpOnly + Secure + SameSite=Lax, 7-day TTL with sliding expiry. |
litegen_csrf |
Readable CSRF token — include as X-CSRF-Token on any mutating session-authed request. |
For local HTTP development (without HTTPS), set LITEGEN__COOKIE_INSECURE_DEV=true to allow cookies over plain HTTP. Do not use this in production.
LiteGen supports two independent authentication modes:
- Bearer API keys — programmatic clients. Each key carries scopes (
generate,read,admin), budget caps, and RPM limits. Pass asAuthorization: Bearer <key>. - Session cookies — human dashboard users. Roles and permissions are enforced on every request.
- Master key — set via
LITEGEN__MASTER_KEY. Bypasses both modes; treated as a superuser for all endpoints.
Both modes can be active simultaneously. Programmatic clients continue working unchanged when user auth is enabled.
LiteGen runs in one of two modes, selected by LITEGEN__MODE:
LITEGEN__MODE=single_tenant # default — unchanged single-deployment behavior
LITEGEN__MODE=hosted # multi-tenant SaaS: open self-serve signupThe default is single_tenant; everything documented above (first-run owner signup, master-key bypass, roles, sessions) behaves exactly as before. Hosted mode opts into the multi-tenant platform:
POST /v1/auth/signupbecomes open self-serve signup. Each new user gets their own Organization (plus a default Application) and becomes its Owner — signup is no longer gated by the empty-userscheck orLITEGEN__OWNER_EMAIL.- The master key is demoted to platform-admin: it is still a superuser for operational endpoints, but it carries no implicit tenant access — it is not a member of any org and cannot read tenant-scoped data by default.
User ──< OrganizationMember >── Organization ──< Application ──< API key
(per-org role)
- A User belongs to one or more Organizations via memberships, each carrying a per-org role:
owner,admin,member, orviewer. - An Organization owns one or more Applications.
- An Application owns its API keys and its own upstream provider credentials.
Keys are minted as an id/secret pair:
- A public id
pk_live_…— safe to display; shown in the dashboard key list. - A secret
sk_live_…— shown once at creation (and on rotation), never again.
Programmatic clients send the secret as Authorization: Bearer sk_live_…; the secret resolves the owning tenant (org + app) server-side. This is a drop-in for OpenAI-style SDKs — point the SDK's base URL at your LiteGen deployment and use the sk_live_… secret as the API key.
Rotation re-issues a new pk_live_…/sk_live_… pair in place on the same key record (preserving its name, scopes, quota, and RPM limit); the old secret stops working immediately.
A dashboard session can belong to multiple orgs and apps. The active selection is sent on each request as:
X-Litegen-Org-Id— the active organization.X-Litegen-App-Id— the active application within that org.
Both headers are validated against the user's memberships server-side, so a session can never read or mutate a tenant it isn't a member of.
Each Application stores its own upstream provider credentials, encrypted at rest with AES-256-GCM. To enable credential storage in hosted mode, provide a base64-encoded 32-byte key:
LITEGEN__SECRETS_KEY=$(head -c 32 /dev/urandom | base64)At generation time (and during background video-status polling) each request is dispatched using that app's decrypted provider credentials, keeping tenants' upstream keys isolated. Resolution order per provider is: the app's stored credential → the global env-configured credential (if any) → 400 provider_not_configured when neither exists. Cost-estimate endpoints follow the same rule. Decryption/parse failures surface as 500 (server misconfiguration), never a silent fallback to another key.
Each application can store its generated images in its own S3-compatible bucket (AWS S3, MinIO, Cloudflare R2, DigitalOcean Spaces). Configure it per app:
GET /v1/apps/{app_id}/storage— current config (never returns the secret)PUT /v1/apps/{app_id}/storage— set/update; body:bucket_name(required),region,endpoint_url,custom_public_url,path_prefix, and the write-onlyaccess_key_id+secret_access_key(send both to set/rotate; omit both to keep the existing pair)DELETE /v1/apps/{app_id}/storage— revert to the platform default store
The credential pair is encrypted at rest with AES-256-GCM using
LITEGEN__SECRETS_KEY (required in hosted mode). Apps with no storage configured
fall back to the global image_storage backend (S3 if set, else base64 inline).
Hosted mode is fully functional on a single instance. Horizontal multi-instance scaling — shared Redis for rate limits and circuit-breaker state, and object storage for generation artifacts — is a follow-up (Phase 3), and automated infrastructure provisioning is Phase 4.
MIT