thibautrey/multicodex-proxy


MultiVibe

OpenAI-compatible multi-provider router
Quota-aware routing β€’ OAuth onboarding β€’ Persistent storage β€’ Request tracing β€’ Automatic model discovery



✨ What it does

MultiVibe acts as an OpenAI-compatible gateway that lets you route requests across multiple provider accounts while keeping a single /v1 API surface:

  • OpenAI-compatible API
    • GET /v1/models
    • GET /v1/models/:id
    • POST /v1/chat/completions
    • POST /v1/responses
    • POST /v1/responses/compact
  • Streaming over SSE or WebSocket
    • HTTP streaming uses plain POST with stream: true
    • HTTP response stream is text/event-stream
    • /v1/responses also accepts ws:// / wss:// and Codex-style JSON response.create frames
    • /v1/chat/completions and /v1/responses/compact remain HTTP-only
  • Multi-account routing with quota-aware failover
  • Model aliases (for example small) with ordered fallback across providers/models
  • OAuth onboarding from dashboard (manual redirect paste flow)
  • Persistent account storage across container restarts
  • Request tracing v2 (retention capped at 1000, server pagination, tokens/model/error/latency stats, optional full payload)
  • Usage stats endpoint with global + per-account + per-route aggregates over full history
  • Time-range stats (sinceMs / untilMs) while keeping only the latest 1000 full traces

πŸ–ΌοΈ Dashboard gallery

Screenshots below are taken in sanitized mode (?sanitized=1).

  • Overview
  • Accounts
  • Tracing
  • Playground
  • API docs


🧠 Routing strategy

When a request arrives, MultiVibe chooses an account using this strategy:

  1. Prefer accounts untouched in both quota windows (5-hour and weekly)
  2. Otherwise prefer the account with the nearest weekly reset
  3. Fall back to static account priority
  4. On 429/quota-like errors, block the account and retry on the next one

When the requested model is an alias, MultiVibe resolves it to ordered target models and automatically falls back across target models/providers as quotas are hit.
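
The selection order above can be sketched as follows. This is illustrative only, not the proxy's actual implementation; field names like usedIn5hWindow, weeklyResetAt, blockedUntil, and priority are assumptions.

```javascript
// Illustrative sketch of the routing order; field names are assumptions,
// not MultiVibe's real data model.
function pickAccount(accounts, now = Date.now()) {
  // Skip accounts currently blocked after a 429/quota error.
  const candidates = accounts.filter((a) => !a.blockedUntil || a.blockedUntil <= now);
  if (candidates.length === 0) return null;

  // 1. Prefer accounts untouched in both the 5h and weekly windows.
  const untouched = candidates.filter((a) => !a.usedIn5hWindow && !a.usedInWeeklyWindow);
  if (untouched.length > 0) return untouched[0];

  // 2. Otherwise prefer the account whose weekly quota resets soonest.
  const withReset = candidates.filter((a) => a.weeklyResetAt != null);
  if (withReset.length > 0) {
    return withReset.reduce((best, a) => (a.weeklyResetAt < best.weeklyResetAt ? a : best));
  }

  // 3. Fall back to static priority (lower number = higher priority).
  return candidates.reduce((best, a) => (a.priority < best.priority ? a : best));
}
// 4. On a 429/quota-like error the caller would mark the chosen account
//    as blocked (e.g. set blockedUntil) and call pickAccount again.
```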


πŸ“¦ Persistence

Everything important is file-based and survives restarts (as long as /data is mounted):

  • /data/accounts.json
  • /data/oauth-state.json
  • /data/requests-trace.jsonl
  • /data/requests-stats-history.jsonl

Trace retention is capped to the latest 1000 entries. Stats history is append-only and keeps lightweight request metadata for long-term cost/volume tracking.
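
The retention cap can be sketched like this. The real proxy persists to /data/requests-trace.jsonl; this in-memory version only illustrates the "keep the latest 1000" behavior, and the entry shape is an assumption.

```javascript
// Sketch of the trace retention policy: append an entry, then keep only
// the newest MAX_TRACES entries (MultiVibe caps this at 1000).
const MAX_TRACES = 1000;

function appendTrace(traces, entry) {
  traces.push(entry);
  // Drop the oldest entries once the cap is exceeded.
  if (traces.length > MAX_TRACES) {
    traces.splice(0, traces.length - MAX_TRACES);
  }
  return traces;
}
```

The stats history file, by contrast, is append-only: only the full traces are subject to this cap.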

The bundled Docker Compose setup already mounts ./data:/data.


πŸš€ Quick start (Docker)

docker compose up -d --build
  • Dashboard: http://localhost:1455
  • Health: http://localhost:1455/health

πŸ” OAuth onboarding flow

Because this is often deployed remotely (Unraid/VPS), onboarding uses a manual redirect paste flow:

  1. Open dashboard
  2. For OpenAI accounts, enter the account email
  3. Click Start OAuth
  4. Complete login in browser
  5. Copy the full redirect URL shown after the callback completes
  6. Paste that URL in the dashboard and click Complete OAuth

Mistral accounts still use manual token entry in the dashboard.

Default expected redirect URI:

http://localhost:1455/auth/callback
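
When you paste the redirect URL back into the dashboard, the server needs the authorization parameters it carries. A minimal sketch of that extraction, assuming the standard OAuth authorization-code parameters code and state (the exact handling inside MultiVibe is an assumption):

```javascript
// Sketch of parsing a pasted OAuth redirect URL using the WHATWG URL API.
// Assumes standard authorization-code callback params: ?code=...&state=...
function parseRedirectUrl(pasted) {
  const url = new URL(pasted);
  const code = url.searchParams.get("code");
  const state = url.searchParams.get("state");
  if (!code || !state) {
    throw new Error("redirect URL is missing code or state");
  }
  return { code, state };
}
```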

πŸ§ͺ API examples

List models

curl http://localhost:1455/v1/models

Example model object returned:

{
  "id": "gpt-5.3-codex",
  "object": "model",
  "created": 1730000000,
  "owned_by": "multivibe",
  "metadata": {
    "context_window": null,
    "max_output_tokens": null,
    "supports_reasoning": true,
    "supports_tools": true,
    "supported_tool_types": ["function"]
  }
}

Chat completion

curl -X POST http://localhost:1455/v1/chat/completions \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "messages": [{"role":"user","content":"hello"}]
  }'

Streaming responses

curl -N -X POST http://localhost:1455/v1/responses \
  -H "content-type: application/json" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "hello",
    "stream": true
  }'
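
Since the HTTP stream is text/event-stream, the client reads data: lines off the response body. A minimal sketch of that parsing; the event payload shape and the [DONE] sentinel are assumptions based on common OpenAI-style streams, not a documented MultiVibe format:

```javascript
// Minimal SSE parser for a text/event-stream chunk: collect JSON payloads
// from "data:" lines and stop at the "[DONE]" sentinel.
function parseSseChunk(chunk) {
  const events = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data:")) continue;
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") break;
    events.push(JSON.parse(payload));
  }
  return events;
}
```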

WebSocket responses

const ws = new WebSocket("ws://localhost:1455/v1/responses", {
  headers: {
    Authorization: "Bearer YOUR_TOKEN",
  },
});

ws.onmessage = (event) => {
  console.log(JSON.parse(event.data));
};

ws.onopen = () => {
  ws.send(
    JSON.stringify({
      type: "response.create",
      model: "gpt-5.3-codex",
      input: [
        { role: "user", content: [{ type: "input_text", text: "hello" }] },
      ],
      stream: true,
    }),
  );
};

Create model alias

curl -X POST http://localhost:1455/admin/model-aliases \
  -H "x-admin-token: change-me" \
  -H "content-type: application/json" \
  -d '{
    "id": "small",
    "targets": ["gpt-5.1-codex-mini", "devstral-small-latest"],
    "enabled": true,
    "description": "Small coding model pool"
  }'
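
Resolution of such an alias can be sketched as below: pick the first target in order whose quota is not exhausted. The isQuotaBlocked callback is a stand-in for the proxy's real per-account quota tracking, not an actual API.

```javascript
// Sketch of ordered alias fallback. aliases entries mirror the admin API
// payload ({ id, targets, enabled }); the quota check is a placeholder.
function resolveModel(requested, aliases, isQuotaBlocked) {
  const alias = aliases.find((a) => a.id === requested && a.enabled);
  if (!alias) return requested; // not an alias: use the model name as-is
  for (const target of alias.targets) {
    if (!isQuotaBlocked(target)) return target;
  }
  return null; // every target is quota-blocked
}
```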

Read traces

# Paginated API (recommended)
curl -H "x-admin-token: change-me" \
  "http://localhost:1455/admin/traces?page=1&pageSize=100"
# Legacy compatibility mode
curl -H "x-admin-token: change-me" \
  "http://localhost:1455/admin/traces?limit=50"

Usage stats

curl -H "x-admin-token: change-me" \
  "http://localhost:1455/admin/stats/usage?sinceMs=1735689600000&untilMs=1738291200000"

Trace stats (historical)

curl -H "x-admin-token: change-me" \
  "http://localhost:1455/admin/stats/traces?sinceMs=1735689600000&untilMs=1738291200000"

Optional filters:

  • accountId=<id>
  • route=/v1/chat/completions
  • sinceMs=<epoch_ms>
  • untilMs=<epoch_ms>
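
These filters combine into an ordinary query string; a small sketch using URLSearchParams (which also takes care of percent-encoding the route path):

```javascript
// Sketch of building the stats/traces query string from the optional
// filters listed above.
function buildStatsQuery({ sinceMs, untilMs, accountId, route } = {}) {
  const params = new URLSearchParams();
  if (sinceMs != null) params.set("sinceMs", String(sinceMs));
  if (untilMs != null) params.set("untilMs", String(untilMs));
  if (accountId) params.set("accountId", accountId);
  if (route) params.set("route", route);
  return params.toString();
}
```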

Model alias admin endpoints:

  • GET /admin/model-aliases
  • POST /admin/model-aliases
  • PATCH /admin/model-aliases/:id
  • DELETE /admin/model-aliases/:id

βš™οΈ Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| PORT | 1455 | HTTP server port |
| STORE_PATH | /data/accounts.json | Accounts store |
| OAUTH_STATE_PATH | /data/oauth-state.json | OAuth flow state |
| TRACE_FILE_PATH | /data/requests-trace.jsonl | Request trace file (retained to the latest 1000 entries) |
| TRACE_STATS_HISTORY_PATH | /data/requests-stats-history.jsonl | Lightweight request history for long-term stats |
| TRACE_INCLUDE_BODY | true | Persist full request payloads; trace stats still work when disabled |
| PROXY_MODELS | gpt-5.3-codex,gpt-5.2-codex,gpt-5-codex | Fallback comma-separated model list for /v1/models |
| MODELS_CLIENT_VERSION | 1.0.0 | Version sent to /backend-api/codex/models for model discovery |
| MODELS_CACHE_MS | 600000 | Model discovery cache duration (ms) |
| ADMIN_TOKEN | change-me | Admin endpoints auth token |
| CHATGPT_BASE_URL | https://chatgpt.com | Upstream base URL |
| UPSTREAM_PATH | /backend-api/codex/responses | Upstream request path |
| UPSTREAM_COMPACT_PATH | /backend-api/codex/responses/compact | Upstream path for /v1/responses/compact |
| OAUTH_CLIENT_ID | app_EMoamEEZ73f0CkXaXp7hrann | OpenAI OAuth client id |
| OAUTH_AUTHORIZATION_URL | https://auth.openai.com/oauth/authorize | OAuth authorize endpoint |
| OAUTH_TOKEN_URL | https://auth.openai.com/oauth/token | OAuth token endpoint |
| OAUTH_SCOPE | openid profile email offline_access | OAuth scope |
| OAUTH_REDIRECT_URI | http://localhost:1455/auth/callback | Redirect URI |
| MISTRAL_COMPACT_UPSTREAM_PATH | /v1/responses/compact | Mistral upstream path for compact responses |
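
A small sketch of how such variables are typically read with their defaults (shown for a few entries only; loadConfig is an illustrative helper, not part of MultiVibe):

```javascript
// Sketch of reading a few of the variables above, falling back to the
// documented defaults when the variable is unset.
function loadConfig(env = process.env) {
  return {
    port: Number(env.PORT ?? 1455),
    storePath: env.STORE_PATH ?? "/data/accounts.json",
    adminToken: env.ADMIN_TOKEN ?? "change-me",
    modelsCacheMs: Number(env.MODELS_CACHE_MS ?? 600000),
  };
}
```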

πŸ› οΈ Local dev

npm install
npm --prefix web install
npm run build
npm run start

πŸ“ˆ Star history

Star History Chart

🀝 Contributing

PRs and issues are welcome.

If you open a PR:

  • keep it focused
  • include before/after behavior
  • include screenshots for UI changes
