Session: dev-research-claudish-settings-20260316-012741-6e25c3bb Date: 2026-03-16 Status: COMPLETE Sources: Live codebase investigation (cli.ts, config.ts, model-parser.ts, provider-resolver.ts, auto-route.ts, remote-provider-registry.ts, profile-config.ts, routing-rules.ts, local.ts, gemini-oauth.ts, vertex-auth.ts, local-queue.ts)
Claudish is a proxy tool that wraps Claude Code with support for non-Anthropic AI providers. It intercepts Claude Code's API calls and reroutes them to providers like OpenRouter, Google Gemini, OpenAI, MiniMax, Kimi, GLM, and local models (Ollama, LM Studio, vLLM, MLX). Configuration is layered: CLI flags override environment variables, which override profile settings from config files. The routing syntax uses provider@model[:concurrency] (v4.0+, preferred) or the legacy prefix/model format (still supported, deprecated). Auto-routing selects a provider automatically based on available credentials. The priority chain is configurable via defaultProvider (v7.0.0+). The default chain (when no defaultProvider is set and only OPENROUTER_API_KEY is present) is: OpenCode Zen → provider subscription plan → native API → OpenRouter fallback. When LITELLM_BASE_URL + LITELLM_API_KEY are set without explicit defaultProvider, legacy auto-promotion puts LiteLLM first. Configuration files live at ~/.claudish/config.json (global) and .claudish.json (local/project); local always takes precedence.
All flags recognized by parseArgs() in packages/cli/src/cli.ts.
| Flag | Short | Type | Default | Description |
|---|---|---|---|---|
--model |
-m |
string | none (prompts interactively) | Model to use. Accepts provider@model syntax, legacy prefix/model, or bare model name for auto-detection |
--default-provider |
string | none | Default provider for auto-routing (v7.0.0+). Overrides env var and config file. Valid: built-in provider names or custom endpoint names | |
--model-opus |
string | none | Model for Opus role (planning, complex tasks) | |
--model-sonnet |
string | none | Model for Sonnet role (default coding) | |
--model-haiku |
string | none | Model for Haiku role (fast tasks, background) | |
--model-subagent |
string | none | Model for sub-agents (Task tool) | |
--port |
number | random (3000–9000) | Proxy server port | |
--auto-approve |
-y |
boolean | false | Skip permission prompts (passes --dangerously-skip-permissions to Claude Code) |
--no-auto-approve |
boolean | Explicitly enable permission prompts (overrides -y) | ||
--dangerous |
boolean | false | Pass --dangerouslyDisableSandbox to Claude Code |
|
--interactive |
-i |
boolean | auto | Interactive mode (default when no prompt argument given) |
--debug |
-d |
boolean | false | Enable debug logging to logs/claudish_*.log; also sets --log-level debug unless overridden |
--log-level |
string | "info" |
Log verbosity: debug (full content), info (truncated content), minimal (labels only) |
|
--quiet |
-q |
boolean | auto | Suppress [claudish] log messages (default in single-shot mode) |
--verbose |
-v |
boolean | auto | Show [claudish] messages (default in interactive mode) |
--json |
boolean | false | Output JSON format for tool integration; implies --quiet |
|
--monitor |
boolean | false | Proxy to real Anthropic API and log all traffic (uses Claude Code's native auth) | |
--stdin |
boolean | false | Read prompt from stdin instead of positional arguments | |
--free |
boolean | false | Show only free models in interactive model selector | |
--profile |
-p |
string | default profile | Named profile for model mapping |
--cost-tracker |
boolean | false | Enable cost tracking; also enables monitor mode | |
--audit-costs |
action | Show cost analysis report and exit | ||
--reset-costs |
action | Reset accumulated cost statistics and exit | ||
--models / --list-models |
-s / --search |
action | List ALL models (from OpenRouter + LiteLLM + local Ollama) or fuzzy-search by query | |
--top-models |
action | List curated recommended models and exit | ||
--force-update |
boolean | false | Force refresh of model catalog cache (used with --models or --top-models) |
|
--summarize-tools |
boolean | false | Summarize tool descriptions to reduce prompt size for local/small models | |
--version |
action | Show version and exit | ||
--help |
-h |
action | Show help message and exit | |
--help-ai |
action | Show AI agent usage guide (from AI_AGENT_GUIDE.md) and exit |
||
--init |
action | Install Claudish skill in .claude/skills/claudish-usage/SKILL.md |
||
--mcp |
action | Run as MCP server | ||
--gemini-login |
action | Login to Gemini Code Assist via OAuth | ||
--gemini-logout |
action | Clear Gemini OAuth credentials | ||
--kimi-login |
action | Login to Kimi/Moonshot AI via OAuth | ||
--kimi-logout |
action | Clear Kimi OAuth credentials | ||
-- |
separator | Everything after -- passes directly to Claude Code without processing |
Passthrough behavior: Any unrecognized flag is automatically forwarded to Claude Code. If the token immediately following the flag does not start with -, it is consumed as that flag's value. Examples: --agent detective, --effort high, --permission-mode plan.
Positional arguments: Tokens without a leading - are treated as the prompt text and forwarded to Claude Code.
Interactive mode detection: If no positional arguments are given and --stdin is not set, Claudish automatically enters interactive mode (as if --interactive was specified).
--json implies --quiet: When --json is set, config.quiet is forced to true regardless of other flags.
--cost-tracker enables monitor mode: Setting --cost-tracker automatically sets config.monitor = true if it is not already set.
These are top-level subcommands recognized before flag parsing begins (checked in packages/cli/src/index.ts).
| Command | Description |
|---|---|
claudish init [--local|--global] |
Setup wizard: creates config file and first profile interactively |
claudish profile list [--local|--global] |
List all profiles from one or both scopes |
claudish profile add [--local|--global] |
Add a new profile interactively |
claudish profile remove <name> [--local|--global] |
Remove a named profile |
claudish profile use <name> [--local|--global] |
Set the default profile |
claudish profile show [name] [--local|--global] |
Show profile details (models, timestamps) |
claudish profile edit [name] [--local|--global] |
Edit a profile interactively |
claudish update |
Check for updates and install the latest version (detects npm, bun, brew) |
claudish telemetry on |
Enable telemetry (opt-in) |
claudish telemetry off |
Disable telemetry |
claudish telemetry status |
Show current telemetry consent and configuration |
claudish telemetry reset |
Reset telemetry consent to unasked state |
Scope flags for profile commands:
--local: Target.claudish.jsonin the current working directory--global: Target~/.claudish/config.json- (omit): Prompted interactively; suggests
--localif CWD appears to be a project directory (has.git,package.json,Cargo.toml,go.mod,pyproject.toml, or.claudish.json)
Claudish automatically loads .env from the current working directory at startup using dotenv. All variables below can be set in .env.
| Variable | Purpose | Default |
|---|---|---|
CLAUDISH_DEFAULT_PROVIDER |
Default provider for auto-routing (v7.0.0+); overrides config file defaultProvider |
none |
CLAUDISH_MODEL |
Default model (higher priority than ANTHROPIC_MODEL) |
none |
CLAUDISH_PORT |
Default proxy port | random (3000–9000) |
CLAUDISH_CONTEXT_WINDOW |
Override context window size for local models (integer) | auto-detected |
CLAUDISH_MODEL_OPUS |
Override model for Opus role | none |
CLAUDISH_MODEL_SONNET |
Override model for Sonnet role | none |
CLAUDISH_MODEL_HAIKU |
Override model for Haiku role | none |
CLAUDISH_MODEL_SUBAGENT |
Override model for sub-agents | none |
CLAUDISH_SUMMARIZE_TOOLS |
Summarize tool descriptions (true or 1 to enable) |
false |
CLAUDISH_TELEMETRY |
Override telemetry (0, false, or off to disable) |
from config |
CLAUDISH_ACTIVE_MODEL_NAME |
(Internal) Set by Claudish to display model name in status line | auto |
CLAUDISH_IS_LOCAL |
(Internal) Set to "true" for local models; used by status line to show "LOCAL" instead of cost |
auto |
CLAUDISH_LOCAL_QUEUE_ENABLED |
Enable/disable local model request queue (false or 0 to disable) |
true |
CLAUDISH_LOCAL_MAX_PARALLEL |
Max concurrent local model requests (integer 1–8; values above 8 are capped) | 1 |
CLAUDISH_QWEN_NO_THINK |
Prepend /no_think to system prompt for Qwen local models (set to "1") |
none |
| Variable | Purpose | Fallback for |
|---|---|---|
ANTHROPIC_MODEL |
Claude Code standard model selection | CLAUDISH_MODEL (lower priority) |
ANTHROPIC_SMALL_FAST_MODEL |
Claude Code standard fast model var | — |
ANTHROPIC_DEFAULT_OPUS_MODEL |
Claude Code opus model var | CLAUDISH_MODEL_OPUS (lower priority) |
ANTHROPIC_DEFAULT_SONNET_MODEL |
Claude Code sonnet model var | CLAUDISH_MODEL_SONNET (lower priority) |
ANTHROPIC_DEFAULT_HAIKU_MODEL |
Claude Code haiku model var | CLAUDISH_MODEL_HAIKU (lower priority) |
CLAUDE_CODE_SUBAGENT_MODEL |
Claude Code subagent model var | CLAUDISH_MODEL_SUBAGENT (lower priority) |
ANTHROPIC_API_KEY |
Placeholder to suppress Claude Code API key dialog | (placeholder set by Claudish) |
ANTHROPIC_AUTH_TOKEN |
Placeholder to suppress Claude Code login screen | (placeholder set by Claudish) |
CLAUDE_PATH |
Custom path to Claude Code binary | ~/.claude/local/claude, then global PATH |
Priority for model selection (highest to lowest):
- CLI flag (
--model,--model-opus, etc.) CLAUDISH_MODEL_*environment variablesANTHROPIC_DEFAULT_*/CLAUDE_CODE_SUBAGENT_MODELenvironment variables- Profile models from config (local
.claudish.jsonfirst, then global) - Interactive selector (if no model specified in interactive mode)
| Variable | Provider | Aliases | Where to Get |
|---|---|---|---|
OPENROUTER_API_KEY |
OpenRouter (default backend / universal fallback) | https://openrouter.ai/keys | |
GEMINI_API_KEY |
Google Gemini direct API (g@, google@) |
https://aistudio.google.com/app/apikey | |
OPENAI_API_KEY |
OpenAI direct API (oai@) |
https://platform.openai.com/api-keys | |
MINIMAX_API_KEY |
MiniMax (mm@, mmax@) |
https://www.minimaxi.com/ | |
MINIMAX_CODING_API_KEY |
MiniMax Coding Plan (mmc@) |
https://platform.minimax.io/ | |
MOONSHOT_API_KEY |
Kimi/Moonshot (kimi@, moon@) |
KIMI_API_KEY |
https://platform.moonshot.cn/ |
KIMI_CODING_API_KEY |
Kimi Coding Plan (kc@); also accepts OAuth via claudish --kimi-login |
https://kimi.com/code | |
ZHIPU_API_KEY |
GLM/Zhipu direct API (glm@, zhipu@) |
GLM_API_KEY |
https://open.bigmodel.cn/ |
GLM_CODING_API_KEY |
GLM Coding Plan at Z.AI (gc@) |
ZAI_CODING_API_KEY |
https://z.ai/subscribe |
ZAI_API_KEY |
Z.AI Anthropic-compatible API (zai@) |
https://z.ai/ | |
OLLAMA_API_KEY |
OllamaCloud hosted API (oc@, llama@, lc@, meta@) |
https://ollama.com/account | |
OPENCODE_API_KEY |
OpenCode Zen (zen@); optional for free models (falls back to "public" bearer) |
https://opencode.ai/ | |
XAI_API_KEY |
xAI / Grok (direct API, detected in model selector) | https://x.ai/ | |
LITELLM_API_KEY |
LiteLLM proxy (ll@, litellm@) |
https://docs.litellm.ai/ | |
POE_API_KEY |
Poe (poe@) |
https://poe.com/ | |
VERTEX_API_KEY |
Vertex AI Express mode (v@, vertex@) |
https://console.cloud.google.com/vertex-ai | |
VERTEX_PROJECT |
Vertex AI OAuth mode — GCP project ID | GOOGLE_CLOUD_PROJECT |
GCP Console |
VERTEX_LOCATION |
Vertex AI region | us-central1 |
|
GOOGLE_APPLICATION_CREDENTIALS |
Path to GCP service account JSON file (Vertex OAuth) | GCP Console | |
GOOGLE_CLOUD_PROJECT |
GCP project ID (also used by Gemini Code Assist OAuth) | GOOGLE_CLOUD_PROJECT_ID |
Note on Vertex AI: Vertex supports two authentication modes:
- Express mode (
VERTEX_API_KEY): Uses the Gemini API endpoint; supports Gemini models only. - OAuth mode (
VERTEX_PROJECT+ Application Default Credentials viagcloud auth application-default loginorGOOGLE_APPLICATION_CREDENTIALS): Supports all Vertex models including partner models (Anthropic Claude, Mistral, etc.).
Note on OpenCode Zen: Free-tier models (cost.input === 0) work without any API key; Claudish automatically uses "Bearer public". Paid models on the zen endpoint require OPENCODE_API_KEY.
| Variable | Provider | Default |
|---|---|---|
GEMINI_BASE_URL |
Google Gemini API | https://generativelanguage.googleapis.com |
OPENAI_BASE_URL |
OpenAI API (also for Azure-compatible) | https://api.openai.com |
MINIMAX_BASE_URL |
MiniMax API | https://api.minimax.io |
MINIMAX_CODING_BASE_URL |
MiniMax Coding Plan endpoint | https://api.minimax.io |
MOONSHOT_BASE_URL |
Kimi/Moonshot API | https://api.moonshot.ai |
KIMI_BASE_URL |
Alias for MOONSHOT_BASE_URL |
|
ZHIPU_BASE_URL |
GLM/Zhipu API | https://open.bigmodel.cn |
GLM_BASE_URL |
Alias for ZHIPU_BASE_URL |
|
ZAI_BASE_URL |
Z.AI API | https://api.z.ai |
OLLAMACLOUD_BASE_URL |
OllamaCloud hosted API | https://ollama.com |
OPENCODE_BASE_URL |
OpenCode Zen API (base; /v1/chat/completions appended) |
https://opencode.ai/zen |
LITELLM_BASE_URL |
LiteLLM proxy server URL (required to enable LiteLLM routing) | none |
Note on OPENCODE_BASE_URL: For the Zen Go plan endpoint, Claudish replaces /zen with /zen/go automatically. Setting OPENCODE_BASE_URL=https://opencode.ai/zen is equivalent to the default.
| Variable | Provider | Default |
|---|---|---|
OLLAMA_BASE_URL |
Ollama local server | http://localhost:11434 |
OLLAMA_HOST |
Alias for OLLAMA_BASE_URL |
|
LMSTUDIO_BASE_URL |
LM Studio local server | http://localhost:1234 |
VLLM_BASE_URL |
vLLM local server | http://localhost:8000 |
MLX_BASE_URL |
MLX local server | http://127.0.0.1:8080 |
| Variable | Purpose | Default |
|---|---|---|
GEMINI_CLIENT_ID |
Custom OAuth client ID for Gemini Code Assist | built-in (from Claudish installation) |
GEMINI_CLIENT_SECRET |
Custom OAuth client secret for Gemini Code Assist | built-in (from Claudish installation) |
These are only needed if you want to use your own Google Cloud OAuth application instead of Claudish's built-in credentials.
{
"version": "1.0.0",
"defaultProfile": "default",
"defaultProvider": "openrouter",
"profiles": {
"default": {
"name": "default",
"description": "Default profile",
"models": {
"opus": "oai@gpt-5.3",
"sonnet": "google@gemini-3-pro",
"haiku": "mm@MiniMax-M2.1",
"subagent": "google@gemini-2.0-flash"
},
"createdAt": "2026-01-01T00:00:00.000Z",
"updatedAt": "2026-01-01T00:00:00.000Z"
}
},
"telemetry": {
"enabled": false,
"askedAt": "2026-01-01T00:00:00Z",
"promptedVersion": "5.10.0"
},
"routing": {
"kimi-*": ["kc", "kimi", "openrouter"],
"glm-*": ["gc", "glm", "openrouter"],
"*": ["litellm", "openrouter"]
},
"customEndpoints": {
"my-vllm": {
"kind": "simple",
"url": "http://gpu-box:8000",
"format": "openai",
"apiKey": "${VLLM_API_KEY}"
}
}
}Field descriptions:
version: Config schema version string (currently"1.0.0").defaultProfile: Name of the profile to use when--profileis not specified.defaultProvider(v7.0.0+): Default provider for auto-routing. Accepts built-in provider names ("openrouter","litellm","openai","anthropic","google") or a custom endpoint name. See Section 6.1 for precedence. Absent means use legacy auto-detection.customEndpoints(v7.0.0+): Named map of custom endpoint definitions. See Section 7.5 for schema.profiles: Map of profile name to profile object. Each profile has:name: Profile identifier (matches the map key).description: Optional human-readable description.models: Model mapping with optional keysopus,sonnet,haiku,subagent. Each value is a full model spec (e.g.,"google@gemini-3-pro"). Absent keys mean no override for that role.createdAt/updatedAt: ISO 8601 timestamps (managed by Claudish).
telemetry: Consent state.enabled: Whether telemetry is on. Default isfalseuntil user explicitly opts in.askedAt: ISO 8601 timestamp of when the user was last prompted. Absent means never prompted.promptedVersion: Claudish version string at time of prompting.
routing: Custom routing rules (see Section 7). Absent means use default auto-routing chain.
Same schema as ~/.claudish/config.json. Placed in the project root directory (wherever Claudish is run from).
Resolution order:
- Profile lookup: local
.claudish.jsonprofiles checked first, then global~/.claudish/config.json. - Default profile: local
defaultProfiletakes precedence if the local config exists and specifies one. - Custom routing rules: local
routingkey entirely replaces global routing rules (no merge). - Local config does not include
telemetry(consent is global only).
Note: The default profile in the local config is looked up first in local profiles, then in global profiles. A local config can reference global profiles by name.
| File | Purpose | Auto-updated |
|---|---|---|
config.json |
Global config: profiles, telemetry, routing | Manual (via claudish profile commands) |
all-models.json |
Cached full model catalog from OpenRouter | Every 2 days, or on --force-update |
litellm-models-{hash}.json |
Cached LiteLLM model list per server (hash = SHA-256 of LITELLM_BASE_URL) |
On each LiteLLM model fetch |
kimi-oauth.json |
Kimi OAuth credentials (access + refresh tokens) | On claudish --kimi-login |
gemini-oauth.json |
Gemini Code Assist OAuth credentials | On claudish --gemini-login |
logs/ |
Debug log files (created when --debug is used) |
Per session |
The preferred syntax. The @ separator unambiguously identifies the provider.
google@gemini-3-pro # Direct Google Gemini API
oai@gpt-5.3 # Direct OpenAI API
openrouter@deepseek/deepseek-r1 # Explicit OpenRouter with vendor-prefixed model
ollama@llama3.2 # Local Ollama, sequential (default)
ollama@llama3.2:3 # Local Ollama, allow up to 3 concurrent requests
ollama@llama3.2:0 # Local Ollama, no concurrency limit (bypass queue)
ll@my-model # LiteLLM proxy with auto catalog resolution
Provider part is case-insensitive. Shortcuts are resolved to canonical provider names.
| Shortcut(s) | Canonical Provider | Notes |
|---|---|---|
g, gemini |
google |
Direct Google Gemini API (GEMINI_API_KEY) |
oai |
openai |
Direct OpenAI API (OPENAI_API_KEY) |
or, openrouter |
openrouter |
OpenRouter (OPENROUTER_API_KEY) |
mm, mmax |
minimax |
MiniMax direct API (MINIMAX_API_KEY) |
mmc |
minimax-coding |
MiniMax Coding Plan (MINIMAX_CODING_API_KEY) |
kimi, moon, moonshot |
kimi |
Kimi/Moonshot API (MOONSHOT_API_KEY or KIMI_API_KEY) |
kc |
kimi-coding |
Kimi Coding Plan (KIMI_CODING_API_KEY or OAuth) |
glm, zhipu |
glm |
GLM/Zhipu direct API (ZHIPU_API_KEY or GLM_API_KEY) |
gc |
glm-coding |
GLM Coding Plan at Z.AI (GLM_CODING_API_KEY or ZAI_CODING_API_KEY) |
zai |
zai |
Z.AI Anthropic-compatible API (ZAI_API_KEY) |
oc, llama, lc, meta |
ollamacloud |
OllamaCloud hosted API (OLLAMA_API_KEY) |
zen |
opencode-zen |
OpenCode Zen (OPENCODE_API_KEY; optional for free models) |
zengo, zgo |
opencode-zen-go |
OpenCode Zen Go subscription plan |
v, vertex |
vertex |
Vertex AI (VERTEX_API_KEY or VERTEX_PROJECT) |
go |
gemini-codeassist |
Gemini Code Assist via OAuth (claudish --gemini-login) |
litellm, ll |
litellm |
LiteLLM proxy (LITELLM_BASE_URL + LITELLM_API_KEY) |
poe |
poe |
Poe API (POE_API_KEY) |
| Shortcut(s) | Provider | Default Endpoint |
|---|---|---|
ollama |
Ollama | http://localhost:11434 |
lms, lmstudio, mlstudio |
LM Studio | http://localhost:1234 |
vllm |
vLLM | http://localhost:8000 |
mlx |
MLX | http://127.0.0.1:8080 |
When no provider@ prefix is given, Claudish detects the provider from the model name pattern. Resolution is by the first matching pattern:
| Pattern | Routes To | Notes |
|---|---|---|
google/* or gemini-* |
Google Gemini | |
openai/* or gpt-* or o1-* or o3-* or chatgpt-* |
OpenAI | |
minimax/* or minimax-* or abab-* |
MiniMax | |
kimi-for-coding (exact) |
Kimi Coding Plan | Must match exactly; checked before kimi-* |
moonshot/* or moonshot-* or kimi-* |
Kimi | |
zhipu/* or glm-* or chatglm-* |
GLM | |
z-ai/* or zai/* |
Z.AI | |
ollamacloud/* or meta-llama/* or llama-* or llama3* |
OllamaCloud | |
qwen* |
Auto-routed (no direct API) | Falls to OpenRouter or LiteLLM |
poe:* |
Poe | Literal poe: prefix |
anthropic/* or claude-* |
Native Anthropic | Claude Code's own auth, no proxy |
vendor/model (unknown vendor) |
Error | Must use explicit openrouter@vendor/model |
bare name (no /) |
Native Anthropic | Treated as Claude model; no proxy |
The old prefix/model format works but emits a deprecation warning suggesting the @ syntax.
| Legacy Prefix | Provider | New Equivalent |
|---|---|---|
g/ |
Google Gemini | g@ |
gemini/ |
Google Gemini | gemini@ |
go/ |
Gemini Code Assist | go@ |
oai/ |
OpenAI | oai@ |
or/ |
OpenRouter | or@ |
mmax/, mm/ |
MiniMax | mm@ |
mmc/ |
MiniMax Coding | mmc@ |
kimi/, moonshot/ |
Kimi | kimi@ |
kc/ |
Kimi Coding | kc@ |
glm/, zhipu/ |
GLM | glm@ |
gc/ |
GLM Coding | gc@ |
zai/ |
Z.AI | zai@ |
oc/ |
OllamaCloud | oc@ |
zen/ |
OpenCode Zen | zen@ |
zengo/, zgo/ |
OpenCode Zen Go | zengo@ |
v/, vertex/ |
Vertex AI | v@ |
litellm/, ll/ |
LiteLLM | ll@ |
ollama/, ollama: |
Ollama (local) | ollama@ |
lmstudio/, lmstudio:, mlstudio/, mlstudio: |
LM Studio (local) | lms@ |
vllm/, vllm: |
vLLM (local) | vllm@ |
mlx/, mlx: |
MLX (local) | mlx@ |
A full URL is accepted directly as a model spec and treated as a local custom endpoint (no API key required):
http://localhost:11434/llama3.2
http://192.168.1.100:8000/mistral
https://localhost:8080/model
When a model name has no explicit provider prefix and does not match a native pattern that maps to a provider with credentials, Claudish builds a fallback chain (implemented in auto-route.ts / getFallbackChain()).
The fallback chain is configurable via the defaultProvider setting. Set it in any of these locations:
| Method | Example |
|---|---|
| Config file | "defaultProvider": "litellm" in ~/.claudish/config.json |
| Env var | CLAUDISH_DEFAULT_PROVIDER=openrouter |
| CLI flag | claudish --default-provider google "task" |
Precedence (highest to lowest):
- CLI flag
--default-provider CLAUDISH_DEFAULT_PROVIDERenv vardefaultProviderin config file- Legacy LITELLM auto-promotion (if
LITELLM_BASE_URL+LITELLM_API_KEYset without explicitdefaultProvider) OPENROUTER_API_KEYpresent → OpenRouter- Hardcoded
"openrouter"
Valid values: any built-in provider name ("openrouter", "litellm", "openai", "anthropic", "google") or a custom endpoint name from customEndpoints.
When defaultProvider is absent and only OPENROUTER_API_KEY is present:
- OpenCode Zen — if
OPENCODE_API_KEYis set. - Provider subscription/coding plan — if the native provider has a subscription alternative and credentials exist:
kimi→ Kimi Coding Plan (kc@kimi-for-coding) ifKIMI_CODING_API_KEYor OAuth present.minimax→ MiniMax Coding Plan (mmc@) ifMINIMAX_CODING_API_KEYpresent.glm→ GLM Coding Plan at Z.AI (gc@) ifGLM_CODING_API_KEYorZAI_CODING_API_KEYpresent.google→ Gemini Code Assist (go@) if OAuth credentials present.
- Native provider API — if the detected native provider has an API key or OAuth credentials.
- OpenRouter — if
OPENROUTER_API_KEYis set (universal fallback).
When LITELLM_BASE_URL and LITELLM_API_KEY are set but defaultProvider is absent, LiteLLM is added to the chain first (before OpenCode Zen). Claudish emits a one-shot stderr hint recommending you set defaultProvider: "litellm" explicitly. This preserves backward compatibility with pre-v7.0.0 behavior.
If none of the chain entries have valid credentials, Claudish returns an error with instructions on how to authenticate.
Custom routing rules are defined in the routing key of config.json or .claudish.json. Local rules entirely replace global rules (no merge).
{
"routing": {
"kimi-for-coding": ["kc", "kimi", "or"],
"kimi-*": ["kimi", "or@moonshot/kimi-k2"],
"glm-*": ["gc", "glm"],
"*": ["litellm", "openrouter"]
}
}- Exact match — e.g.,
"kimi-for-coding": checked first. - Glob patterns — single
*wildcard, e.g.,"kimi-*". Multiple patterns are sorted longest-first (most specific wins). - Catch-all —
"*": matches any model not matched above.
Each entry in the routing chain array is a string. Format options:
"provider"— Use the original model name on the specified provider (e.g.,"kimi"useskimi@{originalModelName})."provider@model"— Use a specific model on the provider (e.g.,"or@moonshot/kimi-k2"uses OpenRouter with the given model ID).
Provider shortcuts (same as @ syntax) are resolved in entries. LiteLLM entries automatically use the model catalog resolver to find the vendor-prefixed model name.
When defaultProvider is set and no explicit routing["*"] catch-all exists in the config, Claudish synthesizes routing["*"] = [<defaultProvider>] at config load time. An explicit routing["*"] always takes precedence over the synthesized one.
{
"defaultProvider": "litellm",
"routing": {
"kimi-*": ["kc", "kimi", "or"]
}
}The above is equivalent to:
{
"routing": {
"kimi-*": ["kc", "kimi", "or"],
"*": ["litellm"]
}
}Claudish warns at load time if:
- A pattern has multiple
*wildcards (only single*is supported). - A rule's entry list is empty (the pattern would have no fallback).
Define named custom endpoints in ~/.claudish/config.json (or .claudish.json) under the customEndpoints key. Each endpoint becomes a provider prefix usable with @ syntax.
For OpenAI- or Anthropic-compatible servers:
{
"customEndpoints": {
"my-vllm": {
"kind": "simple",
"url": "http://gpu-box:8000",
"format": "openai",
"apiKey": "${VLLM_API_KEY}",
"modelPrefix": "my-org/",
"models": ["llama3.1-70b", "qwen2.5-72b"]
}
}
}| Field | Type | Required | Description |
|---|---|---|---|
kind |
"simple" |
yes | Discriminator |
url |
string | yes | Base URL of the server |
format |
"openai" or "anthropic" |
yes | Wire format |
apiKey |
string | no | API key; supports ${VAR} env expansion |
modelPrefix |
string | no | Prepended to model name before sending to API |
models |
string[] | no | Restrict to listed models; omit to allow any |
Usage: claudish --model my-vllm@llama3.1-70b "task"
Full control over transport, auth, headers, and stream format:
{
"customEndpoints": {
"corp-proxy": {
"kind": "complex",
"displayName": "Corporate LLM Proxy",
"transport": "openai",
"baseUrl": "https://llm.corp.internal",
"apiPath": "/api/v2/chat/completions",
"apiKey": "${CORP_LLM_KEY}",
"authScheme": "X-Api-Key",
"headers": { "X-Team": "platform" },
"streamFormat": "openai-sse",
"modelPrefix": "",
"models": ["gpt-4o", "claude-sonnet"]
}
}
}| Field | Type | Required | Description |
|---|---|---|---|
kind |
"complex" |
yes | Discriminator |
displayName |
string | no | Human-readable name (shown in logs) |
transport |
string | yes | Transport type (e.g., "openai", "anthropic") |
baseUrl |
string | yes | Server base URL |
apiPath |
string | no | Custom API path (overrides default for transport) |
apiKey |
string | no | API key; supports ${VAR} env expansion |
authScheme |
string | no | Auth header scheme (default: Bearer; use X-Api-Key for header-name auth) |
headers |
object | no | Additional HTTP headers |
streamFormat |
string | no | Stream parser override (e.g., "openai-sse", "anthropic-sse") |
modelPrefix |
string | no | Prepended to model name |
models |
string[] | no | Restrict to listed models |
The apiKey field supports ${VAR_NAME} syntax. Claudish expands it from process.env at startup. This avoids hardcoding secrets in config files:
"apiKey": "${MY_CUSTOM_API_KEY}"Claudish validates all customEndpoints entries with Zod at proxy startup. Invalid entries:
- Emit a warning to stderr with the validation error
- Are skipped (not registered)
- Do not prevent the proxy from starting
Each valid custom endpoint calls registerRuntimeProvider() (injects into the provider resolver) and registerRuntimeProfile() (injects into the transport layer). The endpoint name becomes a valid provider shortcut immediately.
For each role slot (opus, sonnet, haiku, subagent), resolution from highest to lowest priority:
- CLI flag:
--model-opus,--model-sonnet,--model-haiku,--model-subagent CLAUDISH_MODEL_OPUS,CLAUDISH_MODEL_SONNET,CLAUDISH_MODEL_HAIKU,CLAUDISH_MODEL_SUBAGENTANTHROPIC_DEFAULT_OPUS_MODEL,ANTHROPIC_DEFAULT_SONNET_MODEL,ANTHROPIC_DEFAULT_HAIKU_MODEL,CLAUDE_CODE_SUBAGENT_MODEL- Profile
modelsfields from active profile (local.claudish.jsonfirst, then global~/.claudish/config.json) - No mapping set: Claude Code uses its own internal defaults for that role
The primary model (--model / CLAUDISH_MODEL / ANTHROPIC_MODEL) is separate from role mappings and determines what provider/model handles the main conversation. Role mappings tell Claude Code which models to use internally for different task types.
Claudish provides specialized support for local inference servers with these behaviors:
- Detected automatically via Ollama's
/api/showendpoint or LM Studio's/v1/modelsendpoint. - Override with
CLAUDISH_CONTEXT_WINDOW=<integer>. - For Ollama, Claudish explicitly sets
options.num_ctxto at least 32768 to prevent Ollama's default 2048-token silent truncation.
The LocalModelQueue (in handlers/shared/local-queue.ts) serializes requests to prevent GPU out-of-memory errors:
- Default: sequential (1 at a time), controlled by
CLAUDISH_LOCAL_MAX_PARALLEL. - Range: 1–8 (values above 8 are capped at 8).
- Disable entirely:
CLAUDISH_LOCAL_QUEUE_ENABLED=false. - Per-model override via concurrency suffix:
ollama@llama3.2:3allows 3 concurrent requests for that model spec. ollama@model:0means no concurrency limit (bypasses the queue).
Local provider requests use extended timeouts (10 minutes for headers + body) to accommodate slow local inference. Default undici headersTimeout of 30s is too short.
For small local models with limited context, --summarize-tools (or CLAUDISH_SUMMARIZE_TOOLS=1) compresses Claude Code's tool descriptions to reduce prompt token usage.
For local Qwen models, setting CLAUDISH_QWEN_NO_THINK=1 prepends /no_think to the system prompt to disable the model's chain-of-thought reasoning mode, reducing latency.
| Path | Purpose | Auto-update Trigger |
|---|---|---|
~/.claudish/config.json |
Global settings, profiles, telemetry, routing | Profile/telemetry commands |
~/.claudish/all-models.json |
Full OpenRouter model catalog | Every 2 days; or --force-update |
~/.claudish/litellm-models-{hash}.json |
LiteLLM model list (one file per unique LITELLM_BASE_URL) |
On each LiteLLM model list fetch |
~/.claudish/kimi-oauth.json |
Kimi OAuth access + refresh tokens | claudish --kimi-login |
~/.claudish/gemini-oauth.json |
Gemini Code Assist OAuth tokens | claudish --gemini-login |
.claudish.json |
Local/project config | Profile commands with --local |
.env |
Environment variables (auto-loaded at startup) | Manual |
Cache files can be force-refreshed with claudish --models --force-update or claudish --top-models --force-update. The --force-update flag deletes all-models.json, pricing-cache.json, and all litellm-models-*.json files before fetching fresh data.
Running claudish --mcp starts Claudish as an MCP server. In this mode, Claudish exposes itself as a tool provider to MCP-compatible clients rather than launching Claude Code.
When routing through aggregators like OpenRouter or LiteLLM, models require vendor-prefixed names (e.g., qwen/qwen3-coder-next) that users should not need to know. The ModelCatalogResolver interface in providers/model-catalog-resolver.ts automatically finds the correct prefix.
How it works:
- User specifies bare model name (e.g.,
or@qwen3-coder-next). - Resolver searches the provider's cached model catalog for an exact suffix match.
- If found, uses the vendor-prefixed ID (e.g.,
qwen/qwen3-coder-next). - If not found in cache, falls back to static map (
OPENROUTER_VENDOR_MAP) for cold starts.
Rules:
- Exact match only; no fuzzy or normalized matching.
- Dynamic catalogs (from provider APIs) are primary; static map is cold-start fallback only.
- Resolution is synchronous (
resolveModelNameSync()) using in-memory cache +readFileSync.
Current resolvers:
- OpenRouter: Searches
_cachedOpenRouterModels+all-models.jsonby exact suffix. - LiteLLM: Searches
litellm-models-{hash}.jsonby exact match and prefix-stripping. - Static fallback:
OPENROUTER_VENDOR_MAPfor OpenRouter when no cache exists.
This reference does NOT cover:
- Claude Code flags: The full list of flags that can be passed through to Claude Code (use
claude --help). Claudish forwards any unrecognized flag automatically. - Cost tracking internals: The detailed algorithm for cost accumulation and the format of cost data files.
- MCP server protocol: The specific MCP tool definitions and protocol details when running in
--mcpmode. - Smoke test configuration: The
scripts/smoke/configuration for provider smoke tests. - Token file format: The internal token counting files used by
writeTokenFilefor the status line display.
# Install / verify
npm install -g claudish
claudish --version
# Interactive mode (model selector appears)
claudish
claudish --free # only free models
claudish -p myprofile # with specific profile
# Single-shot (no model selector)
claudish --model g@gemini-2.0-flash "task"
claudish --model oai@gpt-4o "task"
claudish --model ollama@llama3.2 "task"
# Model role mapping
claudish --model-opus g@gemini-3-pro --model-sonnet oai@gpt-5.3
# Auto-approve + disable sandbox (CI/automation)
claudish -y --dangerous --model g@gemini-2.0-flash "task"
# Debug
claudish --debug --model g@gemini-2.0-flash "task"
# Profile management
claudish init
claudish profile list
claudish profile add --global
claudish profile use myprofile --global
# Model discovery
claudish --models # all models
claudish --models gemini # search
claudish --top-models # curated list
claudish --models --json # JSON output
# OAuth login
claudish --gemini-login
claudish --kimi-login
# Telemetry
claudish telemetry status
claudish telemetry off
This document was generated from direct codebase analysis of Claudish source at packages/cli/src/. Last updated for v7.0.0 (default provider, custom endpoints, routing rules catch-all synthesis). Key files: cli.ts, config.ts, model-parser.ts, provider-resolver.ts, auto-route.ts, remote-provider-registry.ts, profile-config.ts, routing-rules.ts.