27 commits
c8c2e50
refactor: replace DialectManager class with resolveModelDialect function
MayCXC Mar 24, 2026
78b1f5d
docs: update CLAUDE.md for resolveModelDialect
MayCXC Mar 24, 2026
e16fbc3
refactor: parseResponse on APIFormat, tokenStrategy + transport methods
MayCXC Mar 24, 2026
d7ed62a
refactor: export resolveAPIFormat and resolveTransport factory functions
MayCXC Mar 24, 2026
dbc86db
refactor: add RequestQueue and BaseTransport/ApiKeyTransport/OAuthTra…
MayCXC Mar 24, 2026
8b62997
docs: update CLAUDE.md with new factory functions and transport base
MayCXC Mar 24, 2026
88476f5
refactor: convert all transports to extend BaseTransport hierarchy
MayCXC Mar 24, 2026
32ffd9a
docs: update CLAUDE.md architecture section for v6 transport hierarchy
MayCXC Mar 24, 2026
85eea0e
refactor: move queue hooks to transport method overrides
MayCXC Mar 24, 2026
60c8c1e
refactor: centralize dialect matching rules in resolveModelDialect
MayCXC Mar 24, 2026
a731580
refactor: single source of truth for transport/format/handler
MayCXC Mar 24, 2026
2152dd5
refactor: collapse provider profiles into flat switch
MayCXC Mar 24, 2026
b35fcee
refactor: createHandlerForProvider takes ProviderDefinition directly
MayCXC Mar 24, 2026
b32ce18
refactor: move resolveModelDialect into provider-profiles, delete dia…
MayCXC Mar 24, 2026
1184a05
refactor: remove shouldHandle from all adapters
MayCXC Mar 24, 2026
1920d4a
refactor: single getHandler in proxy-server, all construction via reg…
MayCXC Mar 24, 2026
ef5a8c7
refactor: kimi-coding transport, local adapter subclasses, remove rem…
MayCXC Mar 24, 2026
5e1e00e
refactor: split BaseAPIFormat/GeminiAPIFormat, ZenTransport, pre-comp…
MayCXC Mar 24, 2026
1367d5e
refactor: remove dead getter functions from provider-definitions
MayCXC Mar 24, 2026
cf2463c
docs: update CLAUDE.md and README.md for three-layer provider archite…
MayCXC Mar 24, 2026
e937d3c
refactor: OllamaTransport and LMStudioTransport subclasses
MayCXC Mar 24, 2026
8d110ae
refactor: move model-specific system prompts to dialect prepareRequest
MayCXC Mar 24, 2026
81c7380
refactor: move qwen tool guidance to LocalQwenFormatAdapter
MayCXC Mar 24, 2026
4085f04
refactor: middleware selection via resolver, remove shouldHandle enti…
MayCXC Mar 24, 2026
5546207
fix: review fixes (zen definitions, formatAdapter, BaseModelDialect, …
MayCXC Mar 24, 2026
e3ec863
docs: update CLAUDE.md with middleware and effective resolvers
MayCXC Mar 24, 2026
6636132
feat: GitHub Models provider, user-defined providers, MCP multi-provider
MayCXC Mar 24, 2026
59 changes: 32 additions & 27 deletions CLAUDE.md
@@ -80,41 +80,46 @@ Claudish supports local models via:

Local model APIs (LM Studio, Ollama) report `prompt_tokens` as the **full conversation context** each request, not incremental tokens. The `writeTokenFile` function uses assignment (`=`) not accumulation (`+=`) for input tokens to handle this correctly.
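
A minimal sketch of why assignment matters here. The `TokenTotals` shape and `applyUsage` helper are hypothetical (the real `writeTokenFile` signature is not shown in this diff), and accumulating output tokens is an assumption made only for illustration.

```typescript
// Hypothetical running totals; not the actual token file schema.
interface TokenTotals {
  input: number;
  output: number;
}

function applyUsage(
  totals: TokenTotals,
  promptTokens: number,
  completionTokens: number
): TokenTotals {
  return {
    // Assignment: prompt_tokens already covers the whole conversation so far,
    // so adding it to the previous value would double-count earlier turns.
    input: promptTokens,
    // Assumption for this sketch: completion tokens are per-response and add up.
    output: totals.output + completionTokens,
  };
}

let totals: TokenTotals = { input: 0, output: 0 };
totals = applyUsage(totals, 1200, 80); // first turn
totals = applyUsage(totals, 1500, 60); // second turn; 1500 includes the first turn
console.log(totals);
```

With `+=` on the input side, the second turn would report 2700 input tokens instead of 1500.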

## Three-Layer Adapter Architecture (v5.14.0+)
## Three-Layer Provider Architecture

The translation pipeline has three decoupled layers:
All provider-specific logic lives in `providers/provider-profiles.ts`. Four resolvers and one effective-definition resolver. `createHandlerForProvider(def, modelName, apiKey, targetModel, port, opts)` composes them into a ComposedHandler.

### Layer 1: FormatConverter — wire format translation
Translates between Claude API format and target model's wire format (messages, tools, payload).
Each converter declares its stream format via `getStreamFormat()`.
- **Interface**: `adapters/format-converter.ts`
- **Implementations**: OpenAIAdapter, AnthropicPassthroughAdapter, GeminiAdapter, CodexAdapter, OllamaCloudAdapter, LiteLLMAdapter
- **Message/tool conversion**: `handlers/shared/format/openai-messages.ts`, `openai-tools.ts`
### Layer 1: APIFormat — wire format
- **Base**: `BaseAPIFormat` (`adapters/base-api-format.ts`)
- **Classes**: OpenAIAPIFormat, AnthropicAPIFormat, GeminiAPIFormat, CodexAPIFormat, OllamaAPIFormat, LiteLLMAPIFormat, OpenRouterAPIFormat
- **Resolver**: `resolveAPIFormat(def, modelName)` switches on `def.transport`
- **Key methods**: `convertMessages()`, `convertTools()`, `buildPayload()`, `getStreamFormat()`, `parseResponse()`
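
To illustrate what `parseResponse()` normalizes, here is a standalone sketch modeled on the two implementations added in this PR (the `BaseAPIFormat` default and the `AnthropicAPIFormat` override). The free functions and sample payloads are illustrative only; the real code lives on the format classes.

```typescript
type ParsedResponse = { content: string; usage?: { input: number; output: number } };

// OpenAI Chat Completions shape: choices[0].message.content plus prompt/completion token counts.
function parseOpenAIStyle(data: any): ParsedResponse {
  const content = data?.choices?.[0]?.message?.content ?? "";
  const usage = data?.usage
    ? { input: data.usage.prompt_tokens ?? 0, output: data.usage.completion_tokens ?? 0 }
    : undefined;
  return { content, usage };
}

// Anthropic Messages shape: an array of content blocks plus input/output token counts.
function parseAnthropicStyle(data: any): ParsedResponse {
  const blocks: any[] = data?.content ?? [];
  const content = blocks.map((b) => b.text ?? "").join("");
  const usage = data?.usage
    ? { input: data.usage.input_tokens ?? 0, output: data.usage.output_tokens ?? 0 }
    : undefined;
  return { content, usage };
}

const fromOpenAI = parseOpenAIStyle({
  choices: [{ message: { content: "hi" } }],
  usage: { prompt_tokens: 3, completion_tokens: 1 },
});
const fromAnthropic = parseAnthropicStyle({
  content: [{ type: "text", text: "hi" }],
  usage: { input_tokens: 3, output_tokens: 1 },
});
console.log(fromOpenAI, fromAnthropic);
```

Both calls produce the same normalized object, so callers do not need to know which wire format answered.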

### Layer 2: ModelTranslator — model dialect translation
Translates model-specific dialect differences (context windows, thinking→reasoning_effort, vision rules).
- **Interface**: `adapters/model-translator.ts`
- **Implementations**: GLMAdapter, GrokAdapter, MiniMaxAdapter, DeepSeekAdapter, QwenAdapter, CodexAdapter
- **Selection**: `AdapterManager` auto-selects based on model ID
### Layer 2: ModelDialect — model quirks
- **Base**: `BaseModelDialect` (`adapters/base-model-dialect.ts`)
- **Classes**: GrokModelDialect, GeminiModelDialect, DeepSeekModelDialect, QwenModelDialect, MiniMaxModelDialect, GLMModelDialect, XiaomiModelDialect
- **Resolver**: `resolveModelDialect(modelId)` matches model family patterns
- **Key methods**: `processTextContent()`, `getContextWindow()`, `supportsVision()`, `prepareRequest()`
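
A rough sketch of pattern-based dialect resolution with a `null` fallback. The rule table, the regular expressions, and the `DialectSketch` type are assumptions for illustration; the actual matching rules in `resolveModelDialect` are not shown in this part of the diff.

```typescript
// Hypothetical stand-in for a dialect instance.
interface DialectSketch {
  readonly family: string;
}

// Hypothetical family patterns; the real rules may differ.
const FAMILY_RULES: Array<{ pattern: RegExp; family: string }> = [
  { pattern: /grok/i, family: "grok" },
  { pattern: /deepseek/i, family: "deepseek" },
  { pattern: /qwen/i, family: "qwen" },
];

function resolveDialectSketch(modelId: string): DialectSketch | null {
  const rule = FAMILY_RULES.find((r) => r.pattern.test(modelId));
  // null signals "no dialect quirks", so callers skip dialect processing entirely.
  return rule ? { family: rule.family } : null;
}

// Example model IDs are illustrative.
for (const id of ["x-ai/grok-4", "deepseek-chat", "gpt-4o"]) {
  const label = resolveDialectSketch(id)?.family ?? "none";
  console.log(`${id} -> ${label}`);
}
```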

### Layer 3: ProviderTransport — HTTP transport
Handles auth, endpoints, headers, rate limiting. Optionally overrides stream format for aggregators.
- **Interface**: `providers/transport/types.ts`
- **Stream format override**: LiteLLM and OpenRouter implement `overrideStreamFormat()` → `"openai-sse"`

### Composition in ComposedHandler
```
ComposedHandler = FormatConverter (explicit adapter) + ModelTranslator (auto-selected) + ProviderTransport
```

**Stream parser selection** (3-tier priority):
- **Base**: `BaseTransport` (`providers/transport/base.ts`) owns a `RequestQueue`
- **Subclasses**: `ApiKeyTransport` (Bearer auth), `OAuthTransport` (token refresh)
- **Classes**: OpenAI, Gemini, CodeAssist, Anthropic, OpenRouter, OllamaCloud, LiteLLM, Vertex, Poe, Local, Zen, KimiCoding
- **Resolver**: `resolveTransport(def, modelName, apiKey)` switches on `def.transport`
- **Rate limiting**: `onResponse()`, `shouldRetry()`, `calculateDelay()` as overridable methods (Gemini parses quotaResetDelay, OpenRouter tracks X-RateLimit headers)
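
The overridable rate-limiting hooks might look roughly like the sketch below. Only the three method names come from the list above; the parameter lists, the default backoff, and the `Retry-After` handling are assumptions (the real Gemini and OpenRouter transports read different signals).

```typescript
// Hypothetical base class; the real BaseTransport signatures are not shown here.
class SketchTransport {
  protected maxRetries = 3;

  // Called for every response so subclasses can record rate-limit hints.
  onResponse(_status: number, _headers: Map<string, string>): void {}

  shouldRetry(status: number, attempt: number): boolean {
    return status === 429 && attempt < this.maxRetries;
  }

  // Default: exponential backoff starting at one second.
  calculateDelay(attempt: number): number {
    return 1000 * 2 ** attempt;
  }
}

// Hypothetical subclass that prefers a server-provided delay when one is present.
class RetryAfterTransport extends SketchTransport {
  private serverDelayMs: number | null = null;

  onResponse(_status: number, headers: Map<string, string>): void {
    const seconds = Number(headers.get("retry-after"));
    this.serverDelayMs = Number.isFinite(seconds) && seconds > 0 ? seconds * 1000 : null;
  }

  calculateDelay(attempt: number): number {
    return this.serverDelayMs ?? super.calculateDelay(attempt);
  }
}

const transport = new RetryAfterTransport();
transport.onResponse(429, new Map([["retry-after", "7"]]));
const hinted = transport.calculateDelay(0);
transport.onResponse(429, new Map());
const fallback = transport.calculateDelay(2);
console.log({ hinted, fallback });
```

Keeping the hooks as methods means a provider-specific transport overrides only the piece that differs and inherits the rest.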

### How to add a provider
1. Add entry to `BUILTIN_PROVIDERS` in `provider-definitions.ts` with `transport` type
2. Add case to `resolveTransport` and `resolveAPIFormat` in `provider-profiles.ts`
3. If model has quirks, add case to `resolveModelDialect` and create a ModelDialect class
4. If transport needs custom auth/queue behavior, create a transport subclass extending `BaseTransport`
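
Steps 1 and 2 above can be sketched as a definition entry plus a matching `switch` case. The `"acme"` transport, the `ProviderDefSketch` shape, and the returned class names as strings are hypothetical; the real `ProviderDefinition` has more fields and the resolvers return instances.

```typescript
// Hypothetical transport union; "acme" is the provider being added.
type TransportKind = "openai" | "anthropic" | "acme";

interface ProviderDefSketch {
  name: string;
  transport: TransportKind;
  baseUrl: string;
}

// Step 1: add the definition entry.
const PROVIDERS_SKETCH: ProviderDefSketch[] = [
  { name: "openai", transport: "openai", baseUrl: "https://api.openai.com" },
  { name: "acme", transport: "acme", baseUrl: "https://example.invalid" },
];

// Step 2: add a case to the resolver that switches on def.transport.
function resolveFormatName(def: ProviderDefSketch): string {
  switch (def.transport) {
    case "openai":
      return "OpenAIAPIFormat";
    case "anthropic":
      return "AnthropicAPIFormat";
    case "acme":
      // Assumption: the new provider speaks the OpenAI wire format.
      return "OpenAIAPIFormat";
    default: {
      // Compile-time exhaustiveness check: a new TransportKind without a case fails to type-check.
      const unreachable: never = def.transport;
      throw new Error("Unhandled transport: " + String(unreachable));
    }
  }
}

const resolved = PROVIDERS_SKETCH.map((def) => `${def.name}: ${resolveFormatName(def)}`);
console.log(resolved);
```

The `never` assignment is one way to make a forgotten case a compile error rather than a runtime surprise.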

### Additional resolvers
- `resolveMiddlewares(modelId)`: selects middleware instances (e.g., GeminiThoughtSignatureMiddleware for Gemini models)
- `resolveEffective(def, modelName, apiKey)`: handles definition swaps (zen + minimax) and publicKeyFallback
- `resolveModelDialect(modelId)`: returns `BaseModelDialect | null` (null = no dialect quirks)

### Stream parser selection
```typescript
transport.overrideStreamFormat() ?? modelAdapter.getStreamFormat() ?? providerAdapter.getStreamFormat()
transport.overrideStreamFormat() ?? modelAdapter.getStreamFormat() ?? adapter.getStreamFormat()
```
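
As a worked example of the `??` chain, the sketch below picks the first defined stream format. `pickStreamFormat` and its parameter names are hypothetical; only the `"openai-sse"` and `"anthropic-sse"` values appear in this diff.

```typescript
type StreamFormatSketch = "openai-sse" | "anthropic-sse";

function pickStreamFormat(
  transportOverride: StreamFormatSketch | undefined,
  dialectFormat: StreamFormatSketch | undefined,
  formatDefault: StreamFormatSketch
): StreamFormatSketch {
  // ?? only falls through on null/undefined, so the first layer with an opinion wins.
  return transportOverride ?? dialectFormat ?? formatDefault;
}

// An aggregator transport forces OpenAI SSE regardless of the wire format's default.
const viaAggregator = pickStreamFormat("openai-sse", undefined, "anthropic-sse");
// No override and a pure dialect (returns undefined): the wire format decides.
const direct = pickStreamFormat(undefined, undefined, "anthropic-sse");
console.log({ viaAggregator, direct });
```

This is why `BaseModelDialect.getStreamFormat()` returning `undefined` is enough to defer to the format adapter.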

**Adding a new provider**: Add one entry to `PROVIDER_PROFILES` table in `providers/provider-profiles.ts`.
**Adding a new model**: Create a ModelTranslator adapter, register in `adapters/adapter-manager.ts`.
**Verifying wiring**: `claudish --probe <model>` shows the full adapter composition.
**Verifying wiring**: `claudish --probe <model>` shows the full composition.

### Stream Parsers
Located in `handlers/shared/stream-parsers/`:
12 changes: 7 additions & 5 deletions README.md
@@ -905,18 +905,20 @@ claudish "your prompt"
2. Find available port (random or specified)
3. Start local proxy on http://127.0.0.1:PORT
4. Spawn: claude --auto-approve --env ANTHROPIC_BASE_URL=http://127.0.0.1:PORT
5. Proxy translates: Anthropic API → OpenRouter API
5. Proxy translates via three-layer pipeline:
- ProviderTransport: auth, endpoint, rate limiting
- APIFormat: wire format conversion (OpenAI, Gemini, Anthropic, etc.)
- ModelDialect: model-specific quirks (reasoning filters, tool format)
6. Stream output in real-time
7. Cleanup proxy on exit
```

### Request Flow

**Normal Mode (OpenRouter):**
```
Claude Code → Anthropic API format → Local Proxy → OpenRouter API format → OpenRouter
Claude Code ← Anthropic API format ← Local Proxy ← OpenRouter API format ← OpenRouter
Claude Code → Anthropic API → Local Proxy → [APIFormat + ProviderTransport] → Any Provider
Claude Code ← Anthropic API ← Local Proxy ← [Stream Parser] ← Any Provider
```

**Monitor Mode (Anthropic Passthrough):**
6 changes: 4 additions & 2 deletions packages/cli/scripts/smoke/providers.ts
@@ -7,7 +7,7 @@
*/

import type { RemoteProvider } from "../../src/handlers/shared/remote-provider-types.js";
import { getRegisteredRemoteProviders } from "../../src/providers/remote-provider-registry.js";
import { getAllProviders, toRemoteProvider } from "../../src/providers/provider-definitions.js";
import type { SmokeProviderConfig, WireFormat } from "./types.js";

// Providers to skip in v1 smoke tests
@@ -154,7 +154,9 @@ function getBaseUrl(provider: RemoteProvider): string {
* @returns Array of SmokeProviderConfig for providers ready to test.
*/
export function discoverProviders(filterName?: string): SmokeProviderConfig[] {
const all = getRegisteredRemoteProviders();
const all = getAllProviders()
.filter((def) => !def.isLocal && def.baseUrl !== "" && def.name !== "qwen" && def.name !== "native-anthropic")
.map(toRemoteProvider);

return all
.filter((p) => {
13 changes: 9 additions & 4 deletions packages/cli/src/adapters/anthropic-api-format.ts
@@ -26,10 +26,6 @@ export class AnthropicAPIFormat extends BaseAPIFormat {
};
}

shouldHandle(modelId: string): boolean {
return false; // Not auto-selected; always explicitly passed
}

getName(): string {
return "AnthropicAPIFormat";
}
@@ -114,6 +110,15 @@ export class AnthropicAPIFormat extends BaseAPIFormat {
return "anthropic-sse";
}

override parseResponse(data: any): { content: string; usage?: { input: number; output: number } } {
const blocks = data?.content ?? [];
const content = blocks.map((b: any) => b.text ?? "").join("");
const usage = data?.usage
? { input: data.usage.input_tokens ?? 0, output: data.usage.output_tokens ?? 0 }
: undefined;
return { content, usage };
}

override getContextWindow(): number {
// Try catalog lookup first (handles kimi/minimax model name variants)
const catalogEntry = lookupModel(this.modelId);
4 changes: 4 additions & 0 deletions packages/cli/src/adapters/api-format.ts
@@ -34,4 +34,8 @@ export interface APIFormat {
textContent: string,
accumulatedText: string
): import("./base-api-format.js").AdapterResult;

/** Parse a non-streaming response body into content + usage.
* Default handles OpenAI format. Override for other wire formats. */
parseResponse(data: any): { content: string; usage?: { input: number; output: number } };
}
23 changes: 14 additions & 9 deletions packages/cli/src/adapters/base-api-format.ts
@@ -68,11 +68,6 @@ export abstract class BaseAPIFormat implements APIFormat, ModelDialect {
*/
abstract processTextContent(textContent: string, accumulatedText: string): AdapterResult;

/**
* Check if this format/dialect should be used for the given model
*/
abstract shouldHandle(modelId: string): boolean;

/**
* Get name for logging
*/
@@ -172,6 +167,20 @@ export abstract class BaseAPIFormat implements APIFormat, ModelDialect {
return "openai-sse";
}

/**
* Parse a non-streaming response into content + usage.
* Default handles OpenAI Chat Completions format.
* Override for Gemini, Anthropic, etc.
*/
parseResponse(data: any): { content: string; usage?: { input: number; output: number } } {
const choice = data?.choices?.[0];
const content = choice?.message?.content ?? "";
const usage = data?.usage
? { input: data.usage.prompt_tokens ?? 0, output: data.usage.completion_tokens ?? 0 }
: undefined;
return { content, usage };
}

/**
* Context window size for this model (tokens).
* Used for token tracking and context-left-percent calculation.
@@ -257,10 +266,6 @@ export class DefaultAPIFormat extends BaseAPIFormat {
};
}

shouldHandle(modelId: string): boolean {
return false; // Default is fallback
}

getName(): string {
return "DefaultAPIFormat";
}
84 changes: 84 additions & 0 deletions packages/cli/src/adapters/base-model-dialect.ts
@@ -0,0 +1,84 @@
/**
* BaseModelDialect — abstract base for model dialect implementations (Layer 2).
*
* Model dialects handle per-model-family quirks: context windows, parameter mappings
* (thinking -> reasoning_effort), vision support, tool name limits, and text
* post-processing (reasoning filters, XML parsing, token stripping).
*
* This class is for pure dialect adapters that do NOT define a wire format.
* Wire format adapters (OpenAI, Gemini, Anthropic, etc.) extend BaseAPIFormat instead.
*/

import type { ModelDialect } from "./model-dialect.js";
import type { StreamFormat } from "../providers/transport/types.js";
import type { AdapterResult } from "./base-api-format.js";
import type { ModelPricing } from "../handlers/shared/remote-provider-types.js";
import { getModelPricing } from "../handlers/shared/remote-provider-types.js";

export abstract class BaseModelDialect implements ModelDialect {
protected modelId: string;

constructor(modelId: string) {
this.modelId = modelId;
}

/**
* Process text content and extract any model-specific tool call formats.
*/
abstract processTextContent(textContent: string, accumulatedText: string): AdapterResult;

/**
* Get name for logging.
*/
abstract getName(): string;

/**
* Maximum tool name length allowed by this model's API.
* Returns null if no limit (default).
*/
getToolNameLimit(): number | null {
return null;
}

/**
* Handle any request preparation before sending to the model.
*/
prepareRequest(request: any, originalRequest: any): any {
return request;
}

/**
* Reset internal state between requests.
*/
reset(): void {}

/**
* Context window size for this model (tokens).
*/
getContextWindow(): number {
return 200_000;
}

/**
* Whether this model supports vision/image input.
* Default false; override in dialects for models with vision support.
*/
supportsVision(): boolean {
return false;
}

/**
* Stream format: dialects have no opinion (wire format is the adapter's job).
* Returns undefined so ComposedHandler falls through to the format adapter.
*/
getStreamFormat(): StreamFormat | undefined {
return undefined;
}

/**
* Pricing info for this model.
*/
getPricing(providerName: string): ModelPricing {
return getModelPricing(providerName, this.modelId);
}
}
6 changes: 1 addition & 5 deletions packages/cli/src/adapters/codex-api-format.ts
@@ -11,7 +11,7 @@
* This format handles Codex models only. All other OpenAI models use OpenAIAPIFormat.
*/

import { BaseAPIFormat, type AdapterResult, matchesModelFamily } from "./base-api-format.js";
import { BaseAPIFormat, type AdapterResult } from "./base-api-format.js";
import type { StreamFormat } from "../providers/transport/types.js";

export class CodexAPIFormat extends BaseAPIFormat {
@@ -27,10 +27,6 @@ export class CodexAPIFormat extends BaseAPIFormat {
};
}

shouldHandle(modelId: string): boolean {
return matchesModelFamily(modelId, "codex");
}

getName(): string {
return "CodexAPIFormat";
}
9 changes: 3 additions & 6 deletions packages/cli/src/adapters/deepseek-model-dialect.ts
@@ -5,10 +5,11 @@
* - Strips unsupported thinking params (DeepSeek thinks automatically)
*/

import { BaseAPIFormat, AdapterResult, matchesModelFamily } from "./base-api-format.js";
import { BaseModelDialect } from "./base-model-dialect.js";
import type { AdapterResult } from "./base-api-format.js";
import { log } from "../logger.js";

export class DeepSeekModelDialect extends BaseAPIFormat {
export class DeepSeekModelDialect extends BaseModelDialect {
processTextContent(textContent: string, accumulatedText: string): AdapterResult {
return {
cleanedText: textContent,
@@ -35,10 +36,6 @@ export class DeepSeekModelDialect extends BaseAPIFormat {
return request;
}

shouldHandle(modelId: string): boolean {
return matchesModelFamily(modelId, "deepseek");
}

getName(): string {
return "DeepSeekModelDialect";
}
64 changes: 0 additions & 64 deletions packages/cli/src/adapters/dialect-manager.ts

This file was deleted.
