agent: pass adaptive thinking config to SDK query() #104
Merged
3 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,87 @@ | ||
| // Coverage for every cell of the model x thinking-shape matrix the live | ||
| // Messages API enforces (verified 2026-04-29). The helper is the single | ||
| // source of truth for SDK `query()` thinking options across chat, judge, | ||
| // runtime, and reflection callsites. | ||
| | ||
| import { describe, expect, test } from "bun:test"; | ||
| import { JUDGE_MODEL_HAIKU, JUDGE_MODEL_OPUS, JUDGE_MODEL_SONNET } from "../../evolution/judge-models.ts"; | ||
| import { getThinkingConfig } from "../thinking-config.ts"; | ||
| | ||
| describe("getThinkingConfig", () => { | ||
| test("Opus 4.7 returns adaptive (Opus 4.7 rejects manual enabled with 400)", () => { | ||
| expect(getThinkingConfig(JUDGE_MODEL_OPUS)).toEqual({ type: "adaptive" }); | ||
| expect(getThinkingConfig("claude-opus-4-7")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("Opus 4.6 returns adaptive (recommended; manual is deprecated)", () => { | ||
| expect(getThinkingConfig("claude-opus-4-6")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("Sonnet 4.6 returns adaptive (recommended; manual still functional)", () => { | ||
| expect(getThinkingConfig(JUDGE_MODEL_SONNET)).toEqual({ type: "adaptive" }); | ||
| expect(getThinkingConfig("claude-sonnet-4-6")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("Mythos preview returns adaptive", () => { | ||
| expect(getThinkingConfig("claude-mythos-preview")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("Haiku 4.5 returns enabled + budgetTokens (Haiku rejects adaptive with 400)", () => { | ||
| const config = getThinkingConfig(JUDGE_MODEL_HAIKU); | ||
| expect(config.type).toBe("enabled"); | ||
| if (config.type === "enabled") { | ||
| expect(config.budgetTokens).toBeGreaterThan(0); | ||
| } | ||
| }); | ||
| | ||
| test("older Haiku 3.x returns enabled + budgetTokens", () => { | ||
| const config = getThinkingConfig("claude-haiku-3-5"); | ||
| expect(config.type).toBe("enabled"); | ||
| }); | ||
| | ||
| test("older Sonnet 3.x returns enabled + budgetTokens", () => { | ||
| const config = getThinkingConfig("claude-sonnet-3-7"); | ||
| expect(config.type).toBe("enabled"); | ||
| }); | ||
| | ||
| test("legacy Opus 4.5 returns enabled + budgetTokens", () => { | ||
| const config = getThinkingConfig("claude-opus-4-5"); | ||
| expect(config.type).toBe("enabled"); | ||
| }); | ||
| | ||
| test("undefined model defaults to adaptive (safe for all new models)", () => { | ||
| expect(getThinkingConfig(undefined)).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("null model defaults to adaptive", () => { | ||
| expect(getThinkingConfig(null)).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("empty string defaults to adaptive", () => { | ||
| expect(getThinkingConfig("")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("unknown future model defaults to adaptive", () => { | ||
| // Every new model since Opus 4.7 has been adaptive-only, so when | ||
| // we do not recognise the prefix we send adaptive. A wrong guess | ||
| // returns a clear 400 with the required shape, which is preferable | ||
| // to silent breakage in reflection. | ||
| expect(getThinkingConfig("claude-future-model-2027")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("provider-prefixed names fall through to the adaptive default", () => { | ||
| // Some operators set `model: "anthropic/claude-haiku-4-5"` via | ||
| // LiteLLM. The helper currently does prefix-match on the bare | ||
| // Anthropic id. If a slash-prefix is used, we fall through to | ||
| // adaptive default, which is the safer of the two failure modes | ||
| // (adaptive will 400 with a clear error on Haiku rather than | ||
| // silently downgrading thinking). | ||
| expect(getThinkingConfig("anthropic/claude-haiku-4-5")).toEqual({ type: "adaptive" }); | ||
| }); | ||
| | ||
| test("returned object is a fresh value (callers may spread it)", () => { | ||
| const a = getThinkingConfig(JUDGE_MODEL_OPUS); | ||
| const b = getThinkingConfig(JUDGE_MODEL_OPUS); | ||
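| expect(a).toEqual(b); | ||
| // Equal contents are not enough to prove freshness; also require distinct references. | ||
| expect(a).not.toBe(b); | ||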
| }); | ||
| }); |
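
The cases above could also be written as one table-driven check. The following is a minimal sketch, not part of the PR: it carries a condensed local copy of the helper and a local stand-in for the SDK's `ThinkingConfig` type so that it runs on its own, and `MATRIX` and `findMismatches` are hypothetical names introduced for illustration. The `JUDGE_MODEL_*` constants are left out because their values are not shown in this diff.

```typescript
// Local stand-in for the SDK's ThinkingConfig type (assumption: the real type
// is a union that includes at least these two shapes).
type ThinkingConfig =
  | { type: "adaptive" }
  | { type: "enabled"; budgetTokens: number };

// Condensed copy of the helper from this PR so the sketch is self-contained.
const ADAPTIVE_PREFIXES: readonly string[] = [
  "claude-opus-4-7",
  "claude-opus-4-6",
  "claude-sonnet-4-6",
  "claude-mythos",
];
const MANUAL_ONLY_PREFIXES: readonly string[] = [
  "claude-haiku-4",
  "claude-haiku-3",
  "claude-sonnet-3",
  "claude-sonnet-4-5",
  "claude-opus-4-5",
];

function getThinkingConfig(model: string | undefined | null): ThinkingConfig {
  if (!model) return { type: "adaptive" };
  if (ADAPTIVE_PREFIXES.some((p) => model.startsWith(p))) return { type: "adaptive" };
  if (MANUAL_ONLY_PREFIXES.some((p) => model.startsWith(p))) {
    return { type: "enabled", budgetTokens: 8192 };
  }
  return { type: "adaptive" };
}

// One row per cell covered by the tests: [model id, expected thinking type].
const MATRIX: ReadonlyArray<[string | null | undefined, ThinkingConfig["type"]]> = [
  ["claude-opus-4-7", "adaptive"],
  ["claude-opus-4-6", "adaptive"],
  ["claude-sonnet-4-6", "adaptive"],
  ["claude-mythos-preview", "adaptive"],
  ["claude-haiku-4-5", "enabled"],
  ["claude-haiku-3-5", "enabled"],
  ["claude-sonnet-3-7", "enabled"],
  ["claude-opus-4-5", "enabled"],
  [undefined, "adaptive"],
  [null, "adaptive"],
  ["", "adaptive"],
  ["claude-future-model-2027", "adaptive"],
  ["anthropic/claude-haiku-4-5", "adaptive"],
];

// Returns a description of every row whose actual type differs from the expected one.
function findMismatches(): string[] {
  return MATRIX
    .filter(([model, expected]) => getThinkingConfig(model).type !== expected)
    .map(([model, expected]) => `${String(model)}: expected ${expected}`);
}

// An empty array means every cell matched.
console.log(findMismatches());
```

Keeping the matrix as data makes it cheap to add a row when a new model ships, at the cost of losing the per-case rationale that the individual test names carry.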
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| // Single-source-of-truth picker for the Agent SDK `thinking` option. | ||
| // | ||
| // The matrix is non-uniform across models: | ||
| // - Opus 4.7 only accepts `{ type: "adaptive" }`. Manual `enabled + | ||
| // budget_tokens` is rejected with a 400. | ||
| // - Haiku 4.5 only accepts `{ type: "enabled", budget_tokens: N }`. | ||
| // Adaptive is rejected with a 400. | ||
| // - Sonnet 4.6 accepts both shapes (manual is deprecated but still | ||
| // functional). | ||
| // | ||
| // Verified against the live Messages API on 2026-04-29; see the design | ||
| // note at local/2026-04-29-thinking-config-design.md (local-only). | ||
| // | ||
| // Every SDK `query()` callsite spreads `getThinkingConfig(model)` instead | ||
| // of hard-coding a single shape, so reflection (Haiku tier), chat (Opus | ||
| // tier), judges (Sonnet tier), and the AgentRuntime path all pick the | ||
| // correct shape. New models default to adaptive because every model | ||
| // Anthropic has shipped since 4.7 only accepts adaptive. | ||
| | ||
| import type { ThinkingConfig } from "@anthropic-ai/claude-agent-sdk"; | ||
| | ||
| const ADAPTIVE_PREFIXES: readonly string[] = [ | ||
| "claude-opus-4-7", | ||
| "claude-opus-4-6", | ||
| "claude-sonnet-4-6", | ||
| "claude-mythos", | ||
| ]; | ||
| | ||
| const MANUAL_ONLY_PREFIXES: readonly string[] = [ | ||
| "claude-haiku-4", | ||
| "claude-haiku-3", | ||
| "claude-sonnet-3", | ||
| "claude-sonnet-4-5", | ||
| "claude-opus-4-5", | ||
| ]; | ||
| | ||
| const MANUAL_BUDGET_TOKENS = 8192; | ||
| | ||
| export function getThinkingConfig(model: string | undefined | null): ThinkingConfig { | ||
| if (!model) return { type: "adaptive" }; | ||
| if (ADAPTIVE_PREFIXES.some((p) => model.startsWith(p))) { | ||
| return { type: "adaptive" }; | ||
| } | ||
| if (MANUAL_ONLY_PREFIXES.some((p) => model.startsWith(p))) { | ||
| return { type: "enabled", budgetTokens: MANUAL_BUDGET_TOKENS }; | ||
| } | ||
| // Unknown model: prefer adaptive. Every model Anthropic has released | ||
| // since Opus 4.7 only accepts adaptive, so a new variant is far more | ||
| // likely to require adaptive than to require manual mode. If wrong, | ||
| // the API returns a clear 400 error with the required shape. | ||
| return { type: "adaptive" }; | ||
| } | ||
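
For readers who want to see how a callsite might consume the helper, here is a rough sketch under stated assumptions. `QueryOptionsSketch` and `buildQueryOptions` are hypothetical names and do not reflect the SDK's real options type, the `ThinkingConfig` union is a local stand-in for the SDK import, and the helper is trimmed to a single prefix to keep the example short. The actual callsites in this PR are not shown in the diff and may attach the config differently.

```typescript
// Local stand-in for the SDK's ThinkingConfig type (assumption).
type ThinkingConfig =
  | { type: "adaptive" }
  | { type: "enabled"; budgetTokens: number };

// Trimmed stand-in for the helper in this diff: only the Haiku 4.x prefix is
// kept. Every other input, including undefined, falls back to adaptive.
function getThinkingConfig(model: string | undefined | null): ThinkingConfig {
  if (model && model.startsWith("claude-haiku-4")) {
    return { type: "enabled", budgetTokens: 8192 };
  }
  return { type: "adaptive" };
}

// Hypothetical options shape used only for this illustration.
interface QueryOptionsSketch {
  model: string | undefined;
  thinking: ThinkingConfig;
  maxTurns?: number;
}

// Hypothetical callsite helper: derive `thinking` from the model in one place
// so no caller hard-codes a single shape.
function buildQueryOptions(
  model: string | undefined,
  extra: { maxTurns?: number } = {},
): QueryOptionsSketch {
  return { ...extra, model, thinking: getThinkingConfig(model) };
}

// A Haiku-tier caller gets the manual shape, an Opus-tier caller gets adaptive.
const reflection = buildQueryOptions("claude-haiku-4-5", { maxTurns: 1 });
const chat = buildQueryOptions("claude-opus-4-7");
console.log(reflection.thinking, chat.thinking);
```

Routing every caller through one builder means a change to the model matrix touches only the helper, which appears to be the intent of the single-source-of-truth comment at the top of the file.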
`getThinkingConfig` only checks `claude-haiku-3*` / `claude-sonnet-3*` prefixes, but Anthropic's actual 3.x IDs are `claude-3-5-haiku-*` and `claude-3-7-sonnet-*`; those values currently fall through to the adaptive default and will send `thinking: { type: "adaptive" }` to models that require manual thinking (`enabled` + budget), causing runtime 400s for valid legacy model configurations. This regression is easy to trigger by setting `model` to a real 3.x ID (e.g. `claude-3-5-haiku-20241022`).
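
One possible way to address this comment is to add the version-first 3.x prefixes to the manual-only list. The sketch below is not the change that was merged; `getThinkingConfigPatched` is a hypothetical name, the `ThinkingConfig` union is a local stand-in for the SDK type, and the choice to send the manual shape to these IDs simply follows the review comment's claim rather than an independent check against the API.

```typescript
// Local stand-in for the SDK's ThinkingConfig type (assumption).
type ThinkingConfig =
  | { type: "adaptive" }
  | { type: "enabled"; budgetTokens: number };

const ADAPTIVE_PREFIXES: readonly string[] = [
  "claude-opus-4-7",
  "claude-opus-4-6",
  "claude-sonnet-4-6",
  "claude-mythos",
];

const MANUAL_ONLY_PREFIXES: readonly string[] = [
  "claude-haiku-4",
  "claude-sonnet-4-5",
  "claude-opus-4-5",
  // 3.x IDs put the version before the tier, e.g. claude-3-5-haiku-20241022.
  "claude-3-5-haiku",
  "claude-3-7-sonnet",
  // Spellings from the original PR, kept so existing configs and tests still match.
  "claude-haiku-3",
  "claude-sonnet-3",
];

const MANUAL_BUDGET_TOKENS = 8192;

// Hypothetical patched variant of the helper; the lookup logic is unchanged,
// only the manual-only prefix list is extended.
function getThinkingConfigPatched(model: string | undefined | null): ThinkingConfig {
  if (!model) return { type: "adaptive" };
  if (ADAPTIVE_PREFIXES.some((p) => model.startsWith(p))) {
    return { type: "adaptive" };
  }
  if (MANUAL_ONLY_PREFIXES.some((p) => model.startsWith(p))) {
    return { type: "enabled", budgetTokens: MANUAL_BUDGET_TOKENS };
  }
  return { type: "adaptive" };
}

console.log(getThinkingConfigPatched("claude-3-5-haiku-20241022"));
console.log(getThinkingConfigPatched("claude-3-7-sonnet-20250219"));
```

A matching test would pass a dated 3.x ID such as `claude-3-5-haiku-20241022` rather than the `claude-haiku-3-5` spelling, since the latter is what let the gap go unnoticed.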