feat: OpenRouter reasoning support with multi-turn pass-back#4
Open
82deutschmark wants to merge 1 commit intov2-multi-modelfrom
Open
feat: OpenRouter reasoning support with multi-turn pass-back#482deutschmark wants to merge 1 commit intov2-multi-modelfrom
82deutschmark wants to merge 1 commit intov2-multi-modelfrom
Conversation
Send reasoning: {enabled: true} in the request payload for OpenRouter
models so they return reasoning tokens. Preserve reasoning_details in
assistant messages for multi-turn conversations. Add --no-reasoning
flag to disable this when reasoning adds unwanted overhead.
Localhost llama.cpp calls are unaffected (reasoning_enabled defaults
to False in LLMClient, only set True via create_client for openrouter://).
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Adds proper OpenRouter reasoning support to the bench:
Compaction and reasoning tokens
Verified: OpenRouter includes reasoning tokens in completion_tokens and total_tokens. The compaction trigger (total_now >= token_limit) already counts reasoning tokens — no changes needed.
Previous fix included
Branch includes commit 7ce8d14 which reads both reasoning_content (llama.cpp) and reasoning (OpenRouter) from streaming deltas.