fix(proxy): strip Claude Code billing header from prompt for non-Anthropic providers #126
Open
jsboige wants to merge 1 commit into
…ropic providers Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
f962278 to ed89ef7
Summary
Strips the `x-anthropic-billing-header: cc_version=...; cch=XXXXX;` line that Claude Code injects into the system prompt body when the request is not routed to Anthropic's native API.

Problem
Claude Code injects a billing header into the `system` field of every `/v1/messages` request. The `cch=` token changes on every request, which breaks prefix caching on self-hosted inference engines (vLLM, Ollama, LM Studio) that use strict hash matching for KV cache hits. Each request is treated as a unique prefix, resulting in zero cache hits and wasted GPU compute re-processing the same system prompt repeatedly.

Fix
- In the `/v1/messages` handler, after resolving the handler and before forwarding, check if the handler is a `NativeHandler` (Anthropic direct).
- If it is not, strip the billing header line from `body.system` using a regex.
- Handle both the `string` and `array` (block) forms of the `system` field per the Anthropic API spec.

Testing
- Only `packages/cli/src/proxy-server.ts` touched (23 insertions, 0 deletions).
- `NativeHandler` import already exists in the file; no new dependencies.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
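
For reviewers who want a feel for the behavior described under Fix, here is a minimal sketch. It is not the diff in this PR. The names `stripBillingHeader`, `prepareSystem`, `SystemBlock`, and `isNativeAnthropic` are hypothetical, the regex assumes the header occupies a single line starting with `x-anthropic-billing-header:`, and the `cc_version`/`cch` values are placeholders. Dropping a text block that becomes empty after stripping is also an assumption, not something the description states.

```typescript
// Hypothetical sketch of the stripping logic; all names here are illustrative.
type SystemBlock = { type: string; text?: string; [key: string]: unknown };
type SystemField = string | SystemBlock[] | undefined;

// Assumption: the injected header sits on its own line in the system prompt.
const BILLING_HEADER_LINE = /^[ \t]*x-anthropic-billing-header:[^\n]*(?:\r?\n|$)/gm;

function stripFromText(text: string): string {
  return text.replace(BILLING_HEADER_LINE, "");
}

// Handles both forms of `system`: a plain string or an array of content blocks.
function stripBillingHeader(system: SystemField): SystemField {
  if (typeof system === "string") {
    return stripFromText(system);
  }
  if (Array.isArray(system)) {
    const result: SystemBlock[] = [];
    for (const block of system) {
      if (typeof block.text !== "string") {
        result.push(block); // non-text block: leave untouched
        continue;
      }
      const text = stripFromText(block.text);
      if (text === block.text) {
        result.push(block); // nothing stripped: keep the original object
      } else if (text.trim() !== "") {
        result.push({ ...block, text });
      }
      // Assumption: a block that held only the header is dropped entirely.
    }
    return result;
  }
  return system;
}

// Only strip when the request is NOT going to Anthropic's native API.
function prepareSystem(system: SystemField, isNativeAnthropic: boolean): SystemField {
  return isNativeAnthropic ? system : stripBillingHeader(system);
}

// Two requests that differ only in the per-request token (placeholder values).
const first = "x-anthropic-billing-header: cc_version=0.0.0; cch=aaaaa;\nYou are a coding assistant.";
const second = "x-anthropic-billing-header: cc_version=0.0.0; cch=bbbbb;\nYou are a coding assistant.";

console.log(first === second); // false
console.log(prepareSystem(first, false) === prepareSystem(second, false)); // true
```

Before stripping, the two prompts differ from their first line, so an engine that hashes the prompt prefix would treat them as unrelated. After stripping, both reduce to the same string, which is the property prefix caching depends on.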