extractOpenAIUsage in the OpenAI→Claude response translator (internal/translator/openai/claude/openai_claude_response.go:722-740) subtracts cached_tokens from prompt_tokens before reporting input_tokens to downstream clients.
This causes clients that rely on input_tokens for context window tracking (Claude Code, Factory Droid, etc.) to see near-zero values on cache hits, breaking compaction triggers.
Example
API returns: prompt_tokens=150000, cached_tokens=149900
Current behavior: input_tokens=100 (subtracted), cache_read_input_tokens=149900
Expected behavior: input_tokens=150000, cache_read_input_tokens=149900
Fix
Remove the subtraction block. input_tokens should report the full prompt_tokens value. cache_read_input_tokens is already set separately for clients that need the breakdown.
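To make the difference concrete, here is a minimal, self-contained sketch of the two mappings. It is not the code from openai_claude_response.go: translateUsage, the struct types, and the subtractCached flag are hypothetical names for illustration, and the nesting of cached_tokens under prompt_tokens_details is assumed from the OpenAI Chat Completions usage shape.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// openAIUsage mirrors the subset of an OpenAI-style usage object discussed here.
// Assumption: cached_tokens is nested under prompt_tokens_details.
type openAIUsage struct {
	PromptTokens        int64 `json:"prompt_tokens"`
	CompletionTokens    int64 `json:"completion_tokens"`
	PromptTokensDetails struct {
		CachedTokens int64 `json:"cached_tokens"`
	} `json:"prompt_tokens_details"`
}

// claudeUsage is the Claude-style usage object reported to downstream clients.
type claudeUsage struct {
	InputTokens          int64 `json:"input_tokens"`
	OutputTokens         int64 `json:"output_tokens"`
	CacheReadInputTokens int64 `json:"cache_read_input_tokens"`
}

// translateUsage is a hypothetical stand-in for extractOpenAIUsage.
// subtractCached=true reproduces the reported behavior; false is the proposed fix.
func translateUsage(raw []byte, subtractCached bool) (claudeUsage, error) {
	var u openAIUsage
	if err := json.Unmarshal(raw, &u); err != nil {
		return claudeUsage{}, err
	}
	input := u.PromptTokens
	cached := u.PromptTokensDetails.CachedTokens
	if subtractCached && cached > 0 {
		input -= cached
		if input < 0 {
			input = 0 // clamp in case the upstream reports inconsistent counts
		}
	}
	return claudeUsage{
		InputTokens:          input,
		OutputTokens:         u.CompletionTokens,
		CacheReadInputTokens: cached,
	}, nil
}

func main() {
	// Values taken from the example in this issue.
	raw := []byte(`{"prompt_tokens":150000,"prompt_tokens_details":{"cached_tokens":149900}}`)
	for _, subtract := range []bool{true, false} {
		out, err := translateUsage(raw, subtract)
		if err != nil {
			panic(err)
		}
		fmt.Printf("subtract=%v input_tokens=%d cache_read_input_tokens=%d\n",
			subtract, out.InputTokens, out.CacheReadInputTokens)
	}
}
```

With the subtraction enabled the sketch reports input_tokens=100; without it, input_tokens=150000. cache_read_input_tokens is 149900 in both cases, so clients that want the breakdown lose nothing.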
Impact
Affects BYOK setups where Claude Code or similar clients talk to non-Claude upstreams (Codex, Copilot) through the proxy
Claude Code's auto-compaction never fires because usage.input_tokens appears near-zero
Context window grows unbounded until hitting the model's hard limit
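The following sketch shows why a threshold-based compaction trigger stays silent under the current behavior. It does not reflect Claude Code's actual logic, which is not described in this issue: shouldCompact, the 200,000-token context window, and the 0.7 threshold are all assumptions chosen for illustration.

```go
package main

import "fmt"

// shouldCompact is a hypothetical client-side check: compact once the reported
// input size reaches a given fraction of the model's context window.
func shouldCompact(inputTokens, contextWindow int64, threshold float64) bool {
	return float64(inputTokens) >= threshold*float64(contextWindow)
}

func main() {
	const contextWindow = 200_000 // assumed context window size
	const threshold = 0.7         // assumed compaction threshold

	// input_tokens as currently reported on a cache hit (cached tokens subtracted).
	fmt.Println(shouldCompact(100, contextWindow, threshold)) // false

	// input_tokens as expected (full prompt_tokens).
	fmt.Println(shouldCompact(150_000, contextWindow, threshold)) // true
}
```

Any client that compares usage.input_tokens against a limit in this way would see the first case on every cache hit, regardless of how large the real prompt has grown.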
Version
CLIProxyAPIPlus v6.9.10-1-plus (commit 516d22c)