fix: enable LiteLLM virtual key cost tracking for default agent #604

Closed
simonrosenberg wants to merge 1 commit into main from fix/virtual-key-default-agent

@simonrosenberg
Collaborator

Summary

  • Add apply_virtual_key() helper to litellm_proxy.py that returns an LLM config copy with the per-instance virtual key as api_key
  • Update all 9 benchmarks to use it when creating the default (non-ACP) Agent
  • Thread-safe: uses model_copy() + threading.local (no shared state mutation)
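
The helper described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `StubLLM` is a hypothetical stdlib stand-in for the SDK's pydantic-based `LLM` config, its `model_copy()` only mimics pydantic's `model_copy(update=...)`, and `set_current_virtual_key()` is an assumed setter (the diff only shows the getter). The real helper also wraps the key in `SecretStr`, which is omitted here.

```python
import threading
from dataclasses import dataclass, replace
from typing import Optional

# Per-thread storage: each worker thread sees only the key it set itself.
_thread_local = threading.local()


@dataclass(frozen=True)
class StubLLM:
    """Hypothetical stand-in for the SDK's LLM config (a pydantic model in the PR)."""

    model: str
    api_key: str

    def model_copy(self, update: dict) -> "StubLLM":
        # Mimics pydantic's model_copy(update=...): returns a new object
        # and leaves the original untouched.
        return replace(self, **update)


def set_current_virtual_key(key: Optional[str]) -> None:
    # Assumed setter for illustration; not shown in the PR diff.
    _thread_local.virtual_key = key


def get_current_virtual_key() -> Optional[str]:
    return getattr(_thread_local, "virtual_key", None)


def apply_virtual_key(llm: StubLLM) -> StubLLM:
    """Return a copy of llm that uses the current thread's virtual key."""
    virtual_key = get_current_virtual_key()
    if virtual_key is None:
        return llm  # no virtual key active: hand back the same object
    return llm.model_copy(update={"api_key": virtual_key})
```

Because the helper returns a copy rather than mutating its argument, a benchmark can share one base LLM config across worker threads and call `apply_virtual_key()` just before constructing each Agent.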

Why

The evaluation orchestrator already creates per-instance LiteLLM virtual keys for every run (evaluation.py:850), but only ACP agents injected them into their LLM client (via build_acp_agent() in acp.py). The default OpenHands agent used the master API key directly, so proxy_cost was always $0.00 for non-ACP runs.

This was discovered during PR validation runs for OpenHands/software-agent-sdk#2656 (eval_limit=5, 3 benchmarks × 3 agent types). The evidence:

| Agent Type | SDK Cost | Proxy Cost | Issue |
|------------|----------|------------|-------|
| acp-claude | $0.21 | $0.20 | Both work |
| acp-gemini | $0.00 | $7.08 | SDK blind (separate issue) |
| default | $0.20 | $0.00 | Proxy blind (this fix) |

Test plan

  • Run a default-agent evaluation and verify test_result.proxy_cost is non-zero
  • Run an ACP evaluation and verify behavior is unchanged (no regression)
  • Verify apply_virtual_key() is a no-op when no virtual key is active (returns original LLM unchanged)

Fixes #603

🤖 Generated with Claude Code

The evaluation orchestrator already creates per-instance LiteLLM virtual
keys, but only ACP agents injected them into their LLM client. The
default OpenHands agent used the master API key directly, so proxy_cost
was always $0.00 for non-ACP runs.

Add `apply_virtual_key()` to litellm_proxy.py that returns a copy of
the LLM config with the virtual key as api_key (thread-safe via
model_copy + threading.local). Update all benchmarks to use it when
creating the default Agent.
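
The thread-safety claim rests on `threading.local` giving every thread its own attribute namespace. A small sketch of that isolation, assuming a getter shaped like the one in the diff; `worker()` and `run_isolation_check()` are hypothetical names used only for this illustration:

```python
import threading
from typing import Dict, Optional, Tuple

_thread_local = threading.local()


def get_current_virtual_key() -> Optional[str]:
    return getattr(_thread_local, "virtual_key", None)


def worker(key: str, seen: Dict[str, Optional[str]]) -> None:
    # Each thread stores its own key, then reads it back.
    _thread_local.virtual_key = key
    seen[key] = get_current_virtual_key()


def run_isolation_check() -> Tuple[Dict[str, Optional[str]], Optional[str]]:
    seen: Dict[str, Optional[str]] = {}
    threads = [
        threading.Thread(target=worker, args=(f"vk-{i}", seen)) for i in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The calling thread never set a key, so it should still see None.
    return seen, get_current_virtual_key()
```

Each worker reads back only the key it wrote, and the calling thread sees no key at all, which is why one evaluation instance's key cannot leak into another's LLM config.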

Fixes #603

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Collaborator

@all-hands-bot left a comment

Taste Rating: 🟡 Acceptable - Pragmatic fix for a real production issue. The design is solid: simple helper, thread-safe, backwards compatible. The core logic is good, but violates project standards on type hints.

Verdict: ✅ Worth merging after fixing type hints - The solution is fundamentally sound, just needs polish.

Key Insight: This is a textbook example of solving a real problem simply—one function, eight lines, no special cases. Just needs proper type annotations per project standards.

    return getattr(_thread_local, "virtual_key", None)


def apply_virtual_key(llm):  # type: ignore[no-untyped-def]
Collaborator

🟠 Important: Missing type hints violates project standards.

Per AGENTS.md: "Avoid # type: ignore unless absolutely necessary". The LLM type is available from the SDK:

Suggested change
- def apply_virtual_key(llm):  # type: ignore[no-untyped-def]
+ def apply_virtual_key(llm: LLM) -> LLM:
      """Return an LLM config copy with the per-instance virtual key as api_key.

You'll need to add `from openhands.sdk.llm import LLM` at the top of the file.

    virtual_key = get_current_virtual_key()
    if virtual_key is None:
        return llm
    from pydantic import SecretStr
Collaborator

🟡 Suggestion: Move import to top of file.

Per project guidelines: "Place all imports at the top of the file unless... circular imports, conditional imports, or imports that need to be delayed for specific reasons."

No circular import risk here since pydantic is external. Move this to the import block at the top:

from pydantic import SecretStr

Development

Successfully merging this pull request may close these issues.

Cost tracking discrepancies: SDK vs LiteLLM proxy virtual key costs diverge by agent type

2 participants