Add Anthropic caching support to OpenRouter LLM implementation #7492
@ferenci84 this will be awesome! some nitpicks, the completion option is the only important thing
…erenci84/feature/openrouter-anthropic-caching
🎉 This PR is included in version 1.19.0 🎉 (semantic-release bot 📦🚀)
🎉 This PR is included in version 1.22.0 🎉 (semantic-release bot 📦🚀)
Description
Resolves #7479
What Happens Now
When you use OpenRouter with Anthropic models (like `anthropic/claude-sonnet-4`, `anthropic/claude-opus-4`, etc.) and caching is enabled:

- System Message Caching: If `cacheSystemMessage` is true, the system message will be sent with `cache_control: { type: "ephemeral" }`
- Conversation Caching: If `cacheConversation` is true, the last two user messages in the conversation will be cached
- Automatic Processing: OpenRouter automatically handles the caching headers internally - no additional headers needed
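
The annotation rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the type names and the helpers `withCacheControl` and `annotateForCaching` are hypothetical, and the sketch assumes `cache_control` is attached to the last text part of a message's content array.

```typescript
// Hypothetical types and helper names; not the PR's actual implementation.
type CacheControl = { type: "ephemeral" };
type TextPart = { type: "text"; text: string; cache_control?: CacheControl };
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string | TextPart[];
};
interface CachingOptions {
  cacheSystemMessage?: boolean;
  cacheConversation?: boolean;
}

// Convert the message content to text parts and mark the last part cacheable.
function withCacheControl(message: ChatMessage): ChatMessage {
  const parts: TextPart[] =
    typeof message.content === "string"
      ? [{ type: "text", text: message.content }]
      : message.content.map((p) => ({ ...p }));
  const last = parts.pop();
  if (last) {
    parts.push({ ...last, cache_control: { type: "ephemeral" } });
  }
  return { ...message, content: parts };
}

// Annotate the system message and/or the last two user messages.
function annotateForCaching(
  messages: ChatMessage[],
  options: CachingOptions,
): ChatMessage[] {
  // Indices of the last two user messages in the conversation.
  const userIndices = messages
    .map((m, i) => (m.role === "user" ? i : -1))
    .filter((i) => i >= 0)
    .slice(-2);

  return messages.map((m, i) => {
    if (m.role === "system" && options.cacheSystemMessage) {
      return withCacheControl(m);
    }
    if (options.cacheConversation && userIndices.indexOf(i) !== -1) {
      return withCacheControl(m);
    }
    return m; // left untouched
  });
}
```

The input array is not mutated; messages that do not qualify are returned as-is, so earlier user turns and all assistant turns keep their plain string content.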
Example Request Body
When sending a request to OpenRouter with an Anthropic model, the modified body will look like:
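
As an illustration only, a body of this kind might look like the object below. The model ID comes from the description above; the message texts are placeholders, and the placement of `cache_control` on text content parts is an assumption based on OpenRouter's multipart message format rather than a copy of the PR's output.

```typescript
// Illustrative request body; message texts are placeholders.
const exampleBody = {
  model: "anthropic/claude-sonnet-4",
  messages: [
    {
      role: "system",
      content: [
        {
          type: "text",
          text: "You are a helpful assistant.",
          cache_control: { type: "ephemeral" }, // cacheSystemMessage: true
        },
      ],
    },
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "First question",
          cache_control: { type: "ephemeral" }, // one of the last two user messages
        },
      ],
    },
    { role: "assistant", content: "First answer" }, // assistant turns are not annotated
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Second question",
          cache_control: { type: "ephemeral" }, // the most recent user message
        },
      ],
    },
  ],
};

console.log(JSON.stringify(exampleBody, null, 2));
```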
Non-Anthropic Models
When using non-Anthropic models (like GPT-4, Llama, etc.) through OpenRouter, the caching fields are NOT added, ensuring compatibility with all model providers.
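
One way such a guard could be written is sketched below. The function names and the `anthropic/` prefix check are assumptions for illustration; the actual detection logic in this PR may differ.

```typescript
// Hypothetical guard: treats any model ID starting with "anthropic/" as Anthropic.
function isAnthropicModel(model: string | undefined): boolean {
  return (
    typeof model === "string" && model.toLowerCase().startsWith("anthropic/")
  );
}

// Caching fields are added only for Anthropic models with a caching option enabled.
function shouldAddCacheControl(
  model: string | undefined,
  options: { cacheSystemMessage?: boolean; cacheConversation?: boolean },
): boolean {
  const cachingEnabled =
    Boolean(options.cacheSystemMessage) || Boolean(options.cacheConversation);
  return isAnthropicModel(model) && cachingEnabled;
}
```

With a guard like this, a request for a non-Anthropic model ID passes through with its original body, regardless of the caching options.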
Checklist
Tests
OpenRouter.vitest.ts is added.
Summary by cubic
Add Anthropic caching support to OpenRouter for Claude models. When caching is enabled, we annotate the system and last two user messages with cache_control so OpenRouter handles caching; other models are unchanged.