Conversation

markpollack (Member)

Implement comprehensive prompt caching support for Anthropic Claude models in Spring AI:

Core Implementation:

  • Add AnthropicCacheStrategy enum with 4 strategic options: NONE, SYSTEM_ONLY, SYSTEM_AND_TOOLS, CONVERSATION_HISTORY
  • Implement strategic cache placement with automatic 4-breakpoint limit enforcement via CacheBreakpointTracker
  • Support configurable TTL durations: "5m" (default) and "1h" (requires beta header)
  • Add cache_control support to system messages, tools, and conversation history based on strategy

API Changes:

  • Extend AnthropicChatOptions with cacheStrategy() and cacheTtl() builder methods
  • Update AnthropicApi.Tool record to support cache_control field
  • Add cache usage tracking via cacheCreationInputTokens() and cacheReadInputTokens()
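
For illustration, a minimal usage sketch of the options above (a sketch only; the names follow this description and the exact signatures may differ):

  // Cache the system prompt and tool definitions with a 1-hour TTL
  AnthropicChatOptions options = AnthropicChatOptions.builder()
      .cacheStrategy(AnthropicCacheStrategy.SYSTEM_AND_TOOLS)
      .cacheTtl("1h") // "5m" is the default; "1h" requires the beta header
      .build();

  ChatResponse response = chatModel.call(new Prompt(List.of(systemMessage, userMessage), options));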

Testing & Quality:

  • Add comprehensive integration tests with real-world scenarios
  • Add extensive mock test coverage with complex multi-breakpoint scenarios
  • Fix all checkstyle violations and test failures
  • Add cache breakpoint limit warning for production debugging

Documentation:

  • Complete API documentation with practical examples and best practices
  • Add real-world use cases: legal document analysis, batch code review, customer support
  • Include cost optimization guidance demonstrating up to 90% savings
  • Document future enhancement roadmap for advanced scenarios

Signed-off-by: Mark Pollack [email protected]

sobychacko and others added 3 commits September 3, 2025 17:00
…tOptions

- Add cacheControl field to AnthropicChatOptions with builder method
- Create AnthropicCacheType enum with EPHEMERAL type for type-safe cache creation
- Update AnthropicChatModel.createRequest() to apply cache control from options to user message ContentBlocks
- Extend ContentBlock record with cacheControl parameter and constructor for API compatibility
- Update Usage record to include cacheCreationInputTokens and cacheReadInputTokens fields
- Update StreamHelper to handle new Usage constructor with cache token parameters
- Add AnthropicApiIT.chatWithPromptCache() test for low-level API validation
- Add AnthropicChatModelIT.chatWithPromptCacheViaOptions() integration test
- Add comprehensive unit tests for AnthropicChatOptions cache control functionality
- Update documentation with cacheControl() method examples and usage patterns

Cache control is configured through AnthropicChatOptions rather than message classes
to maintain provider portability. The cache control gets applied during request creation
in AnthropicChatModel when building ContentBlocks for user messages.
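
As a rough sketch of the resulting usage (the builder method and enum are named in this commit; the cache-control factory call shown here is illustrative and may differ):

  AnthropicChatOptions options = AnthropicChatOptions.builder()
      .cacheControl(AnthropicCacheType.EPHEMERAL.cacheControl()) // illustrative factory; actual creation API may differ
      .build();

  // AnthropicChatModel.createRequest() then copies this cache control onto the user message ContentBlocks
  ChatResponse response = chatModel.call(new Prompt("...large, reusable context...", options));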

Original implementation provided by @Claudio-code (Claudio Silva Junior)
See spring-projects@15e5026

Fixes spring-projects#1403

Signed-off-by: Soby Chacko <[email protected]>
// Create cache control with TTL if specified, otherwise use default 5m
if (cacheTtl != null && !cacheTtl.equals("5m")) {
    cacheControl = new ChatCompletionRequest.CacheControl("ephemeral", cacheTtl);
    logger.info("Created cache control with TTL: type={}, ttl={}", "ephemeral", cacheTtl);
}
markpollack (Member Author)

these need to be changed to 'debug'

adase11 (Contributor) commented Sep 4, 2025

I see that allowing more fine-grained control of the caching strategy (1h vs. 5m) is listed as a follow-up; that makes sense. I also liked one additional feature I included in sobychacko#2, where I could optimize cache block usage by providing a minimum content size parameter, so that we're not attempting to cache messages that will have very little impact (or none at all, if they're under the minimum size Anthropic will even let you cache).

The two biggest benefits I saw when testing my approach, measured by cache utilization (tokens cached as a percentage of the total tokens in the request), were fine-grained control over TTL based on message type and a configurable minimum content size that a message must reach to be eligible for caching.

In a production application, system messages are very often the same across many conversations, while user and assistant messages are unique to an individual conversation. The 1-hour cache is therefore beneficial for system messages but less so for user and assistant messages, where the 5-minute cache is a better fit because it typically covers a user's multi-turn conversation. This differentiation is why more fine-grained control over the caching TTL is beneficial.

The minimum message size is pretty self-explanatory: it lets the caching strategy focus on the messages that will have the largest impact on tokens used in a conversation. I found it useful to be able to segment this by message type as well, but that's not as critical.
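
To illustrate the kind of shape I have in mind (the names here are purely illustrative, not the actual API from sobychacko#2):

  // Hypothetical configuration: per-message-type TTLs plus a minimum size for cache eligibility
  AnthropicCacheOptions cacheOptions = AnthropicCacheOptions.builder()
      .systemMessageTtl("1h")        // system prompts are stable across conversations
      .conversationMessageTtl("5m")  // user/assistant turns only need to survive a multi-turn chat
      .minContentLength(1024)        // skip content too small to be worth caching (or eligible at all)
      .build();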

I'm happy to help out in any way I can if you choose to pick up either of these items. @markpollack

sobychacko (Contributor)

This PR was merged via 5afd2d2.

@adase11, we can continue the conversation here, although this PR is closed. When appropriate, we can create a new issue to follow up on the concerns you raised.

sobychacko closed this Sep 5, 2025
adase11 (Contributor) commented Sep 5, 2025

@sobychacko thanks for coordinating; sounds like a plan. How would you like me to move forward with my suggestions? I'm happy to discuss further, or I can set up an example branch demonstrating some of what I'm talking about.

sobychacko (Contributor)

@adase11, could you create a new issue and capture all your feedback there? Then we can proceed based on that issue. Perhaps you could copy/paste the above comments, as well as the ones you added on the PR branch? Once you create the issue, please notify us here so that we can prioritize it.

adase11 (Contributor) commented Sep 5, 2025

Perfect, can do
