
@adase11 adase11 commented Sep 3, 2025

Enhances spring-projects#4199

This PR adds the ability to:

- Configure how many prompt segments ("cache blocks") to cache. Anthropic currently allows a maximum of 4 per request; users can choose fewer.
- Control which message types are eligible for caching, depending on the use case. For example, my use case required SYSTEM and TOOL but not USER or ASSISTANT.
- Choose the cache type per message type (e.g., EPHEMERAL, EPHEMERAL_1H). Anthropic supports this now, and it also came in handy for my use case.
- Set a global minimum length for caching and override it per message type. This lets users limit caching attempts to the messages worth caching, so they get the most out of the 4 available cache blocks.
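To make the four options above concrete, here is a minimal, self-contained sketch of how they could interact. Every name in it (`CachePolicy`, `MessageType`, `CacheType`, `Message`, `plan`) is hypothetical and chosen for illustration; it is not the API added by this PR. It also assumes that "minimum length" means character count and that cache blocks are handed out in message order, neither of which the description above confirms.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; not the classes introduced by this PR.
enum MessageType { SYSTEM, USER, ASSISTANT, TOOL }

enum CacheType { EPHEMERAL, EPHEMERAL_1H }

record Message(MessageType type, String text) { }

final class CachePolicy {

    // Limit stated in the PR description: at most 4 cache blocks per request.
    static final int ANTHROPIC_MAX_CACHE_BLOCKS = 4;

    private final int maxCacheBlocks;
    private final int globalMinLength;
    // A message type is eligible for caching only if it has an entry here.
    private final Map<MessageType, CacheType> cacheTypes = new EnumMap<>(MessageType.class);
    // Optional per-type overrides of the global minimum length.
    private final Map<MessageType, Integer> minLengths = new EnumMap<>(MessageType.class);

    CachePolicy(int maxCacheBlocks, int globalMinLength) {
        if (maxCacheBlocks < 0 || maxCacheBlocks > ANTHROPIC_MAX_CACHE_BLOCKS) {
            throw new IllegalArgumentException("maxCacheBlocks must be between 0 and 4");
        }
        this.maxCacheBlocks = maxCacheBlocks;
        this.globalMinLength = globalMinLength;
    }

    CachePolicy cache(MessageType type, CacheType cacheType) {
        cacheTypes.put(type, cacheType);
        return this;
    }

    CachePolicy minLength(MessageType type, int minLength) {
        minLengths.put(type, minLength);
        return this;
    }

    // Returns one entry per message: the cache type to apply, or null if the message is not cached.
    // Assumption: length is measured in characters and blocks are assigned in message order.
    List<CacheType> plan(List<Message> messages) {
        List<CacheType> plan = new ArrayList<>();
        int used = 0;
        for (Message m : messages) {
            CacheType cacheType = cacheTypes.get(m.type());
            int min = minLengths.getOrDefault(m.type(), globalMinLength);
            boolean cacheable = cacheType != null
                    && used < maxCacheBlocks
                    && m.text().length() >= min;
            plan.add(cacheable ? cacheType : null);
            if (cacheable) {
                used++;
            }
        }
        return plan;
    }
}

class CachePolicyDemo {
    public static void main(String[] args) {
        // Cache SYSTEM and TOOL only, use at most 2 blocks, and lower the minimum length for TOOL.
        CachePolicy policy = new CachePolicy(2, 10)
                .cache(MessageType.SYSTEM, CacheType.EPHEMERAL_1H)
                .cache(MessageType.TOOL, CacheType.EPHEMERAL)
                .minLength(MessageType.TOOL, 5);

        List<Message> messages = List.of(
                new Message(MessageType.SYSTEM, "You are a helpful assistant."),
                new Message(MessageType.USER, "What is the weather in Paris?"),
                new Message(MessageType.TOOL, "sunny"),
                new Message(MessageType.TOOL, "a second, longer tool result"));

        // Prints [EPHEMERAL_1H, null, EPHEMERAL, null]
        System.out.println(policy.plan(messages));
    }
}
```

In this sketch the last TOOL message is skipped only because both configured blocks are already used, which illustrates why a per-type minimum length matters: it keeps short messages from consuming one of the limited blocks.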

garethjevans and others added 3 commits August 29, 2025 13:32
Signed-off-by: Gareth Evans <[email protected]>

Fix checkstyle

Signed-off-by: Soby Chacko <[email protected]>
…che type, allow for more fined grained configuration of what's cached, account for Anthropic's max of 4 cache blocks per request

port over anthropic prompt caching support

adase11 commented Sep 3, 2025

Sorry for the messed-up commit history. I forked the Spring repo before I realized I should have forked yours, @sobychacko.
