GH-1403 Anthropic Prompt Caching Support #2
Enhances spring-projects#4199 by supporting the ability to:
Configure how many prompt segments (“cache blocks”) to cache. Anthropic currently allows at most 4; users can choose fewer.
Control which message types are eligible for caching, so selection can follow the use case. For example, my use case required SYSTEM and TOOL but not USER or ASSISTANT.
Choose the cache type per message type (e.g., EPHEMERAL, EPHEMERAL_1H). Anthropic now supports this, and it also came in handy for my use case.
Set a global minimum length for caching and override it per message type. This lets users limit caching attempts to messages worth caching and make the best use of the maximum of 4 cache blocks.
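To illustrate how these four options could interact, here is a minimal, self-contained sketch. It is not the code in this PR: `CacheSelectionSketch`, `select`, `CacheDecision`, and every parameter name are hypothetical, length is assumed to be measured in characters, and cache blocks are assumed to be handed out in message order until the limit is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Spring AI or this PR.
class CacheSelectionSketch {

    enum MessageType { SYSTEM, USER, ASSISTANT, TOOL }

    enum CacheType { EPHEMERAL, EPHEMERAL_1H }

    record Message(MessageType type, String text) {}

    record CacheDecision(int messageIndex, CacheType cacheType) {}

    // Limit stated in the PR description ("max of 4 right now").
    static final int ANTHROPIC_MAX_CACHE_BLOCKS = 4;

    /**
     * Picks which messages get a cache block.
     * A message type is eligible only if it has an entry in cacheTypes.
     * Assumption: length is counted in characters and blocks are assigned
     * first-come, first-served in message order.
     */
    static List<CacheDecision> select(List<Message> messages,
                                      int maxBlocks,
                                      Map<MessageType, CacheType> cacheTypes,
                                      int globalMinLength,
                                      Map<MessageType, Integer> minLengthOverrides) {
        int limit = Math.min(maxBlocks, ANTHROPIC_MAX_CACHE_BLOCKS);
        List<CacheDecision> decisions = new ArrayList<>();
        for (int i = 0; i < messages.size() && decisions.size() < limit; i++) {
            Message message = messages.get(i);
            CacheType cacheType = cacheTypes.get(message.type());
            if (cacheType == null) {
                continue; // message type not eligible for caching
            }
            // A per-type override wins over the global minimum length.
            int minLength = minLengthOverrides.getOrDefault(message.type(), globalMinLength);
            if (message.text().length() >= minLength) {
                decisions.add(new CacheDecision(i, cacheType));
            }
        }
        return decisions;
    }

    // Mirrors the use case from the description: SYSTEM and TOOL only.
    static List<CacheDecision> demo() {
        List<Message> messages = List.of(
                new Message(MessageType.SYSTEM, "s".repeat(2000)),
                new Message(MessageType.USER, "hi"),
                new Message(MessageType.TOOL, "t".repeat(50)),
                new Message(MessageType.ASSISTANT, "a".repeat(3000)));
        Map<MessageType, CacheType> cacheTypes = Map.of(
                MessageType.SYSTEM, CacheType.EPHEMERAL_1H,
                MessageType.TOOL, CacheType.EPHEMERAL);
        // Global minimum of 1000 characters, lowered to 10 for TOOL messages.
        return select(messages, 4, cacheTypes, 1000, Map.of(MessageType.TOOL, 10));
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch the USER and ASSISTANT messages are skipped because they have no cache type configured, and the short TOOL message qualifies only because of its per-type override.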