
Support prompt caching API #220

Open
yuzisun opened this issue Jan 29, 2025 · 1 comment
Labels: enhancement (New feature or request)

Comments

yuzisun (Contributor) commented Jan 29, 2025

Description:
Extend the OpenAI-compatible API to support a prompt caching API. This feature is already supported both by KServe (via LMCache) and by AWS Bedrock:

AWS Bedrock: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
LMCache: https://github.com/LMCache/LMCache

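To make the request concrete, here is a hypothetical sketch of what a prompt-caching-aware request through an OpenAI-compatible chat completions endpoint could look like. The `cache_control` annotation mirrors Anthropic's prompt-caching marker (Bedrock's Converse API uses a similar `cachePoint` element); the exact field name, placement, and model name used here are assumptions, not a decided design for this project.

```python
import json

def build_cached_chat_request(system_prompt: str, user_message: str) -> dict:
    """Build a chat completion payload that marks the large, reusable
    system prompt as cacheable, so that on repeated requests only the
    user turn needs to be reprocessed by the backend."""
    return {
        "model": "example-model",  # placeholder model name
        "messages": [
            {
                "role": "system",
                "content": system_prompt,
                # Hypothetical annotation: ask the backend to cache the
                # prefill (KV cache) state up to this message.
                "cache_control": {"type": "ephemeral"},
            },
            {"role": "user", "content": user_message},
        ],
    }

request = build_cached_chat_request(
    "You are a helpful assistant. <long shared context>", "Hello!"
)
print(json.dumps(request, indent=2))
```

The gateway would then either pass the annotation through to a backend that understands it (e.g. Bedrock, or vLLM with LMCache) or translate it into the backend's native cache marker.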

yuzisun added the enhancement (New feature or request) label on Jan 29, 2025

mathetake (Member) commented Jan 29, 2025

Could you share the big picture of what kind of change/implementation is necessary in this repo? (I guess it's the transformer implementation?)
