v0.1.0

@Lijiachen1018 released this 02 Dec 08:42
· 7 commits to develop since this release
5ba2684

We are excited to announce the first official release of Unified Cache Manager.

Highlights

  • Offload Prefix Cache to storage.
  • Homogeneous/heterogeneous PD disaggregation.
  • Training-free sparsity for accelerating inference (vllm==0.9.2, vllm-ascend==0.9.2rc1) in #199, #335, #190, #451

Core:

  • Garbage collection for store in #315 and #312
  • Adapt to vllm and vllm-ascend in #13, #292, #415 and #362
  • UCM supports online metrics display via Grafana and Prometheus in #414, with docs in #416

Known Issues

If you are using the Ascend platform, please be aware of the following:

  • Not compatible with broadcast.
  • Set load_only_first_rank: false in the config.
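
The load_only_first_rank item above can be checked before launch. The following is only a minimal sketch: check_ascend_config is a hypothetical helper, and the flat dict layout is an assumption, not UCM's actual configuration schema. Because these notes do not state the default, a missing key is flagged as well.

```python
# Hypothetical pre-flight check for an Ascend deployment.
# Assumption: the config has already been loaded into a flat dict;
# the real UCM config may nest this key differently.
def check_ascend_config(config: dict) -> list[str]:
    """Return warnings for settings the release notes call out on Ascend."""
    warnings = []
    # The release notes ask for load_only_first_rank: false on Ascend.
    # A missing key is also flagged, since the default is not documented here.
    if config.get("load_only_first_rank") is not False:
        warnings.append("load_only_first_rank should be explicitly set to false on Ascend")
    return warnings


if __name__ == "__main__":
    for cfg in ({"load_only_first_rank": False}, {"load_only_first_rank": True}, {}):
        print(cfg, "->", check_ascend_config(cfg))
```

An empty list means the setting matches the note above; any other result lists what to change.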

Others

  • Update documents
  • Tools for performance tuning and hyperparameter optimization in #418

Full Changelog: v0.1.0rc4...v0.1.0