LMDeploy Release V0.1.0a2

lvhan028 released this 06 Dec 06:50

fddad30

What's Changed

💥 Improvements

Unify prefill & decode passes by @lzhangzz in #775
Add CUDA 12.1 build check to CI by @irexyc in #782
Automatically upload the CUDA 12.1 Python package to the release when a new tag is created by @irexyc in #784
Report inference benchmarks for models of different sizes by @lvhan028 in #794
Add chat template for Yi by @AllentDan in #779

🐞 Bug fixes

Fix early-exit condition in attention kernel by @lzhangzz in #788
Fix missing arguments when benchmarking static inference performance by @lvhan028 in #787
Fix extra colon in InternLMChat7B template by @C1rN09 in #796
Fix local kv head num by @lvhan028 in #806

📚 Documentation

Update benchmark user guide by @lvhan028 in #763

🌐 Other

Bump version to v0.1.0a2 by @lvhan028 in #807

New Contributors

@C1rN09 made their first contribution in #796

Full Changelog: v0.1.0a1...v0.1.0a2
