LMDeploy Release V0.0.9

lvhan028 released this 20 Sep 08:10

0be9e7a

Highlight

Support InternLM 20B, including FP16, W4A16, and W4KV8

What's Changed

🚀 Features

Support InternLM 20B by @lvhan028 in #440

💥 Improvements

Reduce gil switching by @irexyc in #407
Profile token generation with more settings by @AllentDan in #364

🐞 Bug fixes

Fix disk space limit for building docker image by @RunningLeon in #404
more general pypi ci by @irexyc in #412
Fix build.md by @pangsg in #411
Fix memory leak by @irexyc in #415
Fix token count bug by @AllentDan in #416
[Fix] Support actual seqlen in flash-attention2 by @grimoire in #418
[Fix] output[-1] when output is empty by @wangruohui in #405

🌐 Other

rename readthedocs config file by @RunningLeon in #429
bump version to v0.0.9 by @lvhan028 in #428

New Contributors

@pangsg made their first contribution in #411

Full Changelog: v0.0.8...v0.0.9

Contributors

grimoire, lvhan028, and 5 other contributors
