LMDeploy Release v0.6.2.post1
What's Changed
🐞 Bug fixes
- Fix the llama3.2 VL vision entry in the "Supported Modals" documents by @blankanswer in #2703
- Fix missing moe_ffn weights when reading a converted tm model by @lvhan028 in #2698
- Improve the TP exit log by @grimoire in #2677
- Fix index error when computing PPL on long-text prompts by @lvhan028 in #2697
- Support min_tokens and min_p parameters for api_server by @AllentDan in #2681
- Fix the Ascend get_started.md link by @CyCle1024 in #2696
- Call cuda empty_cache to prevent OOM when quantizing models by @AllentDan in #2671
- Fix turbomind TP for v0.6.2 by @lzhangzz in #2713
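The new `min_tokens` and `min_p` sampling parameters from #2681 can be passed in the body of an OpenAI-compatible request to api_server. A minimal sketch of building such a request body (the model name, port, and parameter placement at the top level of the body are assumptions for illustration):

```python
import json

# Hypothetical request body for api_server's OpenAI-compatible
# /v1/chat/completions endpoint (model name is a placeholder).
payload = {
    "model": "internlm2",
    "messages": [{"role": "user", "content": "Hi"}],
    "min_tokens": 8,   # generate at least 8 tokens before stopping
    "min_p": 0.05,     # min-p sampling threshold
}
body = json.dumps(payload)

# POST `body` to http://localhost:23333/v1/chat/completions
# (23333 is api_server's default port) to use the new sampling knobs.
print(body)
```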
🌐 Other
- [ci] Support V100 daily test in #2665
- bump version to 0.6.2.post1 by @lvhan028 in #2717
Full Changelog: v0.6.2...v0.6.2.post1