LMDeploy Release V0.5.0
What's Changed
🚀 Features
- support MiniCPM-Llama3-V 2.5 by @irexyc in #1708
- [Feature]: Support llava for pytorch engine by @RunningLeon in #1641
- Device dispatcher by @grimoire in #1775
- Add GLM-4-9B-Chat by @lzhangzz in #1724
- Torch deepseek v2 by @grimoire in #1621
- Support internvl-chat for pytorch engine by @RunningLeon in #1797 (see the VLM sketch after this list)
- Add interfaces to the pipeline to obtain logits and ppl by @irexyc in #1652 (see the logits/ppl sketch after this list)
- [Feature]: Support cogvlm-chat by @RunningLeon in #1502
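Several of these features bring vision-language models to the PyTorch engine. Below is a minimal sketch of driving one through the pipeline; the model id and image URL are illustrative placeholders, and selecting the backend via `PytorchEngineConfig` is an assumption based on the usual pipeline API, not something stated in these notes:

```python
# Illustrative sketch: running a VLM on the PyTorch engine via the pipeline.
# The model id and image URL are placeholders, not part of the release notes.
from lmdeploy import pipeline, PytorchEngineConfig
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL-Chat-V1-5',
                backend_config=PytorchEngineConfig())
image = load_image('https://example.com/tiger.jpeg')  # placeholder URL
response = pipe(('describe this image', image))
print(response)
```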
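The new logits/ppl interfaces (#1652) might be exercised as follows. The method names `get_logits`/`get_ppl`, the `pipe.tokenizer` attribute, and the token-id input format are assumptions inferred from the PR title rather than a confirmed API; check the pipeline docs for the exact signatures:

```python
# Hypothetical sketch of the logits/ppl interfaces added in #1652; method
# names and argument shapes are assumptions, not confirmed by these notes.
from lmdeploy import pipeline

pipe = pipeline('internlm/internlm2-chat-7b')
input_ids = pipe.tokenizer.encode('LMDeploy is a toolkit for LLM serving.')

ppl = pipe.get_ppl([input_ids])        # perplexity per input sequence
logits = pipe.get_logits([input_ids])  # raw logits per input sequence
print(ppl, logits[0].shape)
```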
💥 Improvements
- support mistral and llava_mistral in turbomind by @lvhan028 in #1579
- Add health endpoint by @AllentDan in #1679 (see the probe example after this list)
- upgrade the version of the dependency package peft by @grimoire in #1687
- Follow the conventional model_name by @AllentDan in #1677
- API Image URL fetch timeout by @vody-am in #1684
- Support internlm-xcomposer2-4khd-7b awq by @AllentDan in #1666
- update dockerfile and docs by @RunningLeon in #1715
- lazy import VLAsyncEngine to avoid bringing in VLM dependencies when deploying LLMs by @lvhan028 in #1714
- feat: align with OpenAI temperature range by @zhyncs in #1733 (see the sampling sketch after this list)
- feat: align with OpenAI temperature range in api server by @zhyncs in #1734
- Refactor converter: get_input_model_registered_name and get_output_model_registered_name_and_config by @lvhan028 in #1702
- Refine max_new_tokens logic to improve user experience by @AllentDan in #1705
- Refactor loading weights by @grimoire in #1603
- refactor config by @grimoire in #1751
- Add anomaly handler by @lzhangzz in #1780
- Encode raw image file to base64 by @irexyc in #1773
- skip inference for oversized inputs by @grimoire in #1769
- fix: prevent numpy breakage by @zhyncs in #1791
- More accurate time logging for ImageEncoder and fix concurrent image processing corruption by @irexyc in #1765
- Optimize kernel launch for triton2.2.0 and triton2.3.0 by @grimoire in #1499
- feat: auto set awq model_format from hf by @zhyncs in #1799 (see the sketch after this list)
- check driver mismatch by @grimoire in #1811
- PyTorchEngine adapts to the latest internlm2 modeling. by @grimoire in #1798
- AsyncEngine: create cancel task on exception by @grimoire in #1807
- compat internlm2 for pytorch engine by @RunningLeon in #1825
- Add model revision & download_dir to cli by @irexyc in #1814
- fix image encoder request queue by @irexyc in #1837
- Harden stream callback by @lzhangzz in #1838
- Support Qwen2-1.5b awq by @AllentDan in #1793
- remove chat template config in turbomind engine by @irexyc in #1161
- misc: align PyTorch Engine temperature with TurboMind by @zhyncs in #1850
- docs: update cache-max-entry-count help message by @zhyncs in #1892
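Two of the serving improvements are easy to check from a client. First, the health endpoint (#1679); a minimal probe, assuming the server's default port 23333 and a `/health` route (the route is inferred from the PR title):

```python
# Liveness probe for api_server; 23333 is LMDeploy's default port, and the
# /health route is assumed from the PR title.
import requests

resp = requests.get('http://localhost:23333/health', timeout=5)
print(resp.status_code)  # expect 200 when the server is healthy
```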
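Second, the OpenAI-aligned temperature range (#1733, #1734) means values up to 2.0 should now be accepted. A sketch, assuming GenerationConfig validates against the OpenAI range [0, 2]:

```python
# Sampling with a temperature above 1.0, valid under the OpenAI range [0, 2].
from lmdeploy import pipeline, GenerationConfig

pipe = pipeline('internlm/internlm2-chat-7b')
gen_config = GenerationConfig(temperature=1.5, top_p=0.9, max_new_tokens=64)
print(pipe(['Write a haiku about GPUs.'], gen_config=gen_config))
```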
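Finally, with the awq model_format auto-detected from the HF config (#1799), a pre-quantized AWQ checkpoint should load without an explicit model_format. A sketch; the checkpoint name is illustrative and the auto-detection behavior is an assumption drawn from the PR title:

```python
# With #1799, the engine is expected to read model_format from the HF config,
# so no explicit TurbomindEngineConfig(model_format='awq') should be needed
# for a pre-quantized AWQ checkpoint (assumption; checkpoint name illustrative).
from lmdeploy import pipeline

pipe = pipeline('internlm/internlm2-chat-7b-4bit')
print(pipe(['Hello!']))
```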
🐞 Bug fixes
- fix typos by @irexyc in #1690
- [Bugfix] fix internvl-1.5-chat vision model preprocess and freeze weights by @DefTruth in #1741
- lock setuptools version in dockerfile by @RunningLeon in #1770
- Fix openai package can not use proxy stream mode by @AllentDan in #1692
- Fix finish_reason by @AllentDan in #1768
- fix uncached stop words by @grimoire in #1754
- [side-effect] Fix param `--cache-max-entry-count` is not taking effect (#1758) by @QwertyJack in #1778
- support qwen2 1.5b by @lvhan028 in #1782
- fix falcon attention by @grimoire in #1761
- Refine AsyncEngine exception handler by @AllentDan in #1789
- [side-effect] fix weight_type caused by PR #1702 by @lvhan028 in #1795
- fix best_match_model by @irexyc in #1812
- Fix Request completed log by @irexyc in #1821
- fix qwen-vl-chat hung by @irexyc in #1824
- Detokenize with prompt token ids by @AllentDan in #1753
- Update engine.py to fix small typos by @WANGSSSSSSS in #1829
- [side-effect] bring back "--cap" argument in chat cli by @lvhan028 in #1859
- Fix vl session-len by @AllentDan in #1860
- fix gradio vl "stop_words" by @irexyc in #1873
- fix qwen2 cache_position for PyTorch Engine when transformers>4.41.2 by @zhyncs in #1886
- fix model name matching for internvl by @RunningLeon in #1867
📚 Documentations
- docs: add BentoLMDeploy in README by @zhyncs in #1736
- [Doc]: Update docs for internlm2.5 by @RunningLeon in #1887
🌐 Other
- add longtext generation benchmark by @zhulinJulia24 in #1694
- add qwen2 model into testcase by @zhulinJulia24 in #1772
- fix pr test for newest internlm2 model by @zhulinJulia24 in #1806
- react test evaluation config by @zhulinJulia24 in #1861
- bump version to v0.5.0 by @lvhan028 in #1852
New Contributors
- @DefTruth made their first contribution in #1741
- @QwertyJack made their first contribution in #1778
- @WANGSSSSSSS made their first contribution in #1829
Full Changelog: v0.4.2...v0.5.0