LMDeploy Release V0.2.5 · InternLM/lmdeploy

What's Changed

Support mistral and sliding window attention by @grimoire in #1075
torch engine support chatglm3 by @grimoire in #1159
Support qwen1.5 in pytorch engine by @grimoire in #1160
Support mixtral for pytorch engine by @RunningLeon in #1133
Support torch deepseek moe by @grimoire in #1163
Support gemma model in pytorch engine by @grimoire in #1184
Auto backend for pipeline and serve when backend is not set to pytorch explicitly by @RunningLeon in #1211
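
For readers unfamiliar with the sliding window attention named in #1075, here is a minimal sketch of the masking idea in plain Python. It is an illustration only, not LMDeploy's kernel: the function name `sliding_window_mask` and its parameters are hypothetical, and it assumes a causal mask in which each position attends to itself and at most `window - 1` earlier positions.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window attention mask (illustrative helper).

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is not in the future (j <= i) and lies within the last
    `window` positions ending at i (i - j < window).
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask as a grid: 'x' = attention allowed, '.' = masked out.
    for row in sliding_window_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

Because every row has at most `window` allowed positions, the keys and values each new token needs stay bounded by the window size rather than growing with the full sequence length.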

Fix argument error by @ispobock in #1193
Use LifoQueue for turbomind async_stream_infer by @AllentDan in #1179
Update interactive output len strategy and response by @AllentDan in #1164
Support min_new_tokens generation config in pytorch engine by @grimoire in #1096
Batched sampling by @grimoire in #1197
refactor the logic of getting model_name by @AllentDan in #1188
Add parameter max_prefill_token_num by @lvhan028 in #1203
optimize baichuan in pytorch engine by @grimoire in #1223
check model required transformers version by @grimoire in #1220
torch optimize chatglm3 by @grimoire in #1215
Async torch engine by @grimoire in #1206
remove unused kernel in pytorch engine by @grimoire in #1237
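
As a rough illustration of what a `min_new_tokens` setting (#1096) usually means, the sketch below masks the end-of-sequence token until enough tokens have been generated. This is an assumption about the general technique, not the pytorch engine's actual code; `mask_eos_until_min` and `greedy_pick` are hypothetical helpers working on plain Python lists.

```python
import math


def mask_eos_until_min(logits, eos_token_id, num_generated, min_new_tokens):
    """Return a copy of `logits` with EOS disabled while output is too short.

    While fewer than `min_new_tokens` tokens have been generated, the EOS
    logit is set to -inf so that neither greedy nor sampled decoding can
    select it. The input list is left unmodified.
    """
    out = list(logits)
    if num_generated < min_new_tokens:
        out[eos_token_id] = -math.inf
    return out


def greedy_pick(logits):
    """Return the index of the largest logit (greedy decoding)."""
    return max(range(len(logits)), key=logits.__getitem__)


if __name__ == "__main__":
    logits = [0.1, 2.0, 0.5]  # token 1 plays the role of EOS here
    early = mask_eos_until_min(logits, eos_token_id=1, num_generated=0, min_new_tokens=2)
    late = mask_eos_until_min(logits, eos_token_id=1, num_generated=2, min_new_tokens=2)
    print(greedy_pick(early), greedy_pick(late))
```

Masking the logit instead of rejecting EOS after selection keeps the rule compatible with both greedy and multinomial sampling.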

Fix session length for profile generation by @ispobock in #1181
fix torch engine infer by @RunningLeon in #1185
fix module map by @grimoire in #1205
[Fix] Correct session length warning by @AllentDan in #1207
Fix all devices occupation when applying tp to torch engine by updating device map by @grimoire in #1172
Fix falcon chatglm2 template by @grimoire in #1168
[Fix] Avoid AsyncEngine running the same session id by @AllentDan in #1219
Fix None session_len by @lvhan028 in #1230
fix multinomial sampling by @grimoire in #1228
fix returning logits in prefill phase of pytorch engine by @grimoire in #1209
optimize pytorch engine inference with falcon model by @grimoire in #1234
fix bf16 multinomial sampling by @grimoire in #1239
reduce torchengine prefill mem usage by @grimoire in #1240

auto generate pipeline api for readthedocs by @RunningLeon in #1186
Added tutorial document for deploying lmdeploy on Jetson series boards by @BestAnHongjun in #1192
update doc index by @zhyncs in #1241

Full Changelog: v0.2.4...v0.2.5