LMDeploy Release V0.2.0
What's Changed
🚀 Features
- Support internlm2 by @lvhan028 in #963
- [Feature] Add params config for api server web_ui by @amulil in #735
- [Feature] Merge `lmdeploy lite calibrate` and `lmdeploy lite auto_awq` by @pppppM in #849
- Compute cross entropy loss given a list of input tokens by @lvhan028 in #830
- Support QoS in api_server by @sallyjunjun in #877
- Refactor torch inference engine by @lvhan028 in #871
- add image chat demo by @irexyc in #874
- check-in generation config by @lvhan028 in #902
- check-in ModelConfig by @AllentDan in #907
- pytorch engine config by @grimoire in #908
- Check-in turbomind engine config by @irexyc in #909
- S-LoRA support by @grimoire in #894
- add init in adapters by @grimoire in #923
- Refactor LLM inference pipeline API by @AllentDan in #916 (see the usage sketch after this list)
- Refactor gradio and api_server by @AllentDan in #918
- Add request distributor server by @AllentDan in #903
- Upgrade lmdeploy cli by @RunningLeon in #922
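
Several of the items above land together as one user-facing change: engine and sampling parameters move into dedicated config objects (#902, #908, #909), and the high-level pipeline API is rebuilt around them (#916), with `tp` moving off the pipeline argument list and into the engine config (#947). Below is a minimal sketch of how these pieces fit together; the class and argument names (`pipeline`, `TurbomindEngineConfig`, `GenerationConfig`, `backend_config`, `gen_config`) follow the current LMDeploy documentation and should be verified against the v0.2.0 API reference.

```python
# Minimal sketch of the refactored pipeline API; names assumed from the
# current LMDeploy docs, verify against the v0.2.0 API reference.
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig

# Engine-level settings now live in a config object instead of loose kwargs;
# tp (tensor parallelism) belongs here rather than on pipeline() (#947).
engine_config = TurbomindEngineConfig(tp=1)

pipe = pipeline('internlm/internlm2-chat-7b', backend_config=engine_config)

# Sampling parameters are likewise grouped into GenerationConfig (#902).
gen_config = GenerationConfig(top_k=40, top_p=0.8, temperature=0.7)

# The pipeline returns response dataclasses rather than raw strings (#952).
responses = pipe(['Hi, please introduce yourself'], gen_config=gen_config)
print(responses[0].text)
```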
💥 Improvements
- add top_k value for /v1/completions and update the documents by @AllentDan in #870
- export "num_tokens_per_iter", "max_prefill_iters", etc. when converting a model by @lvhan028 in #845
- Move `api_server` dependencies from serve.txt to runtime.txt by @lvhan028 in #879
- Refactor benchmark bash script by @lvhan028 in #884
- Add test case for function regression by @zhulinJulia24 in #844
- Update test triton CI by @RunningLeon in #893
- Update dockerfile by @RunningLeon in #891
- Perform fuzzy matching on chat template according to model path by @AllentDan in #839
- support accessing the lmdeploy version via `lmdeploy.version_info` by @lvhan028 in #910 (see the sketch after this list)
- Remove `flash-attn` dependency of lmdeploy lite module by @lvhan028 in #917
- Improve setup by removing pycuda dependency and adding cuda runtime and cublas to RPATH by @irexyc in #912
- remove unused settings in turbomind engine config by @irexyc in #921
- Cleanup fixed attributes in turbomind engine config by @irexyc in #928
- fix get_gpu_mem by @grimoire in #934
- remove instance_num argument by @AllentDan in #931
- Fix matching results of several chat templates like llama2, solar, yi and so on by @AllentDan in #925
- add pytorch random sampling by @grimoire in #930
- suppress turbomind chat warning by @irexyc in #937
- modify type hint of api to avoid import _turbomind by @AllentDan in #936
- accelerate pytorch benchmark by @grimoire in #946
- Remove `tp` from pipeline argument list by @lvhan028 in #947
- set gradio default value the same as chat.py by @AllentDan in #949
- print help for cli in case of failure by @RunningLeon in #955
- return dataclass for pipeline by @AllentDan in #952
- set random seed when it is None by @AllentDan in #958
- avoid run get_logger when import lmdeploy by @RunningLeon in #956
- support mlp s-lora by @grimoire in #957
- skip resume logic for pytorch backend by @AllentDan in #968
- Add ci for ut by @RunningLeon in #966
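
For the version-access improvement above (#910), the sketch below shows the assumed shape of `lmdeploy.version_info`: a tuple of version components alongside the usual `__version__` string, which makes feature gating in downstream code straightforward. Treat the exact shape as an assumption to verify against the release.

```python
# Hedged sketch: version_info is assumed to be a tuple of version
# components exposed next to the usual __version__ string.
import lmdeploy

print(lmdeploy.__version__)   # e.g. '0.2.0'
print(lmdeploy.version_info)  # e.g. (0, 2, 0)

# Tuple comparison gives simple feature gating in downstream code:
if lmdeploy.version_info >= (0, 2, 0):
    from lmdeploy import pipeline  # refactored pipeline API
```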
🐞 Bug fixes
- add tritonclient req by @RunningLeon in #872
- Fix uninitialized parameter by @lvhan028 in #875
- Fix overflow by @irexyc in #897
- Fix data offset by @AllentDan in #900
- Fix context decoding stuck issue when tp > 1 by @irexyc in #904
- [Fix] set scaling_factor 1 forcefully when sequence length is less than max_pos_emb by @lvhan028 in #911
- fix pytorch llama2 with new transformers by @grimoire in #914
- fix local variable 'output_ids' referenced before assignment by @irexyc in #919
- fix pipeline stop_words type error by @AllentDan in #929
- pass stop words to openai api by @AllentDan in #887
- fix profile generation multiprocessing error by @AllentDan in #933
- Add missing `__init__.py` in modeling folder by @lvhan028 in #951
- fix cli with special arg names by @RunningLeon in #959
- fix logger in tokenizer by @RunningLeon in #960
📚 Documentations
- Improve user guide by @lvhan028 in #899
- Add user guide about pytorch engine by @grimoire in #915
- Update supported models and add quick start section in README by @lvhan028 in #926
- Fix scripts in benchmark doc by @panli889 in #941
- Update get_started and w4a16 tutorials by @lvhan028 in #945
- Add more docstring to api_server and proxy_server by @AllentDan in #965
🌐 Other
- Stabilize api_server benchmark results with a non-zero await by @AllentDan in #885
- Fix pytorch backend failing to stop properly by @AllentDan in #962
- [Fix] Fix `calibrate` bug when `transformers>4.36` by @pppppM in #967
New Contributors
- @amulil made their first contribution in #735
- @zhulinJulia24 made their first contribution in #844
- @sallyjunjun made their first contribution in #877
- @panli889 made their first contribution in #941
Full Changelog: v0.1.0...v0.2.0