Releases: xorbitsai/inference

v1.3.1.post1

11 Mar 04:10
2ef99fb

What's new in 1.3.1.post1 (2025-03-11)

These are the changes in inference v1.3.1.post1.

Bug fixes

  • BUG: Fix reasoning content parser for qwq-32b by @amumu96 in #3024
  • BUG: Failed to download model 'QwQ-32B' (size: 32, format: ggufv2) after multiple retries by @Jun-Howie in #3031

Documentation

Full Changelog: v1.3.1...v1.3.1.post1

v1.3.1

09 Mar 04:39
5d6ec93

What's new in 1.3.1 (2025-03-09)

These are the changes in inference v1.3.1.

New features

Enhancements

Bug fixes

  • BUG: fix qwen2.5-vl-7b cannot chat bug by @amumu96 in #2944
  • BUG: Fix modelscope model id on Qwen2.5-VL; add support for AWQ quantization format in Qwen2.5-VL by @Jun-Howie in #2943
  • BUG: fix Error while using Langchain-chatchat, because the parameter [max_tokens] passed is None by @William533036 in #2962
  • BUG: using jina-clip-v2, no attribute error when only text or image is passed in by @Minamiyama in #2974
  • BUG: fix compatibility of mlx-lm v0.21.5 by @qinxuye in #2993
  • BUG: Fix tokenizer error in create_embedding by @shuaiqidezhong in #2992
  • BUG: wrong kwargs passing to encode method when using jina-clip-v2 by @Minamiyama in #2991
  • BUG: [UI] fix the white screen bug. by @yiboyasss in #3014

New Contributors

Full Changelog: v1.3.0.post2...v1.3.1

v1.3.0.post2

22 Feb 15:30
378a47a

What's new in 1.3.0.post2 (2025-02-22)

These are the changes in inference v1.3.0.post2.

Bug fixes

Full Changelog: v1.3.0.post1...v1.3.0.post2

v1.3.0.post1

21 Feb 16:14
b2004d4

What's new in 1.3.0.post1 (2025-02-21)

These are the changes in inference v1.3.0.post1.

New features

Enhancements

  • ENH: add gpu utilization info by @amumu96 in #2852
  • ENH: Update Kokoro model by @codingl2k1 in #2843
  • ENH: cmdline supports --n-worker, add --model-path and make it compatible with --model_path by @qinxuye in #2890
  • BLD: update sglang to v0.4.2.post4 and vllm to v0.7.2 by @qinxuye in #2838
  • BLD: fix flashinfer installation in dockerfile by @qinxuye in #2844

Bug fixes

Tests

Documentation

Others

  • CHORE: Xavier now supports vLLM >= 0.7.0, drops support for older versions by @ChengjieLi28 in #2886

New Contributors

Full Changelog: v1.2.2...v1.3.0.post1

v1.2.2

08 Feb 09:28
ac97a13

What's new in 1.2.2 (2025-02-08)

These are the changes in inference v1.2.2.

New features

Bug fixes

  • BUG: fix llama-cpp when some quantizations have multiple parts by @qinxuye in #2786
  • BUG: Use Cache class instead of raw tuple for transformers continuous batching, compatible with latest transformers by @ChengjieLi28 in #2820

Documentation

New Contributors

Full Changelog: v1.2.1...v1.2.2

v1.2.1

24 Jan 08:59
a57b99b

What's new in 1.2.1 (2025-01-24)

These are the changes in inference v1.2.1.

New features

Enhancements

Bug fixes

Tests

Documentation

  • DOC: update new models in README and doc by @qinxuye in #2761
  • DOC: using discord instead of slack & updating model to qwen2.5 in getting started doc by @qinxuye in #2775

Others

  • FIX: [UI] normalize language input to ensure consistent array format. by @yiboyasss in #2771

New Contributors

Full Changelog: v1.2.0...v1.2.1

v1.2.0

10 Jan 09:34
df45f11

What's new in 1.2.0 (2025-01-10)

These are the changes in inference v1.2.0.

New features

Enhancements

  • ENH: [UI] Update Button Style and Interaction Logic for Editing Cache in Model Card. by @yiboyasss in #2746
  • ENH: Improve error message by @codingl2k1 in #2738

Bug fixes

Others

New Contributors

Full Changelog: v1.1.1...v1.2.0

v1.1.1

27 Dec 10:21
d342869

What's new in 1.1.1 (2024-12-27)

These are the changes in inference v1.1.1.

New features

Enhancements

Bug fixes

New Contributors

Full Changelog: v1.1.0...v1.1.1

v1.1.0

13 Dec 10:29
b132fca

What's new in 1.1.0 (2024-12-13)

These are the changes in inference v1.1.0.

New features

Enhancements

  • ENH: Optimize error message when user parameters are passed incorrectly by @namecd in #2623
  • ENH: bypass the sampling parameter skip_special_tokens to vLLM backend by @zjuyzj in #2655
  • ENH: unify prompt_text as cosyvoice for fish speech by @qinxuye in #2658
  • ENH: Update glm4 chat model to new weights by @codingl2k1 in #2660
  • ENH: upgrade sglang in Docker by @amumu96 in #2668

Bug fixes

Documentation

Others

New Contributors

Full Changelog: v1.0.1...v1.1.0

v1.0.1

29 Nov 10:22
8dd5715

What's new in 1.0.1 (2024-11-29)

These are the changes in inference v1.0.1.

New features

Enhancements

Bug fixes

Documentation

New Contributors

Full Changelog: v1.0.0...v1.0.1