
vLLM parameter configuration bug #3070

Open · 1 of 3 tasks
Liguoz opened this issue Mar 15, 2025 · 1 comment

Liguoz commented Mar 15, 2025

System Info

[screenshot omitted]

  • enable_chunked_prefill: each time it is set to false and the launch completes, reopening the launch dialog shows it reset to none.
  • enable_prefix_cache: the default parameter name is misspelled; it should be enable_prefix_caching (see the sketch below).
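For reference, a minimal sketch of the correctly spelled engine arguments as vLLM itself accepts them; these kwargs are forwarded to vllm.EngineArgs, and the model id is a placeholder:

```python
# Minimal sketch: correctly spelled vLLM engine arguments.
# Assumes vllm is installed; the model id below is a placeholder.
from vllm import LLM

llm = LLM(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
    enable_prefix_caching=True,    # correct spelling: "caching", not "cache"
    enable_chunked_prefill=False,  # should stay False once set, not revert to None
    gpu_memory_utilization=0.95,
    enforce_eager=True,
)
```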

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

Latest

The command used to start Xinference

xinference launch --model-name custom-DeepSeek-R1-Distill-Qwen-14B --model-type LLM --model-engine vLLM --model-format pytorch --size-in-billions 14 --quantization none --n-gpu auto --replica 1 --n-worker 1 --gpu-idx 0,1,2,3 --gpu_memory_utilization 0.95 --enforce_eager true
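Assuming extra underscore-style flags are forwarded to the vLLM engine in the same way as --gpu_memory_utilization and --enforce_eager above, the corrected option would presumably be passed as --enable_prefix_caching true on this command line; treat that flag name as an assumption until the spelling fix lands.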

Reproduction

Visible when launching the model.

Expected behavior

The parameter configuration takes effect as expected.

@XprobeBot XprobeBot added the gpu label Mar 15, 2025
@XprobeBot XprobeBot added this to the v1.x milestone Mar 15, 2025
qinxuye (Contributor) commented Mar 17, 2025

Regarding enable_prefix_caching:

#2998 (comment)

A PR to fix this is welcome.
