vllm-project / vllm Public

Pull requests: vllm-project/vllm

3,817 Open 27,816 Closed

Pull requests list

[Quant] Add online NVFP4 dense-linear quantization (W4A16 + W4A4)
Labels: nvidia
#49060 opened Jul 18, 2026 by bhoomit (Contributor)

[Bugfix][DSv4][SM120] Skip empty sparse-MLA prefill chunks
Labels: bug (Something isn't working), nvidia
#49059 opened Jul 18, 2026 by ormandj • Draft

fix: LoRA Triton kernel OOB crash with NVFP4/EP padded buffers

#49058 opened Jul 18, 2026 by fattchris

[Bugfix] Fix false boolean values in YAML CLI configuration
Labels: bug
#49057 opened Jul 18, 2026 by hsusul

[Bugfix] Emit a valid media type from encode_{audio,image,video}_url
Labels: bug, multi-modality (Related to multi-modality, #4194)
#49056 opened Jul 18, 2026 by vineethsaivs

[Bugfix][MiniMax-M3] Avoid NaNs for empty sparse decode rows
Labels: bug
#49054 opened Jul 18, 2026 by thepowerfuldeez

[Bugfix] Reject negative --device-ids indices instead of selecting the wrong device
Labels: bug
#49053 opened Jul 18, 2026 by vineethsaivs

[Bugfix][KV Offload] Bound unaligned SWA loads by physical GPU blocks
Labels: bug, kv-connector, v1
#49052 opened Jul 18, 2026 by coltonottley

[Docs] quantized_kvcache: add measured performance section, replace deprecated gated example model
Labels: documentation (Improvements or additions to documentation)
#49050 opened Jul 18, 2026 by robertlangdonn

[Doc] Document SimpleCPUOffloadConnector in the KV offloading guide
Labels: documentation
#49048 opened Jul 18, 2026 by ayaangazali

[Rust Frontend] Extract request preparation from the inference path
Labels: rust
#49045 opened Jul 18, 2026 by sagearc (Contributor)

[ROCm] [Release] [Per-commit] Reenable per commit rocm wheel
Labels: ci/build, ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm)
#49044 opened Jul 18, 2026 by tjtanaa (Member) • 4 tasks

[Bugfix]Reject invalid FlashInfer MNNVL workspaces
Labels: bug, nvidia
#49043 opened Jul 18, 2026 by lengrongfu (Contributor) • 3 of 4 tasks

[Rust Frontend] Fix macro-based content format detection
Labels: rust
#49042 opened Jul 18, 2026 by reidliu41 (Contributor) • 4 tasks

[Core][Frontend] Add weight version tagging for RL rollouts
Labels: documentation, frontend, rust, v1
#49040 opened Jul 18, 2026 by ShuoleiWang

[Bugfix] Propagate VLLM_MARLIN_INPUT_DTYPE into INC/AutoRound quant path
Labels: bug
#49039 opened Jul 18, 2026 by lazypool

Make load_weights completely optional
Labels: deepseek (Related to DeepSeek models), gpt-oss (Related to GPT-OSS models), llama (Related to Llama models), mistral (Related to Mistral models), qwen (Related to Qwen models), speculative-decoding, v1
#49038 opened Jul 18, 2026 by hmellor (Member) • Draft

[Perf] Add per-phase cold-start startup span benchmark harness
Labels: performance (Performance-related issues)
#49037 opened Jul 18, 2026 by SuperMarioYL

fix(spec_decode): bypass embedding dim check for MTP speculative decoding methods
Labels: speculative-decoding, v1
#49036 opened Jul 18, 2026 by ArjunPakhan • 1 of 3 tasks

fix: handle missing parent modules in _has_module

#49035 opened Jul 18, 2026 by ShuhaoZhangTony

fix(v1): avoid false shutdown failures on clean exit
Labels: v1
#49034 opened Jul 18, 2026 by ShuhaoZhangTony

Revert "[Sampler] Stop upcasting logits to fp32 in apply_sampling_params" (#48641)
Labels: v1
#49033 opened Jul 18, 2026 by vllm-agent (Contributor) • Draft

Revert "[Bugfix] Fix activation quantization dispatch for WNA4Int/WNA8Int" (#48785)
Labels: bug
#49032 opened Jul 18, 2026 by vllm-agent (Contributor) • Draft

[Bugfix][Multimodal] Fix video temporal padding estimates
Labels: bug, multi-modality, qwen
#49030 opened Jul 18, 2026 by labAxiaoming (Contributor) • 4 tasks done

[Bugfix][Tool Parser] Implement streaming for phi4_mini_json parser
Labels: bug, tool-calling
#49028 opened Jul 18, 2026 by Yigtwxx
