HabanaAI / vllm-fork Public

forked from vllm-project/vllm

Fork 72
Star 53

Pull requests: HabanaAI/vllm-fork

Labels 15 Milestones 0

43 Open 700 Closed

Pull requests list

support inc dynamic quant deepseek

#814 opened Feb 11, 2025 by changwangss

Fix guided decoding crashes

#811 opened Feb 10, 2025 by kzawora-intel

Rebase 2025-02-10

#810 opened Feb 10, 2025 by kzawora-intel

support inc dynamic quantization

#803 opened Feb 8, 2025 by changwangss

Qwen2 vl

#802 opened Feb 7, 2025 by malkomes • Draft

[SW-212072] Extend accuracy tests for models that we support

#800 opened Feb 7, 2025 by AnetaKaczynska

mszu/merged scheduler

#799 opened Feb 7, 2025 by szutenberg • Draft

[WIP] Updating docs for the vLLM 1.20 release

#798 opened Feb 7, 2025 by PatrykWo

Pin triton to v3.1.0 for HPU

#796 opened Feb 7, 2025 by iboiko-habana

Pin triton to v3.1.0 for HPU

#795 opened Feb 7, 2025 by iboiko-habana

Fix sporadic issue in async_engine/test_api_server tests

#794 opened Feb 7, 2025 by akarnows

Support qwenvl model for HPU

#793 opened Feb 7, 2025 by yingjie-han

[DEEPSEEK_V3/R1] includes features of fp8 dequant, MLA, Expert parallelism

#792 opened Feb 6, 2025 by xuechendi

Enable roberta embedding

#786 opened Feb 5, 2025 by yeonsily

Improve RMSNorm to support 2D inputs

#784 opened Feb 5, 2025 by YangQun1

[SW-207299] Recalc scales from user

#774 opened Feb 3, 2025 by linoybu

Fix warmup padding

#759 opened Jan 30, 2025 by mfylcek • Draft

Initial enablement for text-embedding

#758 opened Jan 30, 2025 by libinta

[DO NOT MERGE][PoC] Mark dynamic shapes in torch.compile mode

#755 opened Jan 29, 2025 by kzawora-intel • Draft

Pipeline Parallelism implementation.

#731 opened Jan 23, 2025 by jmaksymczuk • Draft

Allow tests to run in t.compile

#724 opened Jan 22, 2025 by Kacper-Pietkun

[DONOTMERGE] check fake-hpu build num

#722 opened Jan 22, 2025 by madamczykhabana • Draft

Delayed sampling

#720 opened Jan 22, 2025 by mfylcek • Draft

make benchmark_throughput static support single image input

#718 opened Jan 22, 2025 by yma11

multi-image support for llama3.2

#705 opened Jan 20, 2025 by yma11
