waybarrios / vllm-mlx Public

Pull requests: waybarrios/vllm-mlx

Labels 16 Milestones 0

17 Open 411 Closed

Pull requests list

fix(text-model-from-vlm): realize private lazy arrays before the model leaves the build thread

#614 opened Jun 12, 2026 by ursk Contributor

fix(ssd-cache): preserve original bfloat16 dtype across quantized spill

#612 opened Jun 12, 2026 by CBribiescas Contributor

Keep assistant tool_calls and tool messages on the MLLM chat path

#611 opened Jun 12, 2026 by waybarrios Owner

fix(engine): stop MLLM text route at the model's full config EOS set

#610 opened Jun 11, 2026 by ursk Contributor

Guard --mllm against continuous batching (silent empty output)

#601 opened Jun 6, 2026 by eejd Contributor

fix(step): add Step3p5 parser support

#598 opened Jun 6, 2026 by Thump604 Collaborator

fix(qwen3-xml): parse bare <function=> without <tool_call> wrapper

#597 opened Jun 6, 2026 by CBribiescas Contributor

feat(mllm): auto-extract audio from video_url on omni models

#591 opened Jun 3, 2026 by txdadlab

fix(gpt-oss): route harmony prompts through openai-harmony (refs #568)

#581 opened May 25, 2026 by CBribiescas Contributor

fix(llama-tool-parser): recognize Llama 3.1+ / 3.3 tool-call formats

#580 opened May 25, 2026 by CBribiescas Contributor

Add SimpleEngine prefix trie cache

#574 opened May 24, 2026 by Thump604 Collaborator

fix(gpt-oss): plumb harmony tool calls all the way through to response

#562 opened May 22, 2026 by CBribiescas Contributor

3 tasks

Reduce EngineCore idle polling

#552 opened May 20, 2026 by Thump604 Collaborator

Keep MLLM media stream on owner thread

#551 opened May 19, 2026 by Thump604 Collaborator

Keep VLM TextModel generation on owner thread

#543 opened May 17, 2026 by Thump604 Collaborator

fix: Qwen tool streaming recovery

#497 opened May 4, 2026 by kylejeske

perf: O(1) tool lookup in ToolExecutor via lazily-cached name index optimization

#449 opened Apr 26, 2026 by clickbrain Contributor
