
support vlm models for gguf format #654


Merged: 24 commits into main from hengguo/gguf_for_vlm on Jul 16, 2025

Conversation

@n1ck-guo (Contributor) commented on Jul 9, 2025

gguf-py VLM support list, with the models evaluated for each (quantization and inference); a usage sketch follows the list:

  • LlavaVisionModel: pixtral-12b, Mistral-Small-3.1-24B-Instruct-2503
  • SmolVLMModel: SmolVLM-256M-Instruct
  • Llama4VisionModel: Llama-4-Maverick-17B-128E-Instruct (packing)
  • Qwen2VLVisionModel: Qwen2-VL-2B-Instruct, Qwen2.5-VL-7B-Instruct
  • InternVisionModel
  • Gemma3VisionModel: gemma-3-12b-it
  • WhisperEncoderModel
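A minimal usage sketch for driving the new export path from Python with one of the models listed above. The AutoRoundMLLM entry point, the quantize_and_save helper, and the "gguf:q4_k_m" format string are assumptions modeled on the existing auto-round API and are not confirmed by this PR:

```python
# Hedged sketch: quantize a listed VLM and export it in GGUF format
# (text weights plus mmproj). API names below are assumptions based on
# auto-round's existing interface, not confirmed by this PR.
from transformers import AutoProcessor, AutoTokenizer, Qwen2VLForConditionalGeneration
from auto_round import AutoRoundMLLM

model_name = "Qwen/Qwen2-VL-2B-Instruct"  # one of the evaluated models above
model = Qwen2VLForConditionalGeneration.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

# 4-bit weight-only quantization, then export to a GGUF flavor.
autoround = AutoRoundMLLM(model, tokenizer, processor=processor, bits=4, group_size=128)
autoround.quantize_and_save("./Qwen2-VL-2B-Instruct-gguf", format="gguf:q4_k_m")
```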

@wenhuach21 (Contributor) commented:

It would be better to list the VLM models we support (or don't support) in RTN mode, if there are gaps between our coverage and the official one.
It would also be better to list the VLM models we support in tuning mode.

n1ck-guo added 2 commits July 9, 2025 03:24
@wenhuach21 (Contributor) commented:

If possible, add CPU unit tests to verify basic functionality, and CUDA unit tests for more advanced scenarios such as mixed-bit quantization, e.g. 8-bit for the visual module and 4-bit for the language module (see the sketch below).
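A rough sketch of what such a mixed-bit case could look like. The per-layer layer_config override, the iters=0 RTN-style path, and the "visual." module prefix are assumptions carried over from the text-model API and the Qwen2-VL layout; they may not match the final implementation:

```python
# Hedged sketch of a mixed-bit test case: 8-bit visual tower, 4-bit language model.
# layer_config, iters=0, and the "visual." prefix are assumptions; actual module
# names depend on the model and transformers version.
import torch
from transformers import AutoProcessor, AutoTokenizer, Qwen2VLForConditionalGeneration
from auto_round import AutoRoundMLLM

model_name = "Qwen/Qwen2-VL-2B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

# Send every Linear layer inside the vision tower to 8 bits; the rest of the
# network keeps the global 4-bit default.
layer_config = {
    name: {"bits": 8}
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear) and name.startswith("visual.")
}

autoround = AutoRoundMLLM(
    model,
    tokenizer,
    processor=processor,
    bits=4,
    group_size=128,
    iters=0,  # RTN-style fast path; raise iters for full tuning
    layer_config=layer_config,
)
autoround.quantize_and_save("./Qwen2-VL-2B-mixed-bit", format="gguf:q4_k_m")
```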

n1ck-guo and others added 6 commits July 10, 2025 21:52
n1ck-guo added 8 commits July 14, 2025 00:58
wenhuach21 self-requested a review on July 16, 2025 03:42
wenhuach21 changed the title from "support to export vlm gguf format, include text and mmproj model" to "support vlm models for gguf" on Jul 16, 2025
wenhuach21 changed the title from "support vlm models for gguf" to "support vlm models for gguf format" on Jul 16, 2025
n1ck-guo merged commit a2f3f01 into main on Jul 16, 2025
1 of 6 checks passed
n1ck-guo deleted the hengguo/gguf_for_vlm branch on July 16, 2025 05:31