
support vlm models for gguf format #654


Merged: 24 commits into main from hengguo/gguf_for_vlm on Jul 16, 2025

Conversation

@n1ck-guo (Contributor) commented on Jul 9, 2025

gguf-py VLM support list, with the models evaluated for each (quantization and inference); a usage sketch follows the list:

  • LlavaVisionModel: pixtral-12b, Mistral-Small-3.1-24B-Instruct-2503
  • SmolVLMModel: SmolVLM-256M-Instruct
  • Llama4VisionModel: Llama-4-Maverick-17B-128E-Instruct (packing)
  • Qwen2VLVisionModel: Qwen2-VL-2B-Instruct, Qwen2.5-VL-7B-Instruct
  • InternVisionModel
  • Gemma3VisionModel: gemma-3-12b-it
  • WhisperEncoderModel
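A minimal usage sketch for driving the new export path from Python with one of the models listed above. The AutoRoundMLLM entry point, the quantize_and_save helper, and the "gguf:q4_k_m" format string are assumptions modeled on the existing auto-round API and are not confirmed by this PR:

```python
# Hedged sketch: quantize a listed VLM and export it in GGUF format
# (text weights plus mmproj). API names below are assumptions based on
# auto-round's existing interface, not confirmed by this PR.
from transformers import AutoProcessor, AutoTokenizer, Qwen2VLForConditionalGeneration
from auto_round import AutoRoundMLLM

model_name = "Qwen/Qwen2-VL-2B-Instruct"  # one of the evaluated models above
model = Qwen2VLForConditionalGeneration.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

# 4-bit weight-only quantization, then export to a GGUF flavor.
autoround = AutoRoundMLLM(model, tokenizer, processor=processor, bits=4, group_size=128)
autoround.quantize_and_save("./Qwen2-VL-2B-Instruct-gguf", format="gguf:q4_k_m")
```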

@wenhuach21 (Contributor) commented:

It would be better to list the VLM models we support (or don't support) in RTN mode, if there are gaps between our coverage and the official one.
It would also be better to list the VLM models we support in tuning mode.

n1ck-guo added 2 commits July 9, 2025 03:24
@wenhuach21 (Contributor) commented:

If possible, add CPU unit tests to verify basic functionality, and CUDA unit tests for more advanced scenarios such as mixed-bit quantization, e.g. 8-bit for the visual module and 4-bit for the language module (see the sketch below).
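A rough sketch of what such a mixed-bit case could look like. The per-layer layer_config override, the iters=0 RTN-style path, and the "visual." module prefix are assumptions carried over from the text-model API and the Qwen2-VL layout; they may not match the final implementation:

```python
# Hedged sketch of a mixed-bit test case: 8-bit visual tower, 4-bit language model.
# layer_config, iters=0, and the "visual." prefix are assumptions; actual module
# names depend on the model and transformers version.
import torch
from transformers import AutoProcessor, AutoTokenizer, Qwen2VLForConditionalGeneration
from auto_round import AutoRoundMLLM

model_name = "Qwen/Qwen2-VL-2B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(model_name, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

# Send every Linear layer inside the vision tower to 8 bits; the rest of the
# network keeps the global 4-bit default.
layer_config = {
    name: {"bits": 8}
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear) and name.startswith("visual.")
}

autoround = AutoRoundMLLM(
    model,
    tokenizer,
    processor=processor,
    bits=4,
    group_size=128,
    iters=0,  # RTN-style fast path; raise iters for full tuning
    layer_config=layer_config,
)
autoround.quantize_and_save("./Qwen2-VL-2B-mixed-bit", format="gguf:q4_k_m")
```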

n1ck-guo and others added 6 commits July 10, 2025 21:52
n1ck-guo added 8 commits July 14, 2025 00:58
wenhuach21 self-requested a review on July 16, 2025 03:42
wenhuach21 changed the title from "support to export vlm gguf format, include text and mmproj model" to "support vlm models for gguf" on Jul 16, 2025
wenhuach21 changed the title from "support vlm models for gguf" to "support vlm models for gguf format" on Jul 16, 2025
n1ck-guo merged commit a2f3f01 into main on Jul 16, 2025
1 of 6 checks passed
n1ck-guo deleted the hengguo/gguf_for_vlm branch on July 16, 2025 05:31