Add Qwen2.5_VL Architecture Definition #579

qiuxy28 · 2025-05-14T08:16:03Z

When merging Qwen2.5-VL models with mergekit, the system fails with:
RuntimeError: Tensor visual.merger.mlp.1.bias required but not present in model Qwen/Qwen2.5-VL-7B-Instruct
The error occurs because visual.merger.mlp.1 is an activation function with no trainable parameters, but the architecture definition mergekit currently uses expects weight and bias tensors for it.
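
A minimal sketch of the mismatch, using only the standard library. The tensor names for indices 0 and 2, the helper `missing_tensors`, and both name lists are assumptions made for illustration; they are not taken from mergekit or from the actual checkpoint index.

```python
# Assumed tensor names for the merger MLP: indices 0 and 2 are layers with
# parameters, index 1 is the activation and stores nothing.
checkpoint_tensors = {
    "visual.merger.mlp.0.weight",
    "visual.merger.mlp.0.bias",
    "visual.merger.mlp.2.weight",
    "visual.merger.mlp.2.bias",
}


def missing_tensors(required, present):
    """Return the required tensor names that the checkpoint lacks, sorted."""
    return sorted(set(required) - set(present))


# Hypothetical definition that expects weight and bias at every mlp index.
naive_definition = [
    f"visual.merger.mlp.{i}.{kind}"
    for i in range(3)
    for kind in ("weight", "bias")
]

# Hypothetical definition that skips index 1, the parameter-free activation.
fixed_definition = [name for name in naive_definition if ".mlp.1." not in name]

naive_missing = missing_tensors(naive_definition, checkpoint_tensors)
fixed_missing = missing_tensors(fixed_definition, checkpoint_tensors)

print(naive_missing)  # names that would trigger a "required but not present" error
print(fixed_missing)  # empty when the definition matches the checkpoint
```

Under these assumptions, only the entries for index 1 are reported as missing, which matches the tensor named in the error above.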

Changes

Add Qwen2.5_VL Architecture Definition

Testing

Successfully merged Qwen2.5-VL models using both linear and TIES methods
Benchmark results show expected performance on multimodal tasks

github-actions · 2025-05-14T08:16:15Z

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.

I have read the CLA Document and I hereby sign the CLA

You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

qiuxy28 · 2025-05-14T08:20:18Z

I have read the CLA Document and I hereby sign the CLA

Add Qwen2.5VL

7f598dc
