reproduce your results #106

Open
zzksdu opened this issue Feb 17, 2025 · 3 comments

zzksdu commented Feb 17, 2025

As introduced in your paper, the VITA training process consists of three steps: the first step fine-tunes the LLM module, the second is multimodal alignment, and the third is multimodal instruction tuning.

In your code, there are many training scripts. Can you indicate which script corresponds to which step of training?

Another question: in your source code the language model is Qwen2, so is the final language model Qwen2 or Mixtral 8x7B?
@wangxiongts @BradyFU @linhaojia13 @longzw1997

zzksdu commented Feb 18, 2025

Do you have any plans to make your training dataset public?

linhaojia13 (Collaborator) commented

VITA-1.0 uses Mixtral as its base language model, while VITA-1.5 uses Qwen2.5-7B-Instruct. Currently, VITA-1.0 is deprecated, so let me explain the training stages for VITA-1.5:

  • pretrain_mlp_qwen_nodes.sh: Stage 1.1
  • finetune_qwen_nodes.sh: Stage 1.2
  • finetuneTask_qwen_nodes.sh: Stage 1.3
  • finetuneTaskNeg_qwen_nodes.sh: Stage 2.2
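
For reference, a minimal sketch of the launch order this mapping implies. The script/train/ paths and the plain bash invocation are assumptions, not verified against the repo:

```python
# Hypothetical driver reflecting the stage order above; the script names come
# from the reply, but the script/train/ directory layout is an assumption.
import subprocess

STAGES = [
    ("Stage 1.1", "script/train/pretrain_mlp_qwen_nodes.sh"),
    ("Stage 1.2", "script/train/finetune_qwen_nodes.sh"),
    ("Stage 1.3", "script/train/finetuneTask_qwen_nodes.sh"),
    ("Stage 2.2", "script/train/finetuneTaskNeg_qwen_nodes.sh"),
]

for name, script in STAGES:
    print(f"Launching {name}: {script}")
    subprocess.run(["bash", script], check=True)  # stop if a stage fails
```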

As for the datasets used, they are not publicly available, but the majority of them consist of open-source data.


zzksdu commented Mar 12, 2025

@linhaojia13

1. In the paper, Stage 1.3 unfreezes the vision tower + MLP + LLM. Why do finetuneTask_qwen_nodes.sh and finetune_qwen_nodes.sh differ here? finetune_qwen_nodes.sh adds --unfreeze_vision_tower, but the Stage 1.3 script does not include this parameter.

[screenshot]

2. Another question: when loading the Stage 1.2 model, the warnings below appear. Is this normal? (A minimal reproduction sketch follows this list.)

[screenshot]

Loading the official model from your repo runs into the same issue.

[screenshot]
3. Can pretrain_audio_mlp_qwen_nodes.sh be regarded as the Stage 2.1 training script?
4. The Stage 2.2 training script sets --freeze_audio_encoder True --freeze_audio_encoder_adapter False, which differs from the paper, where both the vision tower and the audio encoder are active. So should the actual configuration follow the paper here? (See the generic freeze-flag sketch after this comment.)
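
Regarding question 2, a self-contained sketch of why such warnings are often benign: when a checkpoint from an earlier stage is loaded with strict=False, modules added afterwards simply show up as missing keys and keep their fresh initialization. The module names here are hypothetical, not VITA's actual ones:

```python
# Sketch: loading an older-stage checkpoint into a model that gained a new
# module. The new module's parameters are reported as missing keys; nothing
# else is affected.
import torch.nn as nn

class Stage12Model(nn.Module):          # hypothetical "old" model
    def __init__(self):
        super().__init__()
        self.llm = nn.Linear(8, 8)

class Stage13Model(nn.Module):          # hypothetical "new" model
    def __init__(self):
        super().__init__()
        self.llm = nn.Linear(8, 8)
        self.audio_adapter = nn.Linear(8, 8)  # module added after Stage 1.2

old_state = Stage12Model().state_dict()
result = Stage13Model().load_state_dict(old_state, strict=False)
print(result.missing_keys)     # ['audio_adapter.weight', 'audio_adapter.bias']
print(result.unexpected_keys)  # []
```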
cc. @BradyFU @wangxiongts @lxysl
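
For context on question 4, freeze flags of this kind typically just toggle requires_grad on the corresponding submodule. A generic sketch; the flag-to-module mapping is an assumption, not VITA's actual code:

```python
# Generic sketch of how --freeze_audio_encoder / --freeze_audio_encoder_adapter
# style flags are usually applied; the module names are hypothetical.
import torch.nn as nn

def apply_freeze_flags(audio_encoder: nn.Module,
                       audio_adapter: nn.Module,
                       freeze_audio_encoder: bool,
                       freeze_audio_encoder_adapter: bool) -> None:
    for p in audio_encoder.parameters():
        p.requires_grad = not freeze_audio_encoder
    for p in audio_adapter.parameters():
        p.requires_grad = not freeze_audio_encoder_adapter

# With the Stage 2.2 script's settings, the encoder stays frozen while the
# adapter trains -- the asymmetry the question points out.
encoder, adapter = nn.Linear(4, 4), nn.Linear(4, 4)
apply_freeze_flags(encoder, adapter,
                   freeze_audio_encoder=True,
                   freeze_audio_encoder_adapter=False)
print(any(p.requires_grad for p in encoder.parameters()))  # False
print(all(p.requires_grad for p in adapter.parameters()))  # True
```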
