reproduce your results #106
Do you have any plans to make your training dataset public?
VITA-1.0 uses Mixtral as its base language model, while VITA-1.5 uses Qwen2.5-7B-Instruct. Currently, VITA-1.0 is deprecated, so let me explain the training stages for VITA-1.5:
As for the datasets used, they are not publicly available, but the majority of them consist of open-source data.
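If it helps to confirm which base LLM a released checkpoint was built on, one quick check is to read the Hugging Face-style config shipped with it. This is only a minimal sketch, not the repo's own tooling; the checkpoint path is hypothetical.

```python
# Minimal sketch: inspect a checkpoint's config.json to see which base LLM it uses.
import json
from pathlib import Path

def base_llm_of(checkpoint_dir: str) -> str:
    """Return the model_type and architectures recorded in the checkpoint config."""
    cfg = json.loads((Path(checkpoint_dir) / "config.json").read_text())
    return f'{cfg.get("model_type")} ({", ".join(cfg.get("architectures", []))})'

# A VITA-1.5 checkpoint would be expected to report a Qwen2-family model_type,
# while a VITA-1.0 checkpoint would report a Mixtral-family one.
print(base_llm_of("/path/to/VITA-1.5-checkpoint"))  # hypothetical path
```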
While loading the s1.2 model, the following issue appears. Is this normal? Loading the official model from your repo also runs into the same problem.
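The actual error text did not survive in this thread, but a common source of warnings when loading a per-stage checkpoint is a state dict that only partially matches the full model. The sketch below is an assumption about that failure mode, not the repo's own loading code; it shows how PyTorch's non-strict loading reports missing/unexpected keys instead of failing, which is usually harmless.

```python
# Minimal sketch: loading a partial stage checkpoint non-strictly reports
# missing/unexpected keys rather than raising an error.
import torch.nn as nn

full_model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Pretend the stage checkpoint only saved the first layer's weights.
partial_state = {k: v for k, v in full_model.state_dict().items() if k.startswith("0.")}

result = full_model.load_state_dict(partial_state, strict=False)
print("missing keys:", result.missing_keys)        # in the model but not the checkpoint
print("unexpected keys:", result.unexpected_keys)  # in the checkpoint but not the model
```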
As introduced in your paper, the VITA training process consists of 3 steps.
The first step is to fine-tune the LLM module.
The second step is multimodal alignment.
The third step is multimodal instruction tuning.
In your code, there are many training scripts. Can you indicate which script corresponds to which step of training?
Another question: in your source code the language model is Qwen2, so is the final language model Qwen2 or Mixtral 8x7B?
@wangxiongts @BradyFU @linhaojia13 @longzw1997