Hi, a quick question: training on ImageNet with 10×4090 GPUs looks like it will take 20+ days. Is this speed expected? With the JAX version I can finish training (300 epochs) in about a day. Thanks!
Hi, for the second release we trained both the tokenizer and the stage-2 Transformer on 910B NPUs, while for the first release we used 32 V100 cards and it took about 4-5 days. May I ask: is your JAX version a port of this code to JAX, and does the accuracy match?
Hi~ How many cards and how long does it take to train the smallest IBQ model? @RobertLuo1
Hi, thanks for your interest in our work. For IBQ stage 1 we trained everything on 64 910B cards; the training time depends on the codebook size, roughly 4-7 days. For stage 2, the 300M model used 32 cards and the largest 2B model used 96 cards; the 2B model takes about 9 days to train, and the 300M about 4 days.