Is this model meant for full bfloat16, AMP bfloat16 or no bfloat16?

#7
by umarbutler - opened

The paper does not make it clear.

Answer.AI org

We trained ModernBERT with amp_bf16. We'll add that detail to our next arXiv preprint update. I imagine ModernBERT will work fine with fp32, amp_bf16, or bf16, although the latter might need additional finetuning depending on the use case.
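To make the distinction concrete, here is a minimal sketch of the two bf16 modes mentioned above, using a stand-in `torch.nn.Linear` module in place of ModernBERT itself (the module and sizes are hypothetical; the same pattern applies to a loaded ModernBERT model):

```python
import copy

import torch

# Hypothetical stand-in for an fp32 model such as ModernBERT.
model = torch.nn.Linear(8, 8)
x = torch.randn(2, 8)

# AMP bf16: weights stay fp32; autocast runs eligible ops in bfloat16.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    amp_out = model(x)

# Full bf16: cast a copy of the whole model (and inputs) to bfloat16.
# Per the reply above, this mode may need additional finetuning.
bf16_model = copy.deepcopy(model).to(torch.bfloat16)
full_out = bf16_model(x.to(torch.bfloat16))

print(amp_out.dtype, full_out.dtype)   # both bfloat16
print(model.weight.dtype)              # still float32 under AMP
```

The practical difference: AMP keeps an fp32 master copy of the weights and only lowers the precision of selected ops, while full bf16 stores the weights themselves in bfloat16.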
