VideoChat-Flash Collection Faster and more powerful VideoChat. • 3 items • Updated about 10 hours ago • 1
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 3 days ago • 176
Riva Collection A family of Riva production (NVAIE) speech models that achieve state-of-the-art results on speech transcription, translation, and synthesis tasks. • 1 item • Updated about 13 hours ago • 3
YuLan-Mini Collection A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. • 5 items • Updated 13 days ago • 10
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought Paper • 2412.17498 • Published 19 days ago • 21
InternVL2.5-MPO Collection Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 1 day ago • 24
TACO Models Collection This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. • 3 items • Updated 22 days ago • 8
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 23 days ago • 122
Bamba Collection Collection of Bamba - hybrid Mamba2 model architecture based models trained on open data • 8 items • Updated 24 days ago • 18
Granite 3.1 Language Models Collection A series of language models with 128K context length trained by IBM licensed under Apache 2.0 license. • 8 items • Updated 24 days ago • 47