rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 3 days ago • 175
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 3 days ago • 63
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 29 days ago • 136
Multimodal Latent Language Modeling with Next-Token Diffusion Paper • 2412.08635 • Published about 1 month ago • 42
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 113
Data Selection via Optimal Control for Language Models Paper • 2410.07064 • Published Oct 9, 2024 • 8
Self-Boosting Large Language Models with Synthetic Preference Data Paper • 2410.06961 • Published Oct 9, 2024 • 16
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17, 2024 • 50
Direct Preference Knowledge Distillation for Large Language Models Paper • 2406.19774 • Published Jun 28, 2024 • 22
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published May 15, 2024 • 24
You Only Cache Once: Decoder-Decoder Architectures for Language Models Paper • 2405.05254 • Published May 8, 2024 • 10
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 606
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models Paper • 2311.04589 • Published Nov 8, 2023 • 18