Li Dong

unilm

AI & ML interests

Language Model Pre-Training

Recent Activity

upvoted a paper 2 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

upvoted a paper 2 days ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

liked a dataset 4 days ago

kadirnar/fluxdev_controlnet_16k

unilm's activity

upvoted 2 papers 2 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 3 days ago • 175

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 3 days ago • 63

upvoted a paper 26 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 29 days ago • 136

upvoted a paper 30 days ago

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published about 1 month ago • 42

upvoted a paper 2 months ago

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 113

upvoted 3 papers 3 months ago

upvoted a paper 5 months ago

DataComp-LM: In search of the next generation of training sets for language models

Paper • 2406.11794 • Published Jun 17, 2024 • 50

upvoted a paper 6 months ago

Direct Preference Knowledge Distillation for Large Language Models

Paper • 2406.19774 • Published Jun 28, 2024 • 22

upvoted a paper 7 months ago

BEiT: BERT Pre-Training of Image Transformers

Paper • 2106.08254 • Published Jun 15, 2021 • 2

upvoted 2 papers 8 months ago

ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models

Paper • 2405.09220 • Published May 15, 2024 • 24

You Only Cache Once: Decoder-Decoder Architectures for Language Models

Paper • 2405.05254 • Published May 8, 2024 • 10

upvoted a paper 9 months ago

Multi-Head Mixture-of-Experts

Paper • 2404.15045 • Published Apr 23, 2024 • 59

upvoted a paper 10 months ago

Algorithmic progress in language models

Paper • 2403.05812 • Published Mar 9, 2024 • 18

upvoted 2 papers 11 months ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 606

Towards Optimal Learning of Language Models

Paper • 2402.17759 • Published Feb 27, 2024 • 16

upvoted 3 papers about 1 year ago

TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models

Paper • 2311.04589 • Published Nov 8, 2023 • 18

Does GPT-4 Pass the Turing Test?

Paper • 2310.20216 • Published Oct 31, 2023 • 17

Text Rendering Strategies for Pixel Language Models

Paper • 2311.00522 • Published Nov 1, 2023 • 10