Nathan Lambert

natolambert

https://www.natolambert.com/

AI & ML interests

Reinforcement learning, Ethics, Robotics, Dynamics Models

Articles

Ethics and Society Newsletter #4: Bias in Text-to-Image Models

Can foundation models label data like humans?

Creating a Coding Assistant with StarCoder

StackLLaMA: A hands-on guide to train LLaMA with RLHF

Red-Teaming Large Language Models

What Makes a Dialog Agent Useful?

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Stable Diffusion with 🧨 Diffusers

natolambert's activity

updated a dataset about 20 hours ago

allenai/reward-bench-results

Updated about 20 hours ago • 9.59k • 2

New activity in allenai/reward-bench about 20 hours ago

multilingual

#8 opened 8 days ago

updated a collection 3 days ago

2025 Artifacts

6 items • Updated 3 days ago

liked a model 3 days ago

microsoft/phi-4

Text Generation • Updated 3 days ago • 35.9k • 958

updated a collection 4 days ago

2025 Artifacts

6 items • Updated 3 days ago

liked a model 4 days ago

nvidia/Cosmos-1.0-Diffusion-14B-Text2World

Updated 1 day ago • 798 • 32

updated a collection 5 days ago

2025 Artifacts

6 items • Updated 3 days ago

liked 2 models 5 days ago

metagene-ai/METAGENE-1

Updated 2 days ago • 153 • 20

yulan-team/YuLan-Mini

Text Generation • Updated 8 days ago • 713 • 31

authored 9 papers 5 days ago

Objective Mismatch in Model-based Reinforcement Learning

Paper • 2002.04523 • Published Feb 11, 2020

Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Paper • 2308.00862 • Published Aug 1, 2023

A Survey on Data Selection for Language Models

Paper • 2402.16827 • Published Feb 26, 2024 • 4

D2PO: Discriminator-Guided DPO with Response Evaluation Models

Paper • 2405.01511 • Published May 2, 2024

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Paper • 2406.09279 • Published Jun 13, 2024 • 2

WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Paper • 2406.18495 • Published Jun 26, 2024 • 13

Towards a Framework for Openness in Foundation Models: Proceedings from the Columbia Convening on Openness in Artificial Intelligence

Paper • 2405.15802 • Published May 17, 2024

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 106

2 OLMo 2 Furious

Paper • 2501.00656 • Published 11 days ago • 15

updated a collection 6 days ago

2025 Artifacts

6 items • Updated 3 days ago