Ai2

Enterprise

non-profit

Verified

https://allenai.org/

allen_ai

allenai

AI & ML interests

Building breakthrough AI to solve the world's biggest problems.

Recent Activity

natolambert authored a paper 5 days ago

Objective Mismatch in Model-based Reinforcement Learning

natolambert authored a paper 5 days ago

Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

natolambert authored a paper 5 days ago

A Survey on Data Selection for Language Models

allenai's activity

natolambert

updated a dataset about 18 hours ago

allenai/reward-bench-results

Updated about 18 hours ago • 9.59k • 2

natolambert

in allenai/reward-bench about 18 hours ago

multilingual

#8 opened 8 days ago by

liujch1998

authored 12 papers 4 days ago

Don't throw away your value model! Making PPO even better via Value-Guided Monte-Carlo Tree Search decoding

Paper • 2309.15028 • Published Sep 26, 2023 • 1

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Paper • 2310.02255 • Published Oct 3, 2023 • 2

Crystal: Introspective Reasoners Reinforced with Self-Feedback

Paper • 2310.04921 • Published Oct 7, 2023 • 1

NaturalProofs: Mathematical Theorem Proving in Natural Language

Paper • 2104.01112 • Published Mar 24, 2021

Generated Knowledge Prompting for Commonsense Reasoning

Paper • 2110.08387 • Published Oct 15, 2021

Minds versus Machines: Rethinking Entailment Verification with Language Models

Paper • 2402.03686 • Published Feb 6, 2024 • 1

NaturalProver: Grounded Mathematical Proof Generation with Language Models

Paper • 2205.12910 • Published May 25, 2022

Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering

Paper • 2210.03078 • Published Oct 6, 2022 • 1

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Paper • 2406.09279 • Published Jun 13, 2024 • 2

AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text

Paper • 2410.04265 • Published Oct 5, 2024

Establishing Task Scaling Laws via Compute-Efficient Model Ladders

Paper • 2412.04403 • Published Dec 5, 2024 • 2

2 OLMo 2 Furious

Paper • 2501.00656 • Published 11 days ago • 15

natolambert

authored 6 papers 5 days ago

Objective Mismatch in Model-based Reinforcement Learning

Paper • 2002.04523 • Published Feb 11, 2020

Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

Paper • 2308.00862 • Published Aug 1, 2023

A Survey on Data Selection for Language Models

Paper • 2402.16827 • Published Feb 26, 2024 • 4

D2PO: Discriminator-Guided DPO with Response Evaluation Models

Paper • 2405.01511 • Published May 2, 2024

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback

Paper • 2406.09279 • Published Jun 13, 2024 • 2

WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Paper • 2406.18495 • Published Jun 26, 2024 • 13