Codebase for Preprint

"Reviving The Classics: Active Reward Modeling in Large Language Model Alignment"

Authors: Yunyi Shen, Hao Sun, Jean-Francois Ton. The first two authors contributed equally.

[Preprint] | [Embeddings (to be released here)]

This repo is part of a series of works on reward models in RLHF:

Part I. Reward Model Foundation (preprint, repo)
Part II. Active Reward Modeling (this repo)
Part III. Accelerating Reward Model Research with our Infra (coming soon)

Structure of the repo

The algorithms we tested are implemented in model. Two algorithms come from other authors: coreset (Huggins et al. 2016) in lrcoresets and BatchBALD (Kirsch et al. 2019) in batchbald_redux. We made only minimal modifications to these, to make sure they are compatible with our computing environment.
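For readers unfamiliar with the BALD family of acquisition functions mentioned above, here is a minimal sketch of plain BALD scoring for pairwise preference queries. It is an illustration only: it is not the code in model or batchbald_redux, it scores each pair independently rather than jointly as BatchBALD does, and the names `bald_scores`, `select_top_k`, `weights`, and `diffs` are hypothetical. It assumes a Bradley-Terry preference model whose parameters are represented by a set of posterior samples.

```python
import numpy as np


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) variable, elementwise."""
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))


def bald_scores(probs):
    """BALD score for each candidate comparison.

    probs has shape (n_samples, n_pairs); probs[s, i] is the probability
    that the first response in pair i is preferred under posterior sample s.
    The score is the entropy of the mean prediction minus the mean entropy
    of the per-sample predictions (a mutual information estimate).
    """
    predictive_entropy = binary_entropy(probs.mean(axis=0))
    expected_entropy = binary_entropy(probs).mean(axis=0)
    return predictive_entropy - expected_entropy


def select_top_k(probs, k):
    """Indices of the k pairs with the highest BALD score."""
    return np.argsort(-bald_scores(probs))[:k]


# Toy data (all hypothetical): posterior samples of a linear reward head
# and embedding differences between the two responses of each pair.
rng = np.random.default_rng(0)
n_samples, n_pairs, dim = 50, 200, 8
weights = rng.normal(size=(n_samples, dim))
diffs = rng.normal(size=(n_pairs, dim))

# Bradley-Terry preference probability: sigmoid of the reward difference.
probs = 1.0 / (1.0 + np.exp(-(weights @ diffs.T)))

chosen = select_top_k(probs, k=10)
print(chosen)
```

Pairs on which the posterior samples disagree most receive the highest scores. Independent top-k selection can pick near-duplicate pairs, which is the issue batch-aware methods such as BatchBALD are designed to address.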

The experiment code will be released soon, once we have removed the parts that are specific to our computing environment.

