Codebase for Preprint
[ Preprint ] | [Embeddings (To be released here)]
This repository is part of a series of works on reward models in RLHF:
- Part I. Reward Model Foundation preprint, repo
- Part II. Active Reward Modeling (This repo)
- Part III. Accelerating Reward Model Research with our Infra. (SOON)
The algorithms we tested are implemented in `model`. Two of them come from other authors: coreset (Huggins et al. 2016) in `lrcoresets` and BatchBALD (Kirsch et al. 2019) in `batchbald_redux`. We made only minimal modifications to these so that they are compatible with our computing environment.
The experiment code will be released soon, once we have removed the parts that are specific to our computing environment.