raft_study

Recent Activity

hendrydong authored a paper 19 days ago

Offline Reinforcement Learning for LLM Multi-Step Reasoning

hendrydong authored a paper 3 months ago

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

hendrydong authored a paper 5 months ago

ThinK: Thinner Key Cache by Query-Driven Pruning

models 4

raftrsf/sfr_raft_iter5_2epoch

Text Generation • Updated Jun 17, 2024 • 22

raftrsf/sfr_raft_iter4_2epoch

Text Generation • Updated Jun 13, 2024 • 22

raftrsf/sfr_raft_iter4

Text Generation • Updated Jun 13, 2024 • 21

raftrsf/pair_pref

Text Generation • Updated May 18, 2024 • 20

datasets 8

raftrsf/sfr_concise_iter5_top1

Viewer • Updated Jun 14, 2024 • 20k • 31

raftrsf/sfr_concise_iter5_k32_with_rewards

Viewer • Updated Jun 14, 2024 • 20k • 29

raftrsf/sfr_concise_iter4_top1

Viewer • Updated Jun 12, 2024 • 20k • 34

raftrsf/sfr_concise_iter4_k32_with_rewards

Viewer • Updated Jun 12, 2024 • 20k • 53

raftrsf/ipo_eval_data_baseline.json

Viewer • Updated May 18, 2024 • 7.62k • 32

raftrsf/zephyr_pi0_gen_57k_for_offline_dpo_ipo

Viewer • Updated May 7, 2024 • 57.5k • 30

raftrsf/iterative_ipo_pm_iter1_n4

Viewer • Updated Apr 25, 2024 • 13.5k • 30

raftrsf/iterative_ipo_pm_iter1

Viewer • Updated Apr 24, 2024 • 13.5k • 30