feat(trainer): symmetric DPPO-Binary TV default loss (no KL, no advantage conditioning) #2434

Closed
samsja wants to merge 5 commits into main from feat/dppo-diff-default-loss

Commits

Commits on May 8, 2026

Commits on May 9, 2026