
fix: re-schedule empty trajectories to guarantee full groups #2014

Merged
samsja merged 1 commit into main from fix/empty-trajectory-group-filtering
Mar 11, 2026

@mikasenghaas (Member) commented Mar 10, 2026

Summary

  • Empty trajectories were filtered after group completion and buffer sampling, which could yield groups with fewer than `rollouts_per_example` rollouts
  • This broke the advantage computation, which reshapes rewards with `.view(-1, rollouts_per_example)`: it either crashed on misaligned tensors or silently computed wrong per-problem baselines
  • Now empty trajectories are detected per-rollout as they arrive. When one is found, `rollouts_to_schedule` is incremented so the group naturally re-fills, and only complete groups are yielded
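
To illustrate the failure mode described above, here is a minimal pure-Python sketch of a per-group baseline. It is a stand-in for the tensor reshape, not the repository's code: `group_advantages` and the sample reward lists are hypothetical, and the mean-baseline advantage is an assumption about what the real computation does.

```python
def group_advantages(rewards, rollouts_per_example):
    """Hypothetical stand-in for rewards.view(-1, rollouts_per_example).

    Splits a flat reward list into consecutive groups and subtracts each
    group's mean (assumed here to be the per-problem baseline).
    """
    if len(rewards) % rollouts_per_example != 0:
        # Analogous to the reshape failing on a misaligned tensor.
        raise ValueError("rewards do not divide evenly into groups")
    advantages = []
    for start in range(0, len(rewards), rollouts_per_example):
        group = rewards[start:start + rollouts_per_example]
        baseline = sum(group) / len(group)
        advantages.extend(r - baseline for r in group)
    return advantages


# Three problems (A, B, C) with two rollouts each, all non-empty.
complete = [1.0, 0.0,  0.0, 0.0,  1.0, 1.0]

# Same batch after dropping one empty rollout from A and one from B.
# The length (4) still divides by 2, so nothing crashes, but the first
# "group" now mixes a rollout from A with a rollout from B.
misaligned = [1.0, 0.0,  1.0, 1.0]

# One dropped rollout only: length 5 no longer divides by 2.
crashing = [1.0, 0.0, 0.0, 1.0, 1.0]
```

Calling `group_advantages(misaligned, 2)` returns without error even though its first baseline averages rewards from two different problems, which is the silent case the summary warns about; `group_advantages(crashing, 2)` raises instead.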

🤖 Generated with Claude Code


Note

Medium Risk
Touches core rollout scheduling/group completion logic; a bug could cause groups to stall or skew batch composition, but the change is small and localized.

Overview
Prevents incomplete rollout groups by handling empty trajectories before group completion.

In `Scheduler.generate_batch`, each finished rollout is now checked for an empty trajectory. An empty result increments `batch/empty_rollouts`, logs a warning, and increases `group.rollouts_to_schedule` so the group naturally re-fills. The previous post-sampling filtering of empty rollouts is removed.
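
The re-fill behaviour can be sketched roughly as follows. This is a simplified, synchronous illustration and not the actual `Scheduler.generate_batch`, which schedules in-flight requests. Only `rollouts_to_schedule` and `rollouts_per_example` come from the PR; `Group`, `collect_full_group`, `run_rollout`, and the `max_attempts` guard are hypothetical names added for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical per-example bookkeeping for one rollout group."""
    example_id: int
    rollouts_to_schedule: int
    completed: list = field(default_factory=list)


def collect_full_group(example_id, rollouts_per_example, run_rollout,
                       max_attempts=100):
    """Collect exactly rollouts_per_example non-empty trajectories.

    run_rollout is an assumed callable returning a trajectory (a list of
    steps); an empty list stands for an empty trajectory.
    """
    group = Group(example_id, rollouts_to_schedule=rollouts_per_example)
    empty_rollouts = 0  # would feed a metric such as batch/empty_rollouts
    attempts = 0
    while group.rollouts_to_schedule > 0:
        if attempts >= max_attempts:
            # Guard added for this sketch so a broken environment cannot
            # loop forever; the PR does not describe such a limit.
            raise RuntimeError("too many empty trajectories")
        attempts += 1
        group.rollouts_to_schedule -= 1  # one request goes in flight
        trajectory = run_rollout(example_id)
        if not trajectory:
            # Empty result: count it and ask for a replacement rollout.
            empty_rollouts += 1
            group.rollouts_to_schedule += 1
            continue
        group.completed.append(trajectory)
    # Only reached once the group holds the full count of non-empty rollouts.
    return group, empty_rollouts
```

Because the check happens as each rollout finishes, a group is handed on only after it is full, so downstream code can keep assuming a fixed group size.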

Written by Cursor Bugbot for commit b43629f.

…completion

Empty trajectories were filtered after sampling from the buffer, which
could yield incomplete groups and break the advantage computation's
assumption that each group has exactly `rollouts_per_example` rollouts.

Now empty trajectories are detected per-rollout as they complete. When
one is found, `rollouts_to_schedule` is incremented so the group
naturally re-fills via `_fill_inflight_requests`, and the group is only
yielded once it has the full count of non-empty rollouts.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@mikasenghaas mikasenghaas requested a review from samsja March 10, 2026 22:19
@samsja samsja merged commit 00a4752 into main Mar 11, 2026
10 checks passed