chore: bump verifiers to main (v1 seq_len fix) by hallerite · Pull Request #2692 · PrimeIntellect-ai/prime-rl

hallerite · 2026-06-02T21:38:01Z

What

Bumps the deps/verifiers submodule e1d4f259 → 05c66c23 (current verifiers main).

The headline reason is the merged v1 token-usage fix (verifiers#1525): for v1 environments, token_usage.final_input_tokens/final_output_tokens were always 0 because the trajectory's responses are serialized to plain dicts and compute_context_token_metrics gated on isinstance(response, Response). That zeroed our wandb seq_len/* and progress/tokens metrics. v1 now records those metrics at write time from the live Response.
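The failure mode described above can be sketched in a few lines. This is a minimal illustration only, not the verifiers implementation: the `Usage` and `Response` dataclasses, `final_tokens_gated`, and `record_at_write_time` are hypothetical stand-ins, and the token counts are made-up values. It shows why an `isinstance` gate sees nothing once responses have been serialized to plain dicts, and how reading usage from the live object before serialization avoids that.

```python
from dataclasses import dataclass, asdict


@dataclass
class Usage:
    # Hypothetical usage container; field names are assumptions.
    input_tokens: int
    output_tokens: int


@dataclass
class Response:
    # Hypothetical stand-in for a live, typed model response.
    usage: Usage


def final_tokens_gated(trajectory):
    """Mimic the bug: only typed Response objects contribute to the totals."""
    final_in = final_out = 0
    for step in trajectory:
        if isinstance(step, Response):  # a plain dict never passes this check
            final_in = step.usage.input_tokens
            final_out = step.usage.output_tokens
    return final_in, final_out


def record_at_write_time(response):
    """Mimic the fix: read usage from the live object, then serialize."""
    record = asdict(response)
    record["final_input_tokens"] = response.usage.input_tokens
    record["final_output_tokens"] = response.usage.output_tokens
    return record


live = Response(usage=Usage(input_tokens=12, output_tokens=5))
serialized = [asdict(live)]            # what a serialized trajectory holds

gated = final_tokens_gated(serialized)  # dicts are skipped, totals stay at zero
typed = final_tokens_gated([live])      # typed objects are counted
written = record_at_write_time(live)    # usage captured before serialization
```

In this sketch the gated path returns zeros for the serialized trajectory, while the write-time record carries the token counts alongside the serialized payload.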

The bump also pulls in the other commits merged to verifiers main since e1d4f259 (Scale-SWE taskset, v1 sandbox runtime scaling, Multi-SWE patch alignment, unbounded harness max_turns, port-rebind fix).

Validation

Ran configs/debug/reverse_text_v1.toml end-to-end on GPU against the bumped submodule:

| metric | before | after |
| --- | --- | --- |
| `seq_len/all/mean` | 0 | 160.1 |
| `progress/tokens` | 0 | 21013 |

Full inference + orchestrator + trainer loop completed cleanly on the new main. uv.lock unchanged (verifiers is a workspace/path dep).

Note

Low Risk
Dependency-only submodule pointer update with no in-repo logic changes; validated on a debug v1 training run with improved metrics reporting.

Overview
Updates the deps/verifiers git submodule from e1d4f259 to 05c66c23 (current upstream main), so this repo picks up verifiers’ v1 token-usage fix: trajectory responses serialized as dicts no longer leave token_usage.final_input_tokens / final_output_tokens at zero, which restores meaningful seq_len/* and progress/tokens reporting (e.g. wandb) instead of flat zeros.

The same bump also brings in other recent verifiers main work (Scale-SWE taskset, v1 sandbox runtime scaling, Multi-SWE patch alignment, unbounded harness max_turns, port-rebind fix). uv.lock is unchanged because verifiers is a workspace/path dependency; validation noted a full reverse_text_v1 debug run with non-zero sequence-length and token progress metrics.

Reviewed by Cursor Bugbot for commit db56c75. Bugbot is set up for automated code reviews on this repo.

Advances deps/verifiers e1d4f259 -> 05c66c23, which includes the merged v1 token-usage fix (#1525): v1 envs now record final/context token usage at write time, so wandb seq_len and progress/tokens are no longer 0. Validated end-to-end with configs/debug/reverse_text_v1.toml: seq_len/all/mean 0 -> 160.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

hallerite marked this pull request as ready for review June 2, 2026 21:44

samsja approved these changes Jun 2, 2026

hallerite merged commit 8d88f5e into main Jun 2, 2026
22 checks passed

hallerite deleted the chore/bump-verifiers-v1-seqlen branch June 2, 2026 22:20
