distinguish is_truncated and log stop_conditions #2009

Merged
rasdani merged 1 commit into main from daniel/specific-is-truncated
Mar 10, 2026

@rasdani rasdani commented Mar 10, 2026

  • breaks down is_truncated into overlong_prompt and generation_truncation, which were previously lumped together
  • adds verifiers stop_conditions to wandb logging

Note

Low Risk
Logging-only changes that add new metrics/columns without altering rollout generation or training behavior.

Overview
Adds stop_condition to the per-rollout results_df and emits new W&B metrics that break down truncation into generation truncation vs prompt_too_long.

Also logs the normalized rate of each observed stop_condition value (e.g. max_turns, has_error) to make rollout termination causes visible in monitoring.
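A minimal sketch of the per-value rate computation, reusing the pandas expression from this PR's diff. The toy `results_df` and its rows are invented for illustration; only the `stop_condition` column name and the `stop_condition/{sc}` key format come from the PR.

```python
import pandas as pd

# Toy per-rollout results; the rows are invented for illustration.
results_df = pd.DataFrame(
    {
        "example_id": [0, 0, 1, 1, 2],
        "stop_condition": ["max_turns", "has_error", "max_turns", None, "max_turns"],
    }
)

# Rate of each observed stop condition among rollouts that report one.
stop_condition_metrics = {
    f"stop_condition/{sc}": rate
    for sc, rate in results_df.stop_condition.dropna().value_counts(normalize=True).items()
}
print(stop_condition_metrics)
```

Because missing values are dropped before counting, the rates are normalized over rollouts that have a stop condition, so they sum to 1 across the observed values rather than across all rollouts.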

Written by Cursor Bugbot for commit 48ff293.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Generation truncated metric may never exclude prompt_too_long
    • Removed the broken generation_truncated metric that filtered on non-existent 'prompt_too_long' stop_condition value; the PR still logs actual stop_condition values via the dict comprehension.
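To illustrate why that metric could not separate the two cases, here is a small sketch with invented data. It assumes, as the finding states, that `stop_condition` never takes the value `prompt_too_long`; the inequality is then true for every row, and the expression reduces to the plain truncation rate.

```python
import pandas as pd

# Invented rollouts; per the finding, "prompt_too_long" never appears as a stop_condition.
results_df = pd.DataFrame(
    {
        "is_truncated": [True, True, False, True],
        "stop_condition": ["max_turns", "max_turns", "has_error", "max_turns"],
    }
)

# The removed metric: truncated rollouts whose stop_condition is not "prompt_too_long".
generation_truncated = (
    results_df.is_truncated & (results_df.stop_condition != "prompt_too_long")
).mean()

# With no matching rows to exclude, this equals the overall truncation rate.
overall_truncated = results_df.is_truncated.mean()
print(generation_truncated, overall_truncated)
```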

To push these changes, comment:

@cursor push e3393b3f68

Preview (e3393b3f68)
diff --git a/src/prime_rl/orchestrator/orchestrator.py b/src/prime_rl/orchestrator/orchestrator.py
--- a/src/prime_rl/orchestrator/orchestrator.py
+++ b/src/prime_rl/orchestrator/orchestrator.py
@@ -667,10 +667,7 @@
             "is_truncated/mean": results_df.groupby("example_id").is_truncated.mean().mean(),
             "is_truncated/max": results_df.groupby("example_id").is_truncated.mean().max(),
             "is_truncated/min": results_df.groupby("example_id").is_truncated.mean().min(),
-            "stop_condition/generation_truncated": (
-                results_df.is_truncated & (results_df.stop_condition != "prompt_too_long")
-            ).mean(),
-            # Log rate of each stop condition (e.g. max_turns, prompt_too_long, has_error)
+            # Log rate of each stop condition (e.g. gibberish, repetition)
             **{
                 f"stop_condition/{sc}": rate
                 for sc, rate in results_df.stop_condition.dropna().value_counts(normalize=True).items()

Comment thread src/prime_rl/orchestrator/orchestrator.py
@rasdani rasdani merged commit 51bd1ad into main Mar 10, 2026
12 checks passed