fix: preserve Qwen3.5 broadcast weight names by samsja · Pull Request #2690 · PrimeIntellect-ai/prime-rl

samsja · 2026-06-02T02:03:56Z

What changed

Preserves Qwen3.5 HF hub weight names during live broadcast and checkpoint export.
Routes NCCL/filesystem broadcast and weight checkpoint export through a shared helper.
Updates the training monitor skill with the Qwen3.5 skipped-weight failure mode.

Why

Qwen3.5 VLM checkpoints already use model.language_model... HF hub naming. Calling transformers.core_model_loading.revert_weight_conversion on those keys rewrites them into the wrong namespace for vLLM live reload. In the failing runs, vLLM logged skipped keys like language_model.language_model..., so inference was not receiving updated LM/linear-attention weights after trainer updates.

With the bypass, raw HF keys are sent for model_type = "qwen3_5"; vLLM maps them to its internal language_model.model... keys and the weight updates load.
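The bypass described above can be sketched as follows. This is a minimal illustration, not the PR's actual helper: `export_state_dict`, `HUB_NATIVE_MODEL_TYPES`, and `toy_revert` are hypothetical names, the weight key is made up for the example, and the real `revert_weight_conversion` is replaced by an injected callable because its signature is not shown here.

```python
from typing import Callable, Dict, Mapping

# Assumption: model types whose checkpoints already use HF hub weight names.
HUB_NATIVE_MODEL_TYPES = frozenset({"qwen3_5"})

Revert = Callable[[Mapping[str, object]], Dict[str, object]]


def export_state_dict(
    state_dict: Mapping[str, object],
    model_type: str,
    revert: Revert,
) -> Dict[str, object]:
    """Return weights keyed by the names the inference side expects.

    Hypothetical shared helper: broadcast and checkpoint export would both
    call this instead of invoking the conversion directly.
    """
    if model_type in HUB_NATIVE_MODEL_TYPES:
        # Keys are already in hub format; converting again would rename them.
        return dict(state_dict)
    return revert(state_dict)


def toy_revert(state_dict: Mapping[str, object]) -> Dict[str, object]:
    """Toy stand-in for the real conversion (illustration only).

    It rewrites a leading "model." to "language_model." so the effect of an
    unwanted rename is visible; the real conversion rules are not shown here.
    """
    out: Dict[str, object] = {}
    for key, value in state_dict.items():
        if key.startswith("model."):
            key = "language_model." + key[len("model."):]
        out[key] = value
    return out


# Illustrative key name, not taken from a real checkpoint.
weights = {"model.language_model.layers.0.mlp.up_proj.weight": 1}

bypassed = export_state_dict(weights, "qwen3_5", toy_revert)
converted = export_state_dict(weights, "other_model", toy_revert)
```

In this sketch `bypassed` keeps the original key, while `converted` ends up with a doubled `language_model.language_model...` prefix, the same shape as the skipped keys reported in the failing runs.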

Validation

uv sync --all-extras
uv run ruff check src/prime_rl/trainer/weights.py src/prime_rl/trainer/ckpt.py src/prime_rl/trainer/rl/broadcast/nccl.py src/prime_rl/trainer/rl/broadcast/filesystem.py src/prime_rl/trainer/models/qwen3_5_moe/modeling_qwen3_5_moe.py
Slurm run 23166 reached trainer step 9 with Mismatch KL between 0.0005 and 0.0010; old skipped-weight logs were gone.

W&B: https://wandb.ai/primeintellect/wordle/runs/6f6cfafdf0274166ad038e7e79375f29

Note

Medium Risk
Transformers version and weight-export/broadcast behavior affect Qwen3.5 inference sync; dependency pin changes the whole training stack’s HF behavior.

Overview
Pins Transformers to 5.6.2 on PyPI (replacing the git pin and >=5.1.0.dev0 override) so training, checkpoints, and vLLM live reload share a single release aligned with Qwen3.5 fixes.

In the diff, broadcast/checkpoint paths still call revert_weight_conversion for non–PrimeRL models; only redundant inline comments were removed in ckpt.py and filesystem.py. Qwen3.5 workaround docstrings in model.py now say they can drop once an official Transformers release includes the fixes (not a specific git commit).

The start-run skill adds steps to verify verifier env packages import under uv run and how to wire missing local envs into pyproject.toml before rl launches.

Reviewed by Cursor Bugbot for commit 1143117. Bugbot is set up for automated code reviews on this repo.

S1ro1 · 2026-06-03T02:19:21Z

    "science-env",
    "simpleqa-verified",
    "tau2-bench",
+    "wordle",


is this intended?

S1ro1

lgtm boss

samsja force-pushed the exp/qwen35-kl-wordle branch from 3a3a31f to 13ee170 Compare June 2, 2026 18:15

samsja added 3 commits June 2, 2026 23:47

exp: add qwen35 kl debug configs

8f05d7a

exp: lower qwen35 wordle kl drift

e97cb9e

fix: preserve qwen3.5 broadcast weight names

099f5a5

samsja force-pushed the exp/qwen35-kl-wordle branch from 13ee170 to 099f5a5 Compare June 2, 2026 20:42

chore: keep qwen35 debug configs local

ac0e345

samsja changed the title ~~exp: add qwen35 kl debug configs~~ fix: preserve Qwen3.5 broadcast weight names Jun 2, 2026

samsja and others added 3 commits June 2, 2026 13:59

chore: clarify qwen3.5 weight naming bypass

2109945

fix: use upstream qwen3.5 conversion mapping

de676d5

chore: use released transformers package

511b0de

samsja marked this pull request as ready for review June 3, 2026 01:57

chore: drop qwen3.5 cp patch

2e36f92

S1ro1 reviewed Jun 3, 2026

View reviewed changes

chore: remove wordle env packaging

1143117

S1ro1 approved these changes Jun 3, 2026

View reviewed changes

samsja wants to merge 9 commits into main from exp/qwen35-kl-wordle
