Conversation
Cursor Bugbot has reviewed your changes and found 1 potential issue.
```diff
-if prefix_len > 0 and prefix_len <= len(step_routed):
-    sample.routed_experts[prefix_len - 1] = step_routed[prefix_len - 1]
-    sample.routed_experts.extend(step_routed[prefix_len:])
+sample.routed_experts = extend_routed_experts(sample.routed_experts, step_routed, prefix_len)
```
Missing null check crashes `extend_routed_experts` on `None`

High Severity

The condition guarding the `extend_routed_experts` call was narrowed from checking both `tokens.get("routed_experts") is not None` and `sample.routed_experts is not None` to only `sample.routed_experts is not None`. If the first step has routed experts (making `sample.routed_experts` non-`None`) but a subsequent step's `tokens["routed_experts"]` is `None`, then `step_routed` will be `None` and `extend_routed_experts(sample.routed_experts, None, prefix_len)` will crash inside `validate_routed_experts` when it accesses `None.dtype`.
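For clarity, here is a minimal sketch of the wider guard this comment is asking for, reusing the names visible in the diff above (`tokens`, `sample`, `step_routed`, `prefix_len`, `extend_routed_experts`); the wrapper function and surrounding structure are hypothetical, not the actual trainer code:

```python
def merge_step_routed_experts(sample, tokens: dict, prefix_len: int) -> None:
    """Hypothetical merge step for one trajectory step's routed experts."""
    step_routed = tokens.get("routed_experts")
    # Guard on both sides: if this step carries no routed experts, skip the merge
    # instead of passing None into extend_routed_experts, which would otherwise
    # crash inside validate_routed_experts when it touches None.dtype.
    if step_routed is not None and sample.routed_experts is not None:
        sample.routed_experts = extend_routed_experts(sample.routed_experts, step_routed, prefix_len)
```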
Reviewed by Cursor Bugbot for commit cd4ec22.


Summary
Adds Prime-RL support for the compact split routed-experts path used by the patched vLLM/router/verifiers stack:

- Routed-experts transport moves to compact `(shape, bytes)` payloads only (a hedged sketch follows the Local validation list below).
- New `inference.routed_experts_replay_max_blocks` setting for vLLM routed-experts replay cache sizing.
- `src/prime_rl/inference/vllm_state.md` documents the required vLLM fork/build contract.

Cross-repo PRs:

Pinned stack:

- `verifiers`: `162cffb`
- `vllm`: `0.20.2rc1.dev214+g24c0208fc.precompiled` (`437a618dd32400d2636e17de266061aa6685001653e6b9a78751e3ae53036e51`)
- `vllm-router`: kept at upstream `0.1.22` with a `TODO: update router wheel when ready` comment; P/D routed experts require router PR Add qwq #28 once the wheel is published.

Local validation:

- `uv sync --all-extras`
- `uv run ruff check .`
- `uv run ruff format --check .`
- `PYTEST_OUTPUT_DIR=/tmp/outputs uv run pytest tests/unit -m "not gpu"` (342 passed, 65 deselected)
- `uv run ruff check src/prime_rl/utils/monitor/wandb.py`
- `uv run ruff format --check src/prime_rl/utils/monitor/wandb.py`
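For illustration only, here is a minimal sketch of what a compact `(shape, bytes)` routed-experts payload could look like, assuming the expert ids travel as a base64-encoded `int16` buffer plus an explicit shape; the class and function names below are hypothetical, not the actual `transport/routed_experts.py` API:

```python
import base64
from dataclasses import dataclass

import numpy as np


@dataclass
class RoutedExpertsPayload:
    """Hypothetical compact payload: routed-expert ids as an int16 buffer plus its shape."""

    shape: tuple[int, ...]  # e.g. (num_tokens, num_moe_layers, top_k)
    data: str  # base64-encoded int16 bytes


def encode_routed_experts(ids: np.ndarray) -> RoutedExpertsPayload:
    # Force a contiguous int16 buffer so the byte layout is unambiguous.
    ids = np.ascontiguousarray(ids, dtype=np.int16)
    return RoutedExpertsPayload(
        shape=tuple(ids.shape),
        data=base64.b64encode(ids.tobytes()).decode("ascii"),
    )


def decode_routed_experts(payload: RoutedExpertsPayload) -> np.ndarray:
    buf = base64.b64decode(payload.data)
    return np.frombuffer(buf, dtype=np.int16).reshape(payload.shape)
```

Compared with nested Python lists, a dense buffer with an explicit shape keeps the HTTP payload small and makes alignment, concat, and slice checks cheap, since dtype and dimensions are known up front.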
Note

Medium Risk

Medium risk because it changes the routed-experts HTTP/transport payload format and replay indexing across multiple trainer models, and pins a custom `vllm` wheel / `verifiers` revision that can affect inference behavior.

Overview
- Adds end-to-end support for split prompt vs. completion routed experts in the vLLM P/D path by emitting `prompt_routed_experts` plus per-choice `routed_experts` from `/inference/v1/generate` using a compact base64 `int16` payload, and removing the chat endpoint's custom routed-experts capture override.
- Switches trainer/orchestrator transport from nested lists to a new `RoutedExperts` bytes+shape struct (`transport/routed_experts.py`), with strict alignment/concat/slice/padding helpers applied during trajectory interleaving, batch packing, and tensor materialization.
- Updates MoE model router replay to index routed experts by sparse MoE-layer order (not decoder-layer index), asserts layer-count consistency, and exposes `inference.routed_experts_replay_max_blocks` for the vLLM replay cache (see the sketch at the end of this section).
- Pins the patched stack by updating `verifiers` and the x86_64 `vllm` wheel source, and documents the required vLLM fork/build contract in `inference/vllm_state.md`; unit tests are updated to match the new payload/validation behavior.

Reviewed by Cursor Bugbot for commit 7c8adcf.
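As a rough illustration of the replay-indexing change above (indexing by sparse MoE-layer order rather than decoder-layer index), a hedged sketch follows; the module structure, the `set_replayed_experts` method, and the tensor layout are assumptions, not the PR's actual code:

```python
from typing import Protocol, Sequence

import torch


class MoELayer(Protocol):
    def set_replayed_experts(self, expert_ids: torch.Tensor) -> None: ...


def replay_routed_experts(moe_layers: Sequence[MoELayer], routed_experts: torch.Tensor) -> None:
    """Feed replayed expert ids to each sparse MoE layer by MoE-layer order.

    routed_experts is assumed to be (num_tokens, num_moe_layers, top_k) int16 ids,
    where the second axis counts only the sparse MoE layers, so dense decoder
    layers in between never shift the indexing.
    """
    # Layer-count consistency check, mirroring the assertion described above.
    assert routed_experts.shape[1] == len(moe_layers), (
        f"replay carries {routed_experts.shape[1]} MoE layers, model has {len(moe_layers)}"
    )
    for moe_idx, layer in enumerate(moe_layers):
        # Index by position among MoE layers (moe_idx), not by the decoder layer index.
        layer.set_replayed_experts(routed_experts[:, moe_idx, :])
```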