rLLM v0.2 release #251
jeffreysijuntan announced in Announcements
rLLM v0.2 Release (Blog Post)
We are excited to release rLLM v0.2, a major upgrade of our RL training framework. In v0.1, rLLM provided agent and OpenAI Gym-like environment abstractions to support training ReAct-style agents. In v0.2, we additionally introduce `AgentWorkflowEngine` and `AgentWorkflowTrainer`, more general abstractions that enable arbitrary agentic programs to be trained. Agent builders and researchers can now define multi-agent systems, complex workflows (e.g., solver-judge, planner-executor, MCTS), and agentic programs with custom reward functions, and train them with reinforcement learning without rewriting their production code.

Key Features in v0.2
- `verl==0.5.0` as the training backend, no custom verl fork anymore! `verl==0.5.0` comes with support of the following features, which are now supported in rLLM. (@kylemontgomery1)
- `AgentWorkflowEngine`, which enables passing in arbitrary agentic programs for training. (@kylemontgomery1)

What's Changed
New Contributors
Full Changelog: https://github.com/rllm-org/rllm/commits/v0.2.0
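To make the "arbitrary agentic programs" idea above concrete, here is a minimal, hypothetical sketch of a solver-judge workflow written as an ordinary Python function that returns a trajectory plus a scalar reward. All names here (`Step`, `Trajectory`, `solver_judge_workflow`, `fake_llm`) are illustrative inventions, not the actual rLLM v0.2 API; the point is only the shape of program that an `AgentWorkflowEngine`-style abstraction could accept for training.

```python
# Hypothetical sketch, NOT the rLLM API: a solver-judge workflow as a
# plain Python function producing a trajectory and a scalar reward.

from dataclasses import dataclass, field


@dataclass
class Step:
    role: str        # "solver" or "judge"
    prompt: str
    completion: str


@dataclass
class Trajectory:
    steps: list = field(default_factory=list)
    reward: float = 0.0


def fake_llm(prompt: str) -> str:
    # Stand-in for a model call so the sketch runs without a backend.
    return "42" if "solve" in prompt.lower() else "1.0"


def solver_judge_workflow(task: str) -> Trajectory:
    traj = Trajectory()
    # Solver proposes an answer to the task.
    answer = fake_llm(f"Solve: {task}")
    traj.steps.append(Step("solver", task, answer))
    # Judge scores the answer; its score becomes the RL reward.
    score = fake_llm(f"Judge this answer: {answer}")
    traj.steps.append(Step("judge", answer, score))
    traj.reward = float(score)
    return traj


traj = solver_judge_workflow("What is 6 * 7?")
print(traj.reward)      # scalar reward a trainer could optimize against
print(len(traj.steps))  # one solver step plus one judge step
```

Because the workflow is just a function with a reward attached to its output, the same code path could, in principle, serve both production inference and RL training, which is the "without rewriting their production code" claim above.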
This discussion was created from the release rLLM v0.2 release.