Skip to content

Conversation

@hamishivi
Copy link
Collaborator

@hamishivi hamishivi commented Oct 16, 2025

Long-promised tool refactor, coming from the rl-rag branch with some help from claude.
This refactors tools such that they become their own ray actor, with information passed via ray's methods. This makes life easier, since now we (a) dont have one tool instantiation for every thread, (b) don't have to pickle the instantiated tools and send them through ray when creating vllm engines.

Changes:

  • Generally restructure the tool folders to make more sense
  • Add a readme describing how the tool setup works and how to add more tools.
  • open_instruct/tools/tool_actor.py now contains logic for tool actors, which can be added to via TOOL_CLASS_REGISTRY. grpo_fast.py uses the registry to automatically update the available tools. kwargs are automatically passed to the tool so long as they match the kwarg name.
  • brought back tool_vllm as a basic way to run tools without running the full actor, which can be nice for debugging. It might be a pain to maintain, though (vllm updates...). Open to a better way to manage this!
  • Allow tool outputs to set their own output strings (this was useful for the rl-rag project).
  • Provides a new argument, system_prompt_override_file to grpo_fast, that allows the end user to provide their own system prompt in a text file that overrides the given data. This makes tool stuff easier since you can edit the system prompt file without editing the whole HF dataset.

And some non-tool fixes:

I'll plumb through these changes one PR at a time, and mark them as done here when merged.

hamishivi and others added 30 commits October 16, 2025 09:23
Apply tool refactor from rulin-add-longform-search-reward branch:
- Create ToolActor and ToolProxy for Ray-based tool execution
- Separate base Tool classes from vLLM-specific code
- Add tool_vllm.py for vLLM integration (ToolUseLLM)
- Update grpo_fast.py to use actor-based tool initialization
- Add tool_max_concurrency parameter (default: 512)
- Support search, code, and mcp tools via registry pattern

Benefits:
- Better resource isolation with Ray actors
- Cleaner architecture with lazy tool loading
- Improved concurrency control
- No circular import issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Create new `open_instruct/tools/` directory with subdirectories:
  - `utils/`: Tool utilities (tool_vllm, tool_proxy, tool_actor, base tools)
  - `python_tool/`: Python code execution tool with server
  - `search_tool/`: Search tool implementation

- Move files to new locations:
  - tool_vllm.py, tool_proxy.py, tool_actor.py, tools.py → tools/utils/
  - Python tool and server → tools/python_tool/
  - Search tool files → tools/search_tool/

- Update all import statements across codebase:
  - grpo_fast.py
  - ppo_fast.py
  - vllm_utils3.py
  - test_tools.py
  - Tool registry in tool_actor.py

- Add __init__.py files for all new packages

This reorganization improves code structure by:
- Grouping related functionality together
- Separating tool implementations from infrastructure
- Making the codebase more navigable and maintainable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants