-
Notifications
You must be signed in to change notification settings - Fork 452
Tool refactor #1086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
hamishivi
wants to merge
52
commits into
main
Choose a base branch
from
tool-refactor
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Tool refactor #1086
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Apply tool refactor from rulin-add-longform-search-reward branch: - Create ToolActor and ToolProxy for Ray-based tool execution - Separate base Tool classes from vLLM-specific code - Add tool_vllm.py for vLLM integration (ToolUseLLM) - Update grpo_fast.py to use actor-based tool initialization - Add tool_max_concurrency parameter (default: 512) - Support search, code, and mcp tools via registry pattern Benefits: - Better resource isolation with Ray actors - Cleaner architecture with lazy tool loading - Improved concurrency control - No circular import issues 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Create new `open_instruct/tools/` directory with subdirectories: - `utils/`: Tool utilities (tool_vllm, tool_proxy, tool_actor, base tools) - `python_tool/`: Python code execution tool with server - `search_tool/`: Search tool implementation - Move files to new locations: - tool_vllm.py, tool_proxy.py, tool_actor.py, tools.py → tools/utils/ - Python tool and server → tools/python_tool/ - Search tool files → tools/search_tool/ - Update all import statements across codebase: - grpo_fast.py - ppo_fast.py - vllm_utils3.py - test_tools.py - Tool registry in tool_actor.py - Add __init__.py files for all new packages This reorganization improves code structure by: - Grouping related functionality together - Separating tool implementations from infrastructure - Making the codebase more navigable and maintainable 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Long-promised tool refactor, coming from the rl-rag branch with some help from claude.
This refactors tools such that they become their own ray actor, with information passed via ray's methods. This makes life easier, since now we (a) dont have one tool instantiation for every thread, (b) don't have to pickle the instantiated tools and send them through ray when creating vllm engines.
Changes:
open_instruct/tools/tool_actor.pynow contains logic for tool actors, which can be added to viaTOOL_CLASS_REGISTRY.grpo_fast.pyuses the registry to automatically update the available tools. kwargs are automatically passed to the tool so long as they match the kwarg name.system_prompt_override_filetogrpo_fast, that allows the end user to provide their own system prompt in a text file that overrides the given data. This makes tool stuff easier since you can edit the system prompt file without editing the whole HF dataset.And some non-tool fixes:
.gitto runtime exclusion when initializing ray to avoid uploading the whole git file (ray complains this is too big).I'll plumb through these changes one PR at a time, and mark them as done here when merged.