(WIP) adding apply_patch to react agent #2791

vncntt wants to merge 2 commits into UKGovernmentBEIS:main from
Conversation
pipmc
left a comment
Quick review; I'll re-review and check the algorithm in detail later today, or tomorrow if I can't get to it today.
@vncntt I haven't reviewed the patching algorithm in this file yet. Did you write it yourself, reuse another implementation, or generate it using an LLM? And if you wrote all or some of it yourself, what did you use as a specification/guide for implementing it?
```markdown
- OpenAI: Add native support for `apply_patch` tool calls in the Responses API, including status playback.
- Tools: Introduce an `apply_patch` tool harness for applying V4A diffs in sandbox environments.
```
Good that you've updated this. What does "status playback" mean here?
(Also, if the Inspect team approve this PR they might want the standard tools docs updated too)
```python
    return [
        {
            "type": "apply_patch_call_output",
            "call_id": message.tool_call_id or str(message.function),
            "status": status,
            "output": output_text,
        }
    ]
else:
```
Is there some reason you're not returning a ResponseApplyPatchToolCallOutput in this list?
```python
try:
    operation = validate_apply_patch_operation(operation_payload)
    arguments = operation.as_arguments()
except ValueError as ex:
    parse_error = str(ex)
    arguments = {APPLY_PATCH_ARGUMENT_KEY: operation_payload}
```
Shouldn't the OpenAI SDK do this validation for you? (see response_apply_patch_tool_call.py in the OAI Python SDK - it apparently uses Pydantic models to validate the objects going to/from the API)
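For reference, the parse-with-fallback pattern in this hunk works roughly like this (a minimal stand-alone sketch; `validate_apply_patch_operation` and `APPLY_PATCH_ARGUMENT_KEY` here are stand-ins for the PR's helpers, not the actual implementation):

```python
from typing import Optional, Tuple

# Stand-in for the PR's argument key constant (assumed name/value).
APPLY_PATCH_ARGUMENT_KEY = "input"


def validate_apply_patch_operation(payload: dict) -> dict:
    # Minimal stand-in validator: require a recognized operation type.
    if payload.get("type") not in {"create_file", "update_file", "delete_file"}:
        raise ValueError(f"unknown apply_patch operation: {payload.get('type')!r}")
    return payload


def parse_operation(operation_payload: dict) -> Tuple[dict, Optional[str]]:
    try:
        # Happy path: payload validates, use it as the tool arguments.
        return validate_apply_patch_operation(operation_payload), None
    except ValueError as ex:
        # Fallback: pass the raw payload through and record the parse error.
        return {APPLY_PATCH_ARGUMENT_KEY: operation_payload}, str(ex)
```

If the SDK's Pydantic models already validate these objects, the manual `ValueError` branch may be redundant.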
```python
operation_payload = output.operation.model_dump(exclude_none=True)
if output.id is not None:
    assistant_internal().tool_calls[output.call_id] = {
        "type": "apply_patch_call",
        "call_id": output.call_id,
        "status": output.status,
        "operation": operation_payload,
    }
```
Couldn't you just do:

```python
if output.id is not None:
    assistant_internal().tool_calls[output.call_id] = output.model_dump()
```

or is there some reason that won't achieve the right result?
```python
else:
    # create param
    tool_call_param: ResponseFunctionToolCallParam = dict(
        type="function_call",
        call_id=call.id,
        name=_responses_tool_alias(call.function),
        arguments=json.dumps(call.arguments),
    )
    if call.type == "apply_patch":
        operation = parse_apply_patch_arguments(call.arguments)
```
I think that instead of having a whole indented if/else block under the existing `else`, you should remove the first `else`, dedent the remaining `if` and `else`, and turn them into `elif` and `else` (easier to read).
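The suggested flattening looks roughly like this (a schematic with hypothetical condition and return values, not the PR's actual branches):

```python
def render_tool_call(call_type: str) -> str:
    # Before: the apply_patch check was an if/else nested under this else.
    # After: a single flat chain reads top to bottom.
    if call_type == "internal":          # the original first branch
        return "internal"
    elif call_type == "apply_patch":     # was nested inside the else
        return "apply_patch_call"
    else:                                # generic function-call fallback
        return "function_call"
```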
```python
else:  # delete_file
    return await _delete_file(workspace, patch_operation)
```
Maybe good to explicitly handle `delete_file` with an `elif`, and then error inside the `else` for an unrecognized operation, to handle the possibility of new operation types being added in the future.
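Something like this sketch, with synchronous stand-ins for the PR's `_create_file`/`_update_file`/`_delete_file` helpers (names and return values are illustrative only):

```python
def dispatch(op: dict) -> str:
    # Each recognized operation type gets an explicit branch; anything
    # else fails loudly instead of silently falling through to delete.
    if op["type"] == "create_file":
        return "created"
    elif op["type"] == "update_file":
        return "updated"
    elif op["type"] == "delete_file":
        return "deleted"
    else:
        raise ValueError(f"unrecognized apply_patch operation: {op['type']!r}")
```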
```python
@dataclass
class ApplyPatchOperation:
```
Shouldn't this be a Pydantic class instead of a regular dataclass? Or at least a pydantic dataclass for the validation, rather than rolling your own validation?
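A pydantic dataclass would look something like this (the field names and allowed operation types here are guesses at the apply_patch operation shape, not the PR's actual definition):

```python
from typing import Literal, Optional

from pydantic.dataclasses import dataclass


@dataclass
class ApplyPatchOperation:
    # Pydantic rejects any type outside this Literal at construction time,
    # so no hand-rolled validation is needed.
    type: Literal["create_file", "update_file", "delete_file"]
    path: str
    diff: Optional[str] = None  # assumed: only present for create/update
```

Constructing one with an unknown `type` then raises a validation error automatically instead of requiring a custom `validate_*` helper.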
```python
return [
    {
        "type": "apply_patch_call_output",
        "call_id": message.tool_call_id or str(message.function),
```
I can see that this (providing `or str(message.function)`) matches what the other branches in this part of the function do, but it's not obvious to me from the OAI docs that this would ever be an acceptable `call_id` for their API. Can you tell from the commit history to this file why the authors might have chosen to sometimes use `message.function` as the value of `call_id`?
https://platform.openai.com/docs/guides/tools-apply-patch
This PR contains:
What is the current behavior? (You can also link to an open issue here)
What is the new behavior?
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
Other information: