(WIP) adding apply_patch to react agent #2791

Draft
vncntt wants to merge 2 commits into UKGovernmentBEIS:main from vncntt:apply_patch

Conversation

Contributor

@vncntt vncntt commented Nov 20, 2025

https://platform.openai.com/docs/guides/tools-apply-patch

This PR contains:

  • New features
  • Changes to dev-tools e.g. CI config / github tooling
  • Docs
  • Bug fixes
  • Code refactor

What is the current behavior? (You can also link to an open issue here)

What is the new behavior?

Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)

Other information:

Contributor

@pipmc pipmc left a comment

Quick review; I'll re-review and check the algorithm in detail later today, or tomorrow if I can't get to it today.

Contributor

@vncntt I haven't reviewed the patching algorithm in this file yet. Did you write it yourself, reuse another implementation, or generate it using an LLM? If you wrote all or some of it yourself, what did you use as a specification or guide for implementing it?

Comment on lines +6 to +7
- OpenAI: Add native support for `apply_patch` tool calls in the Responses API, including status playback.
- Tools: Introduce an `apply_patch` tool harness for applying V4A diffs in sandbox environments.
Contributor

Good that you've updated this. What does "status playback" mean here?

(Also, if the Inspect team approve this PR they might want the standard tools docs updated too)

Comment on lines +224 to 232
    return [
        {
            "type": "apply_patch_call_output",
            "call_id": message.tool_call_id or str(message.function),
            "status": status,
            "output": output_text,
        }
    ]
else:
Contributor

Is there some reason you're not returning a ResponseApplyPatchToolCallOutput in this list?

Comment on lines +632 to +637
try:
    operation = validate_apply_patch_operation(operation_payload)
    arguments = operation.as_arguments()
except ValueError as ex:
    parse_error = str(ex)
    arguments = {APPLY_PATCH_ARGUMENT_KEY: operation_payload}
Contributor

Shouldn't the OpenAI SDK do this validation for you? (see response_apply_patch_tool_call.py in the OAI Python SDK - it apparently uses Pydantic models to validate the objects going to/from the API)

Comment on lines +623 to +630
operation_payload = output.operation.model_dump(exclude_none=True)
if output.id is not None:
    assistant_internal().tool_calls[output.call_id] = {
        "type": "apply_patch_call",
        "call_id": output.call_id,
        "status": output.status,
        "operation": operation_payload,
    }
Contributor

Couldn't you just do:

if output.id is not None:
    assistant_internal().tool_calls[output.call_id] = output.model_dump()

or is there some reason that won't achieve the right result?

Comment on lines 952 to +1011
else:
# create param
tool_call_param: ResponseFunctionToolCallParam = dict(
type="function_call",
call_id=call.id,
name=_responses_tool_alias(call.function),
arguments=json.dumps(call.arguments),
)
if call.type == "apply_patch":
operation = parse_apply_patch_arguments(call.arguments)
Contributor

I think that instead of having a whole indented if/else block under the existing else, you should remove the first else, dedent the remaining if and else, and turn them into elif and else (easier to read).
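
A minimal sketch of the flattening suggested above, using a toy stand-in rather than the PR's actual function. The `computer_call` branch, the function names, and the returned strings are hypothetical; only the `call.type == "apply_patch"` check comes from the diff.

```python
def build_param_nested(call_type: str) -> str:
    # Shape under review: a second if/else nested inside the outer else.
    if call_type == "computer_call":  # hypothetical earlier branch
        return "computer_call_param"
    else:
        if call_type == "apply_patch":
            return "apply_patch_call"
        else:
            return "function_call"


def build_param_flat(call_type: str) -> str:
    # Suggested shape: one flat if/elif/else chain with the same behaviour.
    if call_type == "computer_call":  # hypothetical earlier branch
        return "computer_call_param"
    elif call_type == "apply_patch":
        return "apply_patch_call"
    else:
        return "function_call"
```

Both versions return the same value for every input; the flat one just has one less level of indentation to follow.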

Comment on lines +117 to +118
else:  # delete_file
    return await _delete_file(workspace, patch_operation)
Contributor

It might be good to handle delete_file explicitly with an elif, and then raise an error for an unrecognized op inside the else, in case new operations are added in future.
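
A rough sketch of what that could look like, assuming the three operation types are `create_file`, `update_file`, and `delete_file`. Only `delete_file` and `_delete_file` appear in the diff; the other handler names, the simplified signatures, and the error message are placeholders for illustration.

```python
import asyncio


# Hypothetical stand-ins for the real sandbox handlers in the PR.
async def _create_file(path: str) -> str:
    return f"created {path}"


async def _update_file(path: str) -> str:
    return f"updated {path}"


async def _delete_file(path: str) -> str:
    return f"deleted {path}"


async def apply_operation(op_type: str, path: str) -> str:
    # Every known operation gets its own branch...
    if op_type == "create_file":
        return await _create_file(path)
    elif op_type == "update_file":
        return await _update_file(path)
    elif op_type == "delete_file":
        return await _delete_file(path)
    else:
        # ...so an unknown operation fails loudly instead of being
        # silently treated as a delete.
        raise ValueError(f"Unrecognized apply_patch operation: {op_type!r}")
```

With this shape, a new operation type added upstream surfaces as a `ValueError` rather than as an unintended file deletion.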

Comment on lines +74 to +75
@dataclass
class ApplyPatchOperation:
Contributor

Shouldn't this be a Pydantic class instead of a regular dataclass? Or at least a pydantic dataclass for the validation, rather than rolling your own validation?

return [
    {
        "type": "apply_patch_call_output",
        "call_id": message.tool_call_id or str(message.function),
Contributor

I can see that this (providing or str(message.function)) is what the other options in this part of the function do, but it's not obvious to me from the OAI docs that this would ever be acceptable to their API. Can you tell from the commit history to this file why the authors might have chosen to sometimes use message.function as the value of call_id?
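
To make the concern concrete, here is a small sketch of how the `or` fallback behaves. `resolve_call_id` and `resolve_call_id_strict` are hypothetical helpers, not code from the PR, and whether the API would accept a function name as a `call_id` is exactly the open question.

```python
from typing import Optional


def resolve_call_id(tool_call_id: Optional[str], function: Optional[str]) -> str:
    # Mirrors the expression under review: any falsy tool_call_id
    # (None or "") falls back to the stringified function name.
    return tool_call_id or str(function)


def resolve_call_id_strict(tool_call_id: Optional[str]) -> str:
    # Hypothetical stricter alternative: refuse to invent a call_id.
    if not tool_call_id:
        raise ValueError("tool message has no tool_call_id")
    return tool_call_id
```

Under these assumptions, `resolve_call_id(None, "apply_patch")` yields `"apply_patch"` and `resolve_call_id(None, None)` yields the string `"None"`, neither of which corresponds to a call the model actually made.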

@vncntt vncntt marked this pull request as draft November 24, 2025 19:34

2 participants