(WIP) adding apply_patch to react agent #2791

vncntt wants to merge 2 commits into UKGovernmentBEIS:main from
Conversation
pipmc
left a comment
Quick review; I'll re-review and check the algorithm in detail later today, or tomorrow if I can't get to it today.
@vncntt I haven't reviewed the patching algorithm in this file yet. Did you write it yourself, reuse another implementation, or generate it using an LLM? And if you wrote all or some of it yourself, what did you use as a specification/guide for implementing it?
```markdown
- OpenAI: Add native support for `apply_patch` tool calls in the Responses API, including status playback.
- Tools: Introduce an `apply_patch` tool harness for applying V4A diffs in sandbox environments.
```
Good that you've updated this. What does "status playback" mean here?
(Also, if the Inspect team approve this PR they might want the standard tools docs updated too)
```python
    return [
        {
            "type": "apply_patch_call_output",
            "call_id": message.tool_call_id or str(message.function),
            "status": status,
            "output": output_text,
        }
    ]
else:
```
Is there some reason you're not returning a ResponseApplyPatchToolCallOutput in this list?
```python
try:
    operation = validate_apply_patch_operation(operation_payload)
    arguments = operation.as_arguments()
except ValueError as ex:
    parse_error = str(ex)
    arguments = {APPLY_PATCH_ARGUMENT_KEY: operation_payload}
```
Shouldn't the OpenAI SDK do this validation for you? (see response_apply_patch_tool_call.py in the OAI Python SDK - it apparently uses Pydantic models to validate the objects going to/from the API)
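For reference, the parse-with-fallback pattern in this hunk works roughly like this (a minimal stand-alone sketch; `validate_apply_patch_operation` and `APPLY_PATCH_ARGUMENT_KEY` here are stand-ins for the PR's helpers, not the actual implementation):

```python
from typing import Optional, Tuple

# Stand-in for the PR's argument key constant (assumed name/value).
APPLY_PATCH_ARGUMENT_KEY = "input"


def validate_apply_patch_operation(payload: dict) -> dict:
    # Minimal stand-in validator: require a recognized operation type.
    if payload.get("type") not in {"create_file", "update_file", "delete_file"}:
        raise ValueError(f"unknown apply_patch operation: {payload.get('type')!r}")
    return payload


def parse_operation(operation_payload: dict) -> Tuple[dict, Optional[str]]:
    try:
        # Happy path: payload validates, use it as the tool arguments.
        return validate_apply_patch_operation(operation_payload), None
    except ValueError as ex:
        # Fallback: pass the raw payload through and record the parse error.
        return {APPLY_PATCH_ARGUMENT_KEY: operation_payload}, str(ex)
```

If the SDK's Pydantic models already validate these objects, the manual `ValueError` branch may be redundant.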
```python
operation_payload = output.operation.model_dump(exclude_none=True)
if output.id is not None:
    assistant_internal().tool_calls[output.call_id] = {
        "type": "apply_patch_call",
        "call_id": output.call_id,
        "status": output.status,
        "operation": operation_payload,
    }
```
Couldn't you just do:

```python
if output.id is not None:
    assistant_internal().tool_calls[output.call_id] = output.model_dump()
```

or is there some reason that won't achieve the right result?
```python
else:
    # create param
    tool_call_param: ResponseFunctionToolCallParam = dict(
        type="function_call",
        call_id=call.id,
        name=_responses_tool_alias(call.function),
        arguments=json.dumps(call.arguments),
    )
    if call.type == "apply_patch":
        operation = parse_apply_patch_arguments(call.arguments)
```
I think that instead of having a whole indented if/else block under the existing `else`, you should remove the first `else`, dedent the remaining `if` and `else`, and turn them into `elif` and `else` (easier to read).
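The suggested flattening looks roughly like this (a schematic with hypothetical condition and return values, not the PR's actual branches):

```python
def render_tool_call(call_type: str) -> str:
    # Before: the apply_patch check was an if/else nested under this else.
    # After: a single flat chain reads top to bottom.
    if call_type == "internal":          # the original first branch
        return "internal"
    elif call_type == "apply_patch":     # was nested inside the else
        return "apply_patch_call"
    else:                                # generic function-call fallback
        return "function_call"
```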
```python
else:  # delete_file
    return await _delete_file(workspace, patch_operation)
```
Maybe good to explicitly handle `delete_file` with an `elif`, and then error inside the `else` for an unrecognized operation, to handle the possibility of new operation types being added in the future.
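Something like this sketch, with synchronous stand-ins for the PR's `_create_file`/`_update_file`/`_delete_file` helpers (names and return values are illustrative only):

```python
def dispatch(op: dict) -> str:
    # Each recognized operation type gets an explicit branch; anything
    # else fails loudly instead of silently falling through to delete.
    if op["type"] == "create_file":
        return "created"
    elif op["type"] == "update_file":
        return "updated"
    elif op["type"] == "delete_file":
        return "deleted"
    else:
        raise ValueError(f"unrecognized apply_patch operation: {op['type']!r}")
```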
```python
@dataclass
class ApplyPatchOperation:
```
Shouldn't this be a Pydantic class instead of a regular dataclass? Or at least a pydantic dataclass for the validation, rather than rolling your own validation?
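A pydantic dataclass would look something like this (the field names and allowed operation types here are guesses at the apply_patch operation shape, not the PR's actual definition):

```python
from typing import Literal, Optional

from pydantic.dataclasses import dataclass


@dataclass
class ApplyPatchOperation:
    # Pydantic rejects any type outside this Literal at construction time,
    # so no hand-rolled validation is needed.
    type: Literal["create_file", "update_file", "delete_file"]
    path: str
    diff: Optional[str] = None  # assumed: only present for create/update
```

Constructing one with an unknown `type` then raises a validation error automatically instead of requiring a custom `validate_*` helper.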
```python
return [
    {
        "type": "apply_patch_call_output",
        "call_id": message.tool_call_id or str(message.function),
```
I can see that this (providing `or str(message.function)`) matches what the other branches in this part of the function do, but it's not obvious to me from the OAI docs that this would ever be an acceptable `call_id` for their API. Can you tell from the commit history to this file why the authors might have chosen to sometimes use `message.function` as the value of `call_id`?
https://platform.openai.com/docs/guides/tools-apply-patch
This PR contains:
What is the current behavior? (You can also link to an open issue here)
What is the new behavior?
Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
Other information: