Conversation

@gpeal (Collaborator) commented Oct 24, 2025

It's pretty amazing we have gotten here without the ability for the model to see image content from MCP tool calls.

This PR builds off of #4391 and fixes #4819. I would like @KKcorps to get adequate credit here, but I also want to get this fix in ASAP. I gave him a week to update it and haven't gotten a response, so I'm going to take it across the finish line.

This test highlights how absurd the current situation is. I asked the model to read this image using the Chrome MCP:
[image]

After this change, it correctly outputs:

Captured the page: image shows a dark terminal-style UI labeled OpenAI Codex (v0.0.0) with prompt model: gpt-5-codex medium and working directory /codex/codex-rs
(and more)

Before this change, it said:

Took the full-page screenshot you asked for. It shows a long, horizontally repeating pattern of stylized people in orange, light-blue, and mustard clothing, holding hands in alternating poses against a white background. No text or other graphics, just rows of flat illustration stretching off to the right.

Without this change, the Figma, Playwright, Chrome, and other visual MCP servers are pretty much entirely useless.

I tested this change with the OpenAI Responses API as well as a third-party completions API.

@gpeal gpeal requested review from bolinfest and jif-oai October 24, 2025 06:00
@chatgpt-codex-connector (Contributor) bot left a comment

💡 Codex Review

ResponseItem::FunctionCallOutput { call_id, output } => {
    messages.push(json!({
        "role": "tool",
        "tool_call_id": call_id,
        "content": output.content,
    }));

P1: Send image outputs as arrays in chat completions payloads

The new FunctionCallOutputPayload now carries content_items so MCP tool results with images can be serialized as the array format expected by the Responses API. However, the chat-completions code path still injects output.content directly into the message body ({"role":"tool","content": output.content}), which is always a plain string. When a tool returns an image, output.content_items is populated but ignored here, so the request sent through the chat completions API remains a JSON string containing base64 data rather than the structured array that allows the model to render the image. Any run that uses chat completions instead of the responses API will continue to be blind to MCP images despite the new conversion logic.
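The direction of the fix can be sketched with simplified stand-in types (the real codex-core code uses serde_json values and the MCP content item types, so everything below is a hypothetical, std-only model): prefer the structured content_items array when it is present, and fall back to the legacy flat string otherwise.

```rust
// Hypothetical, pared-down shapes standing in for the real codex-core types.
#[derive(Debug, Clone, PartialEq)]
enum ContentItem {
    Text(String),
    ImageUrl(String), // e.g. a data: URL carrying base64 image bytes
}

struct FunctionCallOutputPayload {
    content: String,                          // legacy flat string form
    content_items: Option<Vec<ContentItem>>,  // structured form, set for MCP image results
}

// Sketch of the branch the review asks for: emit the structured array when
// it exists and is non-empty, otherwise fall back to the string content.
fn tool_message_content(output: &FunctionCallOutputPayload) -> Vec<ContentItem> {
    match &output.content_items {
        Some(items) if !items.is_empty() => items.clone(),
        _ => vec![ContentItem::Text(output.content.clone())],
    }
}

fn main() {
    let with_image = FunctionCallOutputPayload {
        content: "{\"image\":\"<base64...>\"}".to_string(),
        content_items: Some(vec![ContentItem::ImageUrl(
            "data:image/png;base64,...".to_string(),
        )]),
    };
    let text_only = FunctionCallOutputPayload {
        content: "ok".to_string(),
        content_items: None,
    };
    assert!(matches!(tool_message_content(&with_image)[0], ContentItem::ImageUrl(_)));
    assert!(matches!(tool_message_content(&text_only)[0], ContentItem::Text(_)));
    println!("structured when items are present, string fallback otherwise");
}
```

The key point is that the chat-completions path needs the same branching the Responses path already has, so image-bearing tool results are never flattened back into a base64 string.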

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@gpeal (Collaborator, Author) commented Oct 24, 2025

> [quoted the 💡 Codex Review P1 comment above]

@codex fix this

@chatgpt-codex-connector (Contributor) commented

Summary

  • Added the serde_json::Value import so the chat completions client can construct structured content payloads while retaining existing tracing context. (codex-rs/core/src/chat_completions.rs, L1-L34)
  • Updated the ResponseItem::FunctionCallOutput handling to emit content_items arrays when present, logging serialization failures and falling back to the legacy string form so MCP image outputs reach the model even via chat completions. (codex-rs/core/src/chat_completions.rs, L239-L257)

Testing

  • just fmt
  • just fix -p codex-core
  • cargo test -p codex-core (numerous integration suites such as suite::apply_patch_cli, suite::client, suite::json_result, suite::otel, etc. require the full CLI/runtime environment and failed or timed out in this sandbox)


@gpeal (Collaborator, Author) commented Oct 24, 2025

@codex review this again

@chatgpt-codex-connector (Contributor) commented

Codex Review: Didn't find any major issues. What shall we delve into next?


    structured_content,
} = call_tool_result;

let is_success = is_error != &Some(true);
A Collaborator commented:
You don't need the & since Option<bool> implements PartialEq

The author replied:

can't compare `&std::option::Option<bool>` with `std::option::Option<bool>`
the trait `std::cmp::PartialEq<std::option::Option<bool>>` is not implemented for `&std::option::Option<bool>`
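The compiler error settles it: `is_error` is borrowed out of the destructured struct as a `&Option<bool>`, so both sides of the comparison must have matching reference levels. A minimal std-only repro of the point:

```rust
// `is_error` comes out of a destructuring borrow as &Option<bool>,
// so the comparison needs a reference on both sides (or a deref).
fn is_success(is_error: &Option<bool>) -> bool {
    // Compiles: &Option<bool> compared with &Option<bool>.
    is_error != &Some(true)

    // Equivalent alternative: dereference first, compare owned values:
    //     *is_error != Some(true)
    // Does NOT compile: `is_error != Some(true)` —
    // PartialEq<Option<bool>> is not implemented for &Option<bool>.
}

fn main() {
    assert!(is_success(&None));
    assert!(is_success(&Some(false)));
    assert!(!is_success(&Some(true)));
    println!("reference levels must match on both sides of !=");
}
```

So the `&` in `is_error != &Some(true)` is required; the only alternative is dereferencing the left side instead.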

@gpeal (Collaborator, Author) commented Oct 24, 2025

@codex fix all of the comments

@pakrym-oai (Collaborator) commented

This PR can use a test or two

@chatgpt-codex-connector (Contributor) commented

Summary

Testing

  • just fmt
  • just fix -p codex-protocol
  • just fix -p codex-core
  • cargo test -p codex-protocol
  • ⚠️ cargo test -p codex-core (cancelled after a multi-minute build; the sandbox cannot finish the full integration suite)


@jif-oai (Collaborator) left a comment

Left just a comment for best practice, so it's not mandatory. Otherwise, LGTM.

}

impl FunctionCallOutputPayload {
    pub fn from_call_tool_result(call_tool_result: &CallToolResult) -> Self {
A Collaborator commented:

This could be an impl From

The author replied:

Good call!
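For illustration, the suggested shape can be sketched with hypothetical, pared-down stand-ins for the real types (the actual CallToolResult comes from the MCP protocol crate, and the real payload carries more fields): implementing `From<&CallToolResult>` instead of an inherent `from_call_tool_result` constructor.

```rust
// Hypothetical simplified types; the real ones live in the codex crates.
struct CallToolResult {
    is_error: Option<bool>,
    text: String, // stand-in for the real content fields
}

#[derive(Debug, PartialEq)]
struct FunctionCallOutputPayload {
    content: String,
    success: Option<bool>,
}

// The suggested best-practice shape: a From impl rather than an
// inherent constructor method.
impl From<&CallToolResult> for FunctionCallOutputPayload {
    fn from(call_tool_result: &CallToolResult) -> Self {
        let is_success = call_tool_result.is_error != Some(true);
        FunctionCallOutputPayload {
            content: call_tool_result.text.clone(),
            success: Some(is_success),
        }
    }
}

fn main() {
    let result = CallToolResult { is_error: None, text: "ok".to_string() };
    // Implementing From gives callers Into for free:
    let payload: FunctionCallOutputPayload = (&result).into();
    assert_eq!(payload.success, Some(true));
    println!("From<&CallToolResult> lets call sites use .into()");
}
```

Implementing `From` also gives every call site the reciprocal `Into` impl for free, which reads more idiomatically than a bespoke constructor name.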

@gpeal gpeal merged commit b0bdc04 into main Oct 27, 2025
20 checks passed
@gpeal gpeal deleted the gpeal/mcp-image branch October 27, 2025 21:55
@github-actions github-actions bot locked and limited conversation to collaborators Oct 27, 2025

Labels: None yet

Projects: None yet

Development: Successfully merging this pull request may close these issues: Image Responses from MCP tool calls

4 participants