fix: preserve responses continuation state for openai-compatible upstreams #2307
xpcrack wants to merge 1 commit into router-for-me:main
Conversation
Summary of Changes (Gemini Code Assist): This pull request enhances the API's ability to manage conversational continuity for OpenAI-compatible upstreams. By introducing a caching mechanism for previous request and response states, the system can now merge those states into subsequent continuation requests. This ensures the full conversational context is preserved and correctly transmitted even when the upstream service does not natively support continuation IDs, making multi-turn interactions more robust.
Code Review
This pull request introduces a valuable feature to preserve conversation state for OpenAI-compatible upstreams by caching response turns. The implementation correctly handles both streaming and non-streaming responses and includes good regression test coverage. My feedback focuses on improving the robustness of the new JSON manipulation logic by adding more comprehensive error handling, and I've also suggested a refactoring to improve code organization and reduce coupling between different parts of the handler logic.
```go
normalized, errDelete := sjson.DeleteBytes(rawJSON, "previous_response_id")
if errDelete != nil {
	normalized = bytes.Clone(rawJSON)
}
```
The error from sjson.DeleteBytes is not handled correctly. If sjson.DeleteBytes fails, the code proceeds with a clone of the original rawJSON. This means the previous_response_id field is not removed and will be sent to the upstream, which this change is intended to prevent. This could lead to unexpected behavior or errors from upstreams that don't support this field. The error should be properly handled and returned to the caller.
```go
normalized, errDelete := sjson.DeleteBytes(rawJSON, "previous_response_id")
if errDelete != nil {
	return nil, &interfaces.ErrorMessage{
		StatusCode: http.StatusBadRequest,
		Error:      fmt.Errorf("failed to remove previous_response_id from request: %w", errDelete),
	}
}
```

```go
if !gjson.GetBytes(normalized, "model").Exists() {
	modelName := strings.TrimSpace(gjson.GetBytes(lastRequest, "model").String())
	if modelName != "" {
		normalized, _ = sjson.SetBytes(normalized, "model", modelName)
	}
}
if !gjson.GetBytes(normalized, "instructions").Exists() {
	instructions := gjson.GetBytes(lastRequest, "instructions")
	if instructions.Exists() {
		normalized, _ = sjson.SetRawBytes(normalized, "instructions", []byte(instructions.Raw))
	}
}
```
Errors from sjson.SetBytes (line 127) and sjson.SetRawBytes (line 133) are ignored when setting the model and instructions from the previous request. If these operations fail, the normalized request might be missing a required model field, leading to a failure when the request is sent to the upstream. These errors should be handled to ensure request integrity.
```go
for _, payload := range websocketJSONPayloadsFromChunk(chunk) {
	if gjson.GetBytes(payload, "type").String() != wsEventTypeCompleted {
		continue
	}
	h.rememberCompletedResponse(requestJSON, payload)
}
```
This function rememberCompletedResponseFromChunk uses websocketJSONPayloadsFromChunk and wsEventTypeCompleted which are defined in openai_responses_websocket.go. This creates a tight coupling between the turn state logic and websocket-specific implementation details. Furthermore, the names are misleading as this logic is also used for HTTP SSE streaming.
To improve modularity and clarity, consider moving these shared components to a more neutral location. For example, websocketJSONPayloadsFromChunk could be renamed to parseSSEPayloads and moved to this file (openai_responses_turn_state.go) or a new sse.go file within the package. Similarly, wsEventTypeCompleted could be renamed to something like SSEEventResponseCompleted.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: caac256b33
```go
previousResponseID := strings.TrimSpace(gjson.GetBytes(rawJSON, "previous_response_id").String())
if previousResponseID == "" {
	return rawJSON, nil
}
lastRequest, lastResponseOutput, ok := h.turnState.load(previousResponseID)
```
Namespace continuation cache by auth context
Continuation lookup is keyed only by previous_response_id, so the handler applies cached turn state without checking which authenticated client created it. In a multi-tenant deployment, if a response ID is exposed across users (for example via logs or shared tooling), a different user can attach that ID and have another tenant’s prior input/output merged into their request, leaking prompt context and corrupting conversation state. Scope cache keys by auth/client identity before loading and merging.
```go
c.entries.Store(responseID, responsesTurnStateEntry{
	request: bytes.Clone(requestJSON),
	output:  bytes.Clone(outputJSON),
	expire:  time.Now().Add(responsesTurnStateTTL),
})
```
Add proactive eviction for turn-state cache
The cache stores every completed response with a TTL, but expired entries are only removed when that exact key is loaded later. Since many response IDs are never referenced again, stale entries remain in sync.Map indefinitely and memory usage will grow over time under normal traffic. Add proactive cleanup (background sweep or size-bounded eviction) so one-shot turns do not accumulate permanently.
luispater left a comment:
Summary:
This PR improves /v1/responses continuation for OpenAI-compatible upstreams by caching completed turns and reconstructing continuation input when previous_response_id is present. The core behavior and targeted regression tests are solid.
Key findings:
- Blocking: The new turn-state cache is unbounded in practice. Entries expire logically, but cleanup only happens on exact-key reads. Since response IDs are generally not read again after expiration, memory can grow without bound over time.
- Non-blocking: Continuation normalization currently requires `input` to be an array and may reject valid string shorthand input in cache-hit flows.
- Non-blocking: Missing cache-eviction/boundedness tests.
Test plan:
- Reviewed PR metadata, file-level diff, and full patch via `gh`/script context collection.
- Reviewed CI checks (intentionally ignored `translator-path-guard` per instruction).
- Did not run local tests or check out the PR branch in this review pass.
Requested changes:
Please add a deterministic cache cleanup/bounding strategy (for example periodic sweep, size cap + eviction policy, or both) and corresponding tests before approval.
Summary
- Preserve `/v1/responses` continuation state for openai-compatible upstreams that do not support `previous_response_id`
- Normalize string `input` into a user message before merging continuation input

Testing
- `go test ./sdk/api/handlers/openai -count=1`
- Exercised `/v1/responses` continuation against a live `gpt-5.4` openai-compatible upstream for both non-streaming and streaming flows