Skip to content

Align Gemini CLI requests with official Code Assist#2288

Open
ddmww wants to merge 3 commits intorouter-for-me:mainfrom
ddmww:codex/gemini-cli-parity
Open

Align Gemini CLI requests with official Code Assist#2288
ddmww wants to merge 3 commits intorouter-for-me:mainfrom
ddmww:codex/gemini-cli-parity

Conversation

@ddmww
Copy link
Copy Markdown

@ddmww ddmww commented Mar 23, 2026

No description provided.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the internal Gemini CLI integration to closely mirror the behavior of the official Code Assist client. The changes focus on standardizing HTTP request headers, particularly the User-Agent, and ensuring that request payloads for content generation and token counting adhere to the expected format. Furthermore, a robust and configurable retry mechanism has been implemented to improve the reliability of API calls, and new tests have been added to validate these alignments.

Highlights

  • Gemini CLI Version Update: Updated the GeminiCLIVersion constant to reflect a newer nightly build, aligning with the official Code Assist version.
  • Standardized User-Agent Generation: Refactored the GeminiCLIUserAgent function and introduced GeminiCLIUserAgentWithClient to allow for client-specific prefixes and surface metadata in the User-Agent string, enhancing compatibility with the official CLI.
  • Architecture Mapping Correction: Corrected the architecture mapping for 386 systems from x86 to ia32 to match Node.js-style platform strings used by Gemini CLI.
  • Enhanced Request Execution and Retry Logic: Introduced a new helper function executeGeminiCLIJSONRequest to centralize HTTP request execution, error handling, and retry logic for both Execute and CountTokens methods. This includes configurable retry attempts and delays.
  • Standardized Payload Construction: Created dedicated helper functions buildGeminiCLIGeneratePayload and buildGeminiCLICountTokensPayload to ensure consistent and official-like payload structures for generate content and count tokens requests, including the addition of user_prompt_id and session_id.
  • Removed X-Goog-Api-Client Header: The X-Goog-Api-Client header, previously sent to the Gemini CLI upstream, has been removed to align with the official Code Assist client behavior.
  • New Test Coverage: Added a new Go test file (gemini_cli_executor_test.go) to validate the updated request headers, body structure, and retry mechanisms. A Python script (test_gemini_cli_official.py) was also added to facilitate testing against the official Gemini CLI Code Assist upstream.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/docker-image.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant and well-executed refactoring to align the Gemini CLI executor with the official Code Assist client. The changes introduce a centralized and robust request execution logic with retries, which greatly improves the clarity and maintainability of the Execute, ExecuteStream, and CountTokens methods. The addition of comprehensive tests is also a great improvement. I have a couple of minor suggestions for improvement.

Comment on lines +674 to +676
if updated, err := sjson.SetRawBytes(payload, "request.contents", []byte(contents.Raw)); err == nil {
payload = updated
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error from sjson.SetRawBytes is silently ignored. While the contents.Raw is expected to be valid JSON, it's good practice to log errors for easier debugging in case of unexpected issues. Consider adding an else block to log the error.

if updated, err := sjson.SetRawBytes(payload, "request.contents", []byte(contents.Raw)); err == nil {
			payload = updated
		} else {
			log.Warnf("gemini cli executor: failed to set contents on countTokens payload: %v", err)
		}

}
}

return 0, nil, nil, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This return statement appears to be unreachable. The for loop above it seems to have return statements on all execution paths for the final iteration. If this code is indeed unreachable, it should be removed to avoid confusion. If the Go compiler requires this for a subtle reason I've missed, a comment explaining why this return is needed would be helpful.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef249b63f1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +440 to +441
payload := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false)
payload = buildGeminiCLICountTokensPayload(payload, baseModel)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve thinking preprocessing in CountTokens

Please keep the same preprocessing here that Execute still applies before generateContent. CountTokens now jumps straight from TranslateRequest to buildGeminiCLICountTokensPayload, so suffix-driven/normalized thinking settings never pass through thinking.ApplyThinking anymore. A request like gemini-2.5-pro(8192) will therefore be counted without the extra reasoning budget even though the actual generation call still sends it upstream, which makes /countTokens underreport prompt size for quota and context-limit checks.

Useful? React with 👍 / 👎.

Comment on lines +671 to +675
payload := []byte(`{"request":{"contents":[]}}`)
contents := gjson.GetBytes(body, "request.contents")
if contents.Exists() && strings.TrimSpace(contents.Raw) != "" {
if updated, err := sjson.SetRawBytes(payload, "request.contents", []byte(contents.Raw)); err == nil {
payload = updated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve system and tool fields in countTokens payload

buildGeminiCLICountTokensPayload reconstructs a fresh request and copies only request.contents before setting request.model. The Gemini CLI translators still place prompt-bearing data in other fields such as request.systemInstruction and request.tools (for example in internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go), so /countTokens now ignores system prompts and tool schemas that generateContent will actually send. That produces token counts that are too small for common requests using a system message or function tools.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b3aaf06859

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 11 to +12
env:
APP_NAME: CLIProxyAPI
DOCKERHUB_REPO: eceasy/cli-proxy-api
IMAGE_NAME: ghcr.io/ddmww/cliproxyapi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep publishing the default Docker Hub image

This workflow now only tags and pushes ghcr.io/ddmww/cliproxyapi, but the repo’s checked-in docker-compose.yml:3 still defaults to eceasy/cli-proxy-api:latest. After this lands, anyone following the default compose setup (or any automation still pulling the Docker Hub name) will stop receiving new releases even though the publish job succeeds, so the registry switch needs to be coordinated with the existing consumer defaults.

Useful? React with 👍 / 👎.

Comment on lines +442 to +443
payload := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false)
payload = buildGeminiCLICountTokensPayload(payload, baseModel)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply payload rules before building countTokens body

CountTokens now goes straight from TranslateRequest to buildGeminiCLICountTokensPayload, so it no longer runs the applyPayloadConfigWithRoot(..., "gemini", "request", ...) step that Execute and ExecuteStream still use. Because payload.default/override/filter are first-class config features (internal/config/config.go:263-274), any deployment that injects or strips prompt-bearing fields via payload rules will get token counts for the pre-config request, and /countTokens can approve requests that the actual generateContent path later sends with a different token footprint.

Useful? React with 👍 / 👎.

Comment on lines +442 to +443
payload := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false)
payload = buildGeminiCLICountTokensPayload(payload, baseModel)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restore aspect-ratio preprocessing in CountTokens

Execute still calls fixGeminiCLIImageAspectRatio before sending gemini-2.5-flash-image-preview requests, and that helper adds both a long instruction and a synthetic inlineData part whenever request.generationConfig.imageConfig.aspectRatio is set without an uploaded image (internal/runtime/executor/gemini_cli_executor.go:897-936). CountTokens now skips that preprocessing and counts the unmodified request, so these image-generation prompts will be undercounted and can slip past quota/context checks.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator

@luispater luispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary:
This PR moves the Gemini CLI executor closer to the official Code Assist request shape, but there are still a few blocking regressions that make it unsafe to merge as-is.

Key findings:

  • Blocking: .github/workflows/docker-image.yml now publishes to ghcr.io/ddmww/cliproxyapi. This repository belongs to router-for-me, so merging this would either push release images into a contributor-owned namespace or fail when the upstream repo tries to publish.
  • Blocking: buildGeminiCLICountTokensPayload only keeps request.contents and request.model, but OpenAI/Gemini system or developer messages are translated into request.systemInstruction. That means countTokens will undercount whenever a system prompt is present, so it no longer matches the actual generate request shape.
  • Major: the new stable session_id behavior is internally inconsistent. The implementation/test expect the same prompt to reuse the same session_id, but scripts/test_gemini_cli_official.py still generates a fresh UUID every time, which matches the current local behavior instead. The new Go test also fails locally for this reason.

Test plan:

  • Ran go test ./internal/runtime/executor -run 'TestGeminiCLI' and it failed at TestGeminiCLIExecute_UsesStableSessionIDForSamePrompt.
  • Ran go test ./internal/runtime/executor -run 'TestGeminiCLI(Execute_SendsOfficialHeadersAndBody|Execute_RetriesUsingConfiguredRequestRetry|ExecuteStream_UsesSSEPathWithoutLocalRetry|CountTokens_UsesOfficialBodyAndRetryConfig)$' and those passed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants