Align Gemini CLI requests with official Code Assist#2288
Align Gemini CLI requests with official Code Assist#2288ddmww wants to merge 3 commits intorouter-for-me:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the internal Gemini CLI integration to closely mirror the behavior of the official Code Assist client. The changes focus on standardizing HTTP request headers, particularly the User-Agent, and ensuring that request payloads for content generation and token counting adhere to the expected format. Furthermore, a robust and configurable retry mechanism has been implemented to improve the reliability of API calls, and new tests have been added to validate these alignments. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request is a significant and well-executed refactoring to align the Gemini CLI executor with the official Code Assist client. The changes introduce a centralized and robust request execution logic with retries, which greatly improves the clarity and maintainability of the Execute, ExecuteStream, and CountTokens methods. The addition of comprehensive tests is also a great improvement. I have a couple of minor suggestions for improvement.
| if updated, err := sjson.SetRawBytes(payload, "request.contents", []byte(contents.Raw)); err == nil { | ||
| payload = updated | ||
| } |
There was a problem hiding this comment.
The error from sjson.SetRawBytes is silently ignored. While the contents.Raw is expected to be valid JSON, it's good practice to log errors for easier debugging in case of unexpected issues. Consider adding an else block to log the error.
if updated, err := sjson.SetRawBytes(payload, "request.contents", []byte(contents.Raw)); err == nil {
payload = updated
} else {
log.Warnf("gemini cli executor: failed to set contents on countTokens payload: %v", err)
}| } | ||
| } | ||
|
|
||
| return 0, nil, nil, nil |
There was a problem hiding this comment.
This return statement appears to be unreachable. The for loop above it seems to have return statements on all execution paths for the final iteration. If this code is indeed unreachable, it should be removed to avoid confusion. If the Go compiler requires this for a subtle reason I've missed, a comment explaining why this return is needed would be helpful.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ef249b63f1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| payload := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false) | ||
| payload = buildGeminiCLICountTokensPayload(payload, baseModel) |
There was a problem hiding this comment.
Preserve thinking preprocessing in CountTokens
Please keep the same preprocessing here that Execute still applies before generateContent. CountTokens now jumps straight from TranslateRequest to buildGeminiCLICountTokensPayload, so suffix-driven/normalized thinking settings never pass through thinking.ApplyThinking anymore. A request like gemini-2.5-pro(8192) will therefore be counted without the extra reasoning budget even though the actual generation call still sends it upstream, which makes /countTokens underreport prompt size for quota and context-limit checks.
Useful? React with 👍 / 👎.
| payload := []byte(`{"request":{"contents":[]}}`) | ||
| contents := gjson.GetBytes(body, "request.contents") | ||
| if contents.Exists() && strings.TrimSpace(contents.Raw) != "" { | ||
| if updated, err := sjson.SetRawBytes(payload, "request.contents", []byte(contents.Raw)); err == nil { | ||
| payload = updated |
There was a problem hiding this comment.
Preserve system and tool fields in countTokens payload
buildGeminiCLICountTokensPayload reconstructs a fresh request and copies only request.contents before setting request.model. The Gemini CLI translators still place prompt-bearing data in other fields such as request.systemInstruction and request.tools (for example in internal/translator/gemini-cli/openai/chat-completions/gemini-cli_openai_request.go), so /countTokens now ignores system prompts and tool schemas that generateContent will actually send. That produces token counts that are too small for common requests using a system message or function tools.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b3aaf06859
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| env: | ||
| APP_NAME: CLIProxyAPI | ||
| DOCKERHUB_REPO: eceasy/cli-proxy-api | ||
| IMAGE_NAME: ghcr.io/ddmww/cliproxyapi |
There was a problem hiding this comment.
Keep publishing the default Docker Hub image
This workflow now only tags and pushes ghcr.io/ddmww/cliproxyapi, but the repo’s checked-in docker-compose.yml:3 still defaults to eceasy/cli-proxy-api:latest. After this lands, anyone following the default compose setup (or any automation still pulling the Docker Hub name) will stop receiving new releases even though the publish job succeeds, so the registry switch needs to be coordinated with the existing consumer defaults.
Useful? React with 👍 / 👎.
| payload := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false) | ||
| payload = buildGeminiCLICountTokensPayload(payload, baseModel) |
There was a problem hiding this comment.
Apply payload rules before building countTokens body
CountTokens now goes straight from TranslateRequest to buildGeminiCLICountTokensPayload, so it no longer runs the applyPayloadConfigWithRoot(..., "gemini", "request", ...) step that Execute and ExecuteStream still use. Because payload.default/override/filter are first-class config features (internal/config/config.go:263-274), any deployment that injects or strips prompt-bearing fields via payload rules will get token counts for the pre-config request, and /countTokens can approve requests that the actual generateContent path later sends with a different token footprint.
Useful? React with 👍 / 👎.
| payload := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false) | ||
| payload = buildGeminiCLICountTokensPayload(payload, baseModel) |
There was a problem hiding this comment.
Restore aspect-ratio preprocessing in CountTokens
Execute still calls fixGeminiCLIImageAspectRatio before sending gemini-2.5-flash-image-preview requests, and that helper adds both a long instruction and a synthetic inlineData part whenever request.generationConfig.imageConfig.aspectRatio is set without an uploaded image (internal/runtime/executor/gemini_cli_executor.go:897-936). CountTokens now skips that preprocessing and counts the unmodified request, so these image-generation prompts will be undercounted and can slip past quota/context checks.
Useful? React with 👍 / 👎.
luispater
left a comment
There was a problem hiding this comment.
Summary:
This PR moves the Gemini CLI executor closer to the official Code Assist request shape, but there are still a few blocking regressions that make it unsafe to merge as-is.
Key findings:
- Blocking:
.github/workflows/docker-image.ymlnow publishes toghcr.io/ddmww/cliproxyapi. This repository belongs torouter-for-me, so merging this would either push release images into a contributor-owned namespace or fail when the upstream repo tries to publish. - Blocking:
buildGeminiCLICountTokensPayloadonly keepsrequest.contentsandrequest.model, but OpenAI/Gemini system or developer messages are translated intorequest.systemInstruction. That meanscountTokenswill undercount whenever a system prompt is present, so it no longer matches the actual generate request shape. - Major: the new stable
session_idbehavior is internally inconsistent. The implementation/test expect the same prompt to reuse the samesession_id, butscripts/test_gemini_cli_official.pystill generates a fresh UUID every time, which matches the current local behavior instead. The new Go test also fails locally for this reason.
Test plan:
- Ran
go test ./internal/runtime/executor -run 'TestGeminiCLI'and it failed atTestGeminiCLIExecute_UsesStableSessionIDForSamePrompt. - Ran
go test ./internal/runtime/executor -run 'TestGeminiCLI(Execute_SendsOfficialHeadersAndBody|Execute_RetriesUsingConfiguredRequestRetry|ExecuteStream_UsesSSEPathWithoutLocalRetry|CountTokens_UsesOfficialBodyAndRetryConfig)$'and those passed.
No description provided.