v1.1.7: automated test suite, cross-platform hardening, internal doctor tooling by jamubc · Pull Request #81 · jamubc/gemini-mcp-tool

jamubc · 2026-06-01T05:05:35Z

Summary

Reliability patch plus the project's first automated test suite. Hardens cross-platform execution and adds a categorized node:test suite that gates CI. No runtime or default-config changes vs 1.1.6 — the only new knob is the opt-in GEMINI_CLI_PATH.

Closes / References

Closes #67 (Bug: changeMode+cache path passes empty string to parser when cache misses): a changeMode continuation with a missing or expired cache returned the generic "No edits found…" message because an empty string fell through to the parser. It now returns an explicit cache-miss / invalid-chunk-index message.
Closes #48 (Bug: Deprecated -p flag causes "Cannot use both positional prompt and --prompt flag" error): @file / changeMode prompts are now delivered on stdin (no -p), and simple prompts pass only -p (never a positional), so the conflict can't arise.
Closes #39 (1.1.4 is published on NPM but there's no changelog or way to figure out what changed): there is now a maintained CHANGELOG.md documenting 1.1.1 → 1.1.7.
References #64 (Keeps disconnecting after every few successful calls in Claude Code): candidate fix for the "keeps disconnecting" reports. An uncaught EPIPE when a child closes stdin early could drop the long-lived server. Hardened here; left open pending reporter confirmation.
Supersedes #77 (feat(windows): stdin prompt passing + windowsHide (harvested from #27)) and #27 (Main windows patch): their Windows fixes (stdin prompt passing + windowsHide) are harvested into this branch. Close those as superseded once this merges.
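
The #48 fix comes down to never sending the prompt through two channels at once. A minimal sketch of that routing rule follows. The `planPromptDelivery` name and the `@`-based check for file references are illustrative assumptions, not taken from the actual source; only the `-p` flag and the stdin/flag split come from the description above.

```typescript
// Hypothetical helper: decides how a prompt reaches the gemini CLI.
interface PromptDelivery {
  args: string[]; // argv entries passed to the CLI
  stdin?: string; // prompt text written to the child's stdin, if any
}

function planPromptDelivery(prompt: string, changeMode: boolean): PromptDelivery {
  // Assumed heuristic: an "@" at the start of a word marks a file reference.
  const hasFileRef = /(^|\s)@\S+/.test(prompt);

  if (changeMode || hasFileRef) {
    // No -p and no positional: the prompt travels on stdin only.
    return { args: [], stdin: prompt };
  }

  // Simple prompts use exactly one channel, the -p flag, never a positional.
  return { args: ["-p", prompt] };
}
```

Because each branch returns a single channel, a positional prompt and `-p` cannot be combined by construction.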

Testing infrastructure

A test/ tree segmented into four categories by how much of the real world they touch:

| Category | Folder | Touches real gemini? | Runs in CI? | Command |
| --- | --- | --- | --- | --- |
| unit | `test/unit/` | No | Yes (gates merges) | `npm run test:unit` |
| integration | `test/integration/` | No | Yes (gates merges) | `npm run test:integration` |
| e2e | `test/e2e/` | Yes (live, opt-in) | No | `npm run test:e2e` |
| judge | `test/judge/` | Yes + LLM judge | No | `npm run doctor:judge` |

57 hermetic unit + integration tests gate CI on Node 18/20/22.
E2E suite drives the real Gemini CLI through the built MCP server over stdio with the MCP SDK client — the automated replacement for manual mcpjam testing. Auto-skips when gemini isn't on PATH.
LLM-as-a-Judge (test/judge/, WIP) semantically evaluates tool output against rubrics via DeepSeek/OpenRouter. Opt-in, key-gated, never in CI.
Self-contained test env: .env / .env.example live in test/; shared config via test/envParser.ts.

Developer tooling

scripts/run-tests.mjs — category-aware test runner.
scripts/doctor.mjs — internal preflight diagnostics + e2e/judge runner (npm run doctor, doctor test, doctor:judge). Unpublished (excluded from files/bin).
tsconfig.test.json — type-checks both src/ and test/.
test/README.md — documents the categories, commands, the hermetic boundary, and how to add tests.
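
To make the category split concrete, here is a small sketch of how a category-aware runner could decide what to execute. The folder names mirror the table under "Testing infrastructure"; `selectCategories` and its default of running only the hermetic suites are assumptions for illustration, not a description of `scripts/run-tests.mjs`.

```typescript
type Category = "unit" | "integration" | "e2e" | "judge";

interface CategoryInfo {
  folder: string;
  hermetic: boolean; // true when the suite never touches the real gemini CLI
}

const CATEGORIES: Record<Category, CategoryInfo> = {
  unit: { folder: "test/unit/", hermetic: true },
  integration: { folder: "test/integration/", hermetic: true },
  e2e: { folder: "test/e2e/", hermetic: false },
  judge: { folder: "test/judge/", hermetic: false },
};

// Hypothetical selection rule: no argument means "everything that is safe in CI".
function selectCategories(requested?: string): Category[] {
  const all = Object.keys(CATEGORIES) as Category[];
  if (requested === undefined) {
    return all.filter((name) => CATEGORIES[name].hermetic);
  }
  if (!all.includes(requested as Category)) {
    throw new Error(`Unknown test category: ${requested}`);
  }
  return [requested as Category];
}
```

Keeping the hermetic flag in one table means the CI default and the opt-in suites cannot drift apart silently.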

Windows & cross-platform hardening

Deliver changeMode / @file prompts on stdin instead of -p, avoiding cmd.exe argument parsing and the command-line length limit.
Resolve the gemini .cmd shim via where, honour GEMINI_CLI_PATH; never select an unlaunchable .ps1/extensionless shim.
Add windowsHide to suppress the popup console window.
Guard against uncaught EPIPE when a child closes stdin early.
Fix the Help tool: --help (was -help, which yargs split into -h -e -l -p).
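
The stdin delivery, `windowsHide`, and EPIPE items above can be combined in one small sketch. This is an illustration under stated assumptions rather than the code in `commandExecutor.ts`: `runWithStdin` is a hypothetical name, and the sketch spawns without a shell, whereas the PR notes elsewhere that the real executor uses `shell: true` on Windows.

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper: run a command, feed `input` on stdin, collect stdout.
function runWithStdin(command: string, args: string[], input: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, {
      stdio: ["pipe", "pipe", "inherit"],
      windowsHide: true, // suppress the popup console window on Windows
    });

    let stdout = "";
    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => {
      stdout += chunk;
    });

    // Without this listener, an EPIPE raised when the child closes stdin
    // early is emitted as an unhandled "error" event and can crash the host.
    child.stdin.on("error", (err: NodeJS.ErrnoException) => {
      if (err.code !== "EPIPE") reject(err);
    });

    child.on("error", reject); // e.g. ENOENT when the command is missing
    child.on("close", (code) => {
      if (code === 0) resolve(stdout);
      else reject(new Error(`${command} exited with code ${code}`));
    });

    child.stdin.end(input); // the prompt never appears on the command line
  });
}
```

As a usage example, `runWithStdin(process.execPath, ["-e", "process.stdin.pipe(process.stdout)"], "hello")` echoes the input back through a throwaway Node child process.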

Internal / non-breaking

Logger mutes routine [GMCPT] output under NODE_ENV=test (errors still print; production unchanged).
Export buildBrainstormPrompt / getMethodologyInstructions for unit tests.
Expose CLI flags + GEMINI_CLI_PATH in constants.
CI: drop Node 16 (no node:test), add Node 22, add a type-check step, remove continue-on-error.
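
The logger change in this list is small enough to sketch. The version below assumes a hypothetical `createLogger` factory that takes the environment and an output sink as parameters so it can be exercised without touching real streams; the `[GMCPT]` prefix and the `NODE_ENV=test` condition are the only details taken from the PR.

```typescript
type Sink = (line: string) => void;

// Hypothetical factory: routine output is muted under NODE_ENV=test,
// while errors always reach the sink.
function createLogger(env: Record<string, string | undefined>, sink: Sink) {
  const muted = env.NODE_ENV === "test";
  return {
    log(message: string): void {
      if (!muted) sink(`[GMCPT] ${message}`);
    },
    error(message: string): void {
      sink(`[GMCPT] ${message}`);
    },
  };
}
```

Any other `NODE_ENV` value, including an unset one, leaves routine output enabled, which matches the "production unchanged" note.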

Verification

npm test → 57/57 pass (exit 0)
npm run lint → clean (src + tests)
npm run build → clean
E2E verified locally against the gemini CLI with real LLM round-trips (npm run doctor test).

gemini-code-assist

Code Review

This pull request updates the tool to version 1.1.7, focusing on cross-platform reliability and introducing a comprehensive automated test suite (unit, integration, e2e, and LLM-as-a-judge). Key changes include hardening Windows execution by passing prompts via stdin, improving executable resolution, adding clearer ENOENT guidance, and introducing a diagnostic doctor script. The review feedback highlights a few areas for improvement: first, the doctor script should mirror the robust Windows executable resolution logic to avoid execution failures when checking the Gemini version; second, the test configuration parser should resolve the .env file path relative to the module's directory using import.meta.url rather than relying on process.cwd(), which improves the portability of the test suite.
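
The second review point, resolving `.env` relative to the module rather than `process.cwd()`, can be sketched as follows. `envFileFor` is a hypothetical name; the module URL is passed in as a parameter so the function stays testable, and an ES module would call it with `import.meta.url`.

```typescript
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";

// Hypothetical helper: locate the .env file that sits next to a module,
// regardless of the directory the process was started from.
function envFileFor(moduleUrl: string): string {
  const moduleDir = dirname(fileURLToPath(moduleUrl));
  return join(moduleDir, ".env");
}

// Inside an ES module: const envPath = envFileFor(import.meta.url);
```

The result depends only on where the module lives on disk, so running the suite from the repository root or from `test/` finds the same file.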

Copilot

Pull request overview

This PR bumps the project to v1.1.7 and focuses on reliability and cross-platform hardening (notably Windows execution), while introducing the project’s first automated node:test suite (unit/integration CI-gating, plus opt-in e2e/judge suites) and internal “doctor” diagnostics tooling.

Changes:

Add categorized unit + integration tests (CI gating) plus opt-in e2e and LLM judge suites, with a category-aware test runner and tsconfig.test.json.
Harden Gemini CLI execution cross-platform (stdin prompt passing for changeMode/@file, Windows shim resolution, windowsHide, EPIPE handling, improved ENOENT guidance, --help fix).
Add internal doctor tooling and update CI + package scripts for new validation flow.
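
For context on the `--help` fix listed above: parsers that support POSIX-style short-option bundling read a single-dash token as one flag per letter. The sketch below imitates that behaviour with a hypothetical `expandToken` helper. It is not yargs' implementation, only an illustration of why `-help` and `--help` are parsed differently.

```typescript
// Illustrative only: expands a bundled short-option token into one flag per letter.
function expandToken(token: string): string[] {
  const isLong = token.startsWith("--");
  const isShort = token.startsWith("-") && !isLong;
  if (!isShort || token.length <= 2) {
    return [token]; // long options, positionals, and lone short flags pass through
  }
  return token
    .slice(1)
    .split("")
    .map((letter) => `-${letter}`);
}

expandToken("-help");  // ["-h", "-e", "-l", "-p"]
expandToken("--help"); // ["--help"]
```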

Reviewed changes

Copilot reviewed 30 out of 31 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tsconfig.test.json | Adds a dedicated TS config to type-check `src/` + `test/` with `noEmit`. |
| test/unit/utils/geminiExecutor.test.ts | Unit coverage for `@file` reference safety guard. |
| test/unit/utils/commandExecutor.test.ts | Unit coverage for Windows quoting, command resolution, and ENOENT guidance. |
| test/unit/utils/chunkCache.test.ts | Unit coverage for chunk cache key shape, TTL expiry, eviction, and stats. |
| test/unit/utils/changeModeTranslator.test.ts | Unit coverage for changeMode response formatting and summaries. |
| test/unit/utils/changeModeParser.test.ts | Unit coverage for parsing/validating OLD/NEW blocks. |
| test/unit/utils/changeModeChunker.test.ts | Unit coverage for chunking behavior under budget constraints. |
| test/unit/tools/registry.test.ts | Validates registry tool schema and prompt formatting helpers. |
| test/unit/tools/brainstorm.test.ts | Unit coverage for brainstorm prompt construction + methodology selection. |
| test/README.md | Documents test categories/commands and hermetic boundaries. |
| test/judge/judge.test.ts | Adds opt-in semantic “LLM-as-a-Judge” suite for tool outputs. |
| test/integration/tool-contract.test.ts | Hermetic integration tests for registry→tool validation and non-spawn paths. |
| test/integration/changeMode-pipeline.test.ts | Hermetic integration test wiring the full changeMode parse→chunk→cache→fetch flow. |
| test/envParser.ts | Shared configuration loader for e2e/judge suites via `test/.env` and env vars. |
| test/e2e/server.e2e.test.ts | E2E protocol/system tool checks that don’t require Gemini. |
| test/e2e/harness.ts | E2E harness for spawning built server and producing rich diagnostics. |
| test/e2e/fixtures/sentinel.txt | Fixture for validating `@file` inlining in live Gemini runs. |
| test/e2e/ask-gemini.e2e.test.ts | Opt-in live Gemini CLI E2E coverage (auto-skip when Gemini missing). |
| test/.env.example | Example env for enabling judge runs (API keys + optional settings). |
| src/utils/logger.ts | Mutes routine logs under `NODE_ENV=test` to keep test output readable. |
| src/utils/geminiExecutor.ts | Sends `changeMode`/`@file` prompts via stdin; integrates stdin path into executor; changeMode pipeline tweaks. |
| src/utils/commandExecutor.ts | Windows shim resolution, safer quoting, stdin support, `windowsHide`, EPIPE guard, ENOENT guidance. |
| src/tools/simple-tools.ts | Uses constants for command names/flags; fixes Help to use `--help`. |
| src/tools/brainstorm.tool.ts | Exports prompt builders for unit testing. |
| src/tools/ask-gemini.tool.ts | Minor formatting changes; continuation validation remains in place. |
| src/constants.ts | Adds `ENV.GEMINI_CLI_PATH` and corrects `--help` flag constant. |
| scripts/run-tests.mjs | Category-aware test runner for unit/integration/e2e/judge suites. |
| scripts/doctor.mjs | Adds internal diagnostics runner and helpers for running e2e/judge suites. |
| package.json | Updates scripts to run real tests, adds doctor commands, updates lint to include tests, bumps version to 1.1.7. |
| CHANGELOG.md | Documents v1.1.7 changes and cross-platform hardening. |
| .github/workflows/ci.yml | Updates CI matrix (Node 18/20/22) and gates merges on build + type-check + hermetic tests. |

Copilot

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 6 comments.

…or tooling

Introduce categorized node:test coverage (unit / integration / e2e / judge), cross-platform reliability fixes, and internal developer tooling.

### Testing

- Add 56 hermetic unit + integration tests gating CI (Node 18/20/22)
- Add e2e suite driving the real Gemini CLI through the built MCP server (auto-skips when gemini is not on PATH)
- Add LLM-as-a-Judge semantic evaluation suite (test/judge/) using DeepSeek or OpenRouter against validation rubrics (WIP)
- Add scripts/run-tests.mjs (category-aware runner) and scripts/doctor.mjs (preflight / e2e / judge helper)
- Add test README, tsconfig.test.json, test/.env.example, and e2e harness with diagnostic logging (spawned CWD, tool args, raw responses)

### Windows & cross-platform hardening

- Deliver changeMode and @file prompts on stdin instead of -p flag, avoiding cmd.exe argument parsing and command-line length limits
- Resolve the gemini .cmd shim via `where`, honour GEMINI_CLI_PATH
- Add windowsHide to suppress popup console windows
- Guard against uncaught EPIPE when a child closes stdin early
- Fix --help flag (was -help, parsed by yargs as -h -e -l -p)

### Internal

- Logger mutes routine output under NODE_ENV=test
- Expose CLI flags and GEMINI_CLI_PATH in constants
- Update CHANGELOG for 1.1.7

Copilot

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 5 comments.

- scripts/doctor.mjs:47: version probing now runs the resolved primary path, handles Windows paths with spaces, and reports a bad GEMINI_CLI_PATH clearly.
- src/utils/geminiExecutor.ts:196: changeMode continuation with an empty raw result now returns an explicit cache miss or invalid chunk index.
- test/integration/tool-contract.test.ts:42: updated the missing-cache assertion.
- test/envParser.ts:24: cleaned import placement, trailing comma, and precedence docs.
- test/.env.example:2: fixed doctor:judge command and DeepSeek default model comment.

Copilot

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 6 comments.

…emini detection, doc accuracy

- commandExecutor: extract selectWindowsGeminiCandidate() (pure, unit-tested); drop the candidates[0] fallback so a lone .ps1/extensionless shim is never selected (it isn't launchable via shell:true); falls back to gemini.cmd.
- e2e harness: hasGemini() now reuses resolveCommandForExecution + spawnSync with Windows-safe quoting, so the live-suite skip matches the server's real resolution (fixes false auto-skip on Windows .cmd shims).
- geminiExecutor: correct the prompt-quoting comment (executeCommand uses shell:true on Windows, not shell:false).
- run-tests.mjs + test/README.md: document the judge category; Node >= 18.19 (the --import tsx / --test-concurrency floor).

jamubc · 2026-06-01T07:58:13Z

Review feedback addressed

All automated review findings (gemini-code-assist + Copilot) are resolved as of 6f05b95. Mapping for the record:

Fixed earlier in this branch (547cfd8):

doctor.mjs — version probe now runs the resolved primary/override path (mirrors commandExecutor resolution), and reports a bad GEMINI_CLI_PATH explicitly. (gemini-code-assist high-priority)
geminiExecutor — changeMode continuation with an empty/expired cache now returns an explicit cache-miss / invalid-chunk-index message; removed the stray leading + from the "no edits" message.
tool-contract.test.ts — asserts the new cache-miss message.
envParser.ts — .env path resolved via import.meta.url (cwd-independent); added the missing trailing comma; precedence doc matches the implementation (system env wins).
test/README.md — "four categories"; relative link instead of an absolute file:// path.
test/.env.example — uses npm run doctor:judge.
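
The cache-miss behaviour described for geminiExecutor can be sketched as a pure lookup. Everything named here is hypothetical (`fetchChunk`, the `ChunkLookup` type, the 1-based index, and the exact message wording); the point is only that a missing cache entry and an out-of-range index each get their own explicit message instead of an empty string reaching the parser.

```typescript
type ChunkLookup = (cacheKey: string) => string[] | undefined;

// Hypothetical continuation handler: never returns an empty string.
function fetchChunk(lookup: ChunkLookup, cacheKey: string, chunkIndex: number): string {
  const chunks = lookup(cacheKey);
  if (chunks === undefined) {
    return `Cache miss: no chunks found for key "${cacheKey}" (it may have expired). Re-run the original request.`;
  }
  const inRange = Number.isInteger(chunkIndex) && chunkIndex >= 1 && chunkIndex <= chunks.length;
  if (!inRange) {
    return `Invalid chunk index ${chunkIndex}: expected a value from 1 to ${chunks.length}.`;
  }
  return chunks[chunkIndex - 1]; // assumed 1-based indexing
}
```

A regression test in the spirit of the updated `tool-contract.test.ts` would assert on the cache-miss message rather than on the generic "no edits" text.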

Fixed in 6f05b95 (this push):

commandExecutor — extracted selectWindowsGeminiCandidate() (now unit-tested) and dropped the candidates[0] fallback, so a lone .ps1/extensionless shim is never selected; falls back to gemini.cmd.
e2e/harness.ts — hasGemini() reuses resolveCommandForExecution + spawnSync with Windows-safe quoting, so the live suite no longer false-skips on a Windows .cmd shim.
geminiExecutor — corrected the prompt-quoting comment (executeCommand uses shell:true on Windows, not shell:false).
run-tests.mjs + README — document the judge category; Node ≥ 18.19 (the --import tsx / --test-concurrency floor).
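
The candidate-selection rule from the commandExecutor item can be written as a pure function, which is what makes it unit-testable. The sketch below is an assumption-laden illustration, not the real `selectWindowsGeminiCandidate()`: the helper names, the set of launchable extensions, and the choice to let `GEMINI_CLI_PATH` win over the `where` output are all hypothetical.

```typescript
// Assumed set of extensions that can be launched through a Windows shell.
const LAUNCHABLE_EXTENSIONS = [".cmd", ".exe", ".bat"];

// Hypothetical: pick the first launchable path from `where gemini` output.
function pickWindowsCandidate(whereOutput: string): string {
  const candidates = whereOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const launchable = candidates.find((candidate) =>
    LAUNCHABLE_EXTENSIONS.some((ext) => candidate.toLowerCase().endsWith(ext)),
  );

  // No candidates[0] fallback: a lone .ps1 or extensionless shim is skipped.
  return launchable ?? "gemini.cmd";
}

// Hypothetical: an explicit GEMINI_CLI_PATH takes priority over discovery.
function resolveGemini(env: Record<string, string | undefined>, whereOutput: string): string {
  const override = env.GEMINI_CLI_PATH?.trim();
  return override ? override : pickWindowsCandidate(whereOutput);
}
```

Because both functions take plain strings, the Windows behaviour can be tested on any platform without spawning `where`.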

Intentional (not a defect):

DeepSeek judge default model deepseek-v4-flash is correct and confirmed working; test/.env.example and judge.test.ts are aligned on it.

Resolving these threads.

jamubc · 2026-06-01T08:11:24Z

Thanks to contributors

1.1.7 builds on reports and review from the community:

@samuelgudi (Bug: Deprecated -p flag causes "Cannot use both positional prompt and --prompt flag" error #48): pinpointed the deprecated -p "cannot use both a positional prompt and --prompt" failure with a clear root cause and suggested fix. This shaped routing @file and changeMode prompts onto stdin.
@toller892 (Bug: changeMode+cache path passes empty string to parser when cache misses #67): caught the changeMode cache miss returning a misleading "No edits found" message during code review. Now fixed with an explicit cache-miss response and a regression test.
@jacobcxdev (Keeps disconnecting after every few successful calls in Claude Code #64): reported the repeated disconnects in Claude Code, which motivated the stdin EPIPE and spawn-error hardening in this release. This is a candidate fix; confirmation on your setup is welcome.
@henricook (1.1.4 is published on NPM but there's no changelog or way to figure out what changed #39): flagged the missing changelog. There is now a maintained CHANGELOG.md covering 1.1.1 through 1.1.7.

Copilot AI review requested due to automatic review settings June 1, 2026 05:05

Copilot started reviewing on behalf of jamubc June 1, 2026 05:05

gemini-code-assist Bot reviewed Jun 1, 2026

Comment thread scripts/doctor.mjs

Comment thread test/envParser.ts

Comment thread test/envParser.ts Outdated

Copilot AI reviewed Jun 1, 2026

Comment thread test/envParser.ts Outdated

Comment thread test/envParser.ts Outdated

Comment thread test/README.md Outdated

Comment thread test/README.md Outdated

Comment thread scripts/doctor.mjs Outdated

Comment thread scripts/doctor.mjs Outdated

Comment thread src/utils/geminiExecutor.ts

jamubc force-pushed the release/1.1.7 branch 2 times, most recently from 9b76827 to 24a3c12 June 1, 2026 05:20

jamubc requested a review from Copilot June 1, 2026 05:21

Copilot started reviewing on behalf of jamubc June 1, 2026 05:21

jamubc force-pushed the release/1.1.7 branch from 24a3c12 to 880a15a June 1, 2026 05:25

Copilot AI reviewed Jun 1, 2026

Comment thread src/utils/geminiExecutor.ts

Comment thread test/README.md Outdated

Comment thread test/.env.example Outdated

Comment thread test/.env.example

Comment thread test/envParser.ts Outdated

Comment thread scripts/doctor.mjs

jamubc force-pushed the release/1.1.7 branch from 880a15a to 1d6bf32 June 1, 2026 05:29

jamubc requested a review from Copilot June 1, 2026 05:31

Copilot started reviewing on behalf of jamubc June 1, 2026 05:31

Copilot AI reviewed Jun 1, 2026

Comment thread test/e2e/harness.ts Outdated

Comment thread src/utils/geminiExecutor.ts

Comment thread test/integration/tool-contract.test.ts Outdated

Comment thread test/.env.example

Comment thread test/README.md Outdated

jamubc force-pushed the release/1.1.7 branch from 2fc3cf2 to 547cfd8 June 1, 2026 05:42

jamubc requested a review from Copilot June 1, 2026 05:42

Copilot started reviewing on behalf of jamubc June 1, 2026 05:43

Copilot AI reviewed Jun 1, 2026

Comment thread src/utils/commandExecutor.ts Outdated

Comment thread src/utils/geminiExecutor.ts Outdated

Comment thread scripts/run-tests.mjs

Comment thread src/utils/commandExecutor.ts Outdated

Comment thread src/utils/geminiExecutor.ts Outdated

Comment thread scripts/run-tests.mjs

jamubc merged commit c3d33ee into main Jun 1, 2026
4 checks passed
