fix: add missing items field to two array schemas (Gemini Pro strict JSON Schema compat) #910

Open
DmitryBMsk wants to merge 2 commits into garrytan:master from DmitryBMsk:fix/array-schema-missing-items

DmitryBMsk commented May 12, 2026

Problem

Two array properties in src/core/ are declared with type: 'array' but no items field, which strict JSON Schema validators reject. Most notably, Gemini Pro's tool-call schema validator throws:

LLM request rejected: Invalid schema for function 'gbrain__extract_facts':
In context=('properties', 'entity_hints'), array schema missing items.

Repro

  1. Run gbrain serve as a stdio MCP child of any host that uses Gemini Pro as its chat model (OpenClaw 2026.5.4 with mcp.servers.<name>.command config is one example; ChatGPT Tool calls and other strict-schema LLM gateways behave the same).
  2. The daemon enumerates tools and forwards schemas to the LLM.
  3. The LLM API rejects the request before any tool can be called, so the whole gbrain MCP surface (60 tools) is unavailable for that session.

End-to-end repro on my live OpenClaw deployment: bundle-mcp logs show failed to start server "gbrain": McpError: MCP error -32000: Connection closed, and the Telegram bot returns the Invalid schema for function 'gbrain__extract_facts' message verbatim. After the patch is applied locally and the stdio child is restarted, the bot enumerates all 60 gbrain__* tools and successfully calls e.g. gbrain__get_stats returning real numbers from Supabase.

Fix

Add the missing items field to both schemas. Behavior unchanged — these are JSON Schema metadata, not runtime contracts.

src/core/operations.ts: extract_facts.params.entity_hints

The description already says "canonical entity slugs" and the handler casts the value via p.entity_hints as string[]. Schema now matches:

-    entity_hints: { type: 'array', description: 'Existing canonical entity slugs ...' },
+    entity_hints: { type: 'array', items: { type: 'string' }, description: 'Existing canonical entity slugs ...' },

src/core/resolvers/builtin/x-api/handle-to-tweet.ts: outputSchema.candidates

The TypeScript interface XTweetCandidate already defines the per-element shape; the JSON Schema now spells the same fields out for any consumer that runs a JSON Schema validator on the resolver output:

-      candidates: { type: 'array' },
+      candidates: {
+        type: 'array',
+        items: {
+          type: 'object',
+          properties: {
+            tweet_id: { type: 'string' },
+            text: { type: 'string' },
+            created_at: { type: 'string', format: 'date-time' },
+            score: { type: 'number' },
+            url: { type: 'string', format: 'uri' },
+          },
+          required: ['tweet_id', 'text', 'created_at', 'score', 'url'],
+        },
+      },
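
For readers without the source open, the correspondence between the element type and the new items schema can be sketched roughly as below. The interface body is an assumption reconstructed from the schema fields in the diff (the actual XTweetCandidate declaration is not shown in this PR and may differ), and `matchesItemSchema` is a hypothetical helper, not part of gbrain. It checks only presence and primitive type of each required key. The `format` hints (`date-time`, `uri`) are not checked.

```typescript
// Assumed shape, inferred from the schema fields in the diff above.
// The real XTweetCandidate interface may differ (e.g. optional fields).
interface XTweetCandidateSketch {
  tweet_id: string;
  text: string;
  created_at: string; // expected to be an ISO 8601 date-time string
  score: number;
  url: string;
}

// The per-element schema from the diff, minus the `format` hints.
const candidateItemSchema = {
  type: 'object',
  properties: {
    tweet_id: { type: 'string' },
    text: { type: 'string' },
    created_at: { type: 'string' },
    score: { type: 'number' },
    url: { type: 'string' },
  },
  required: ['tweet_id', 'text', 'created_at', 'score', 'url'],
} as const;

// Hypothetical check: every required key is present and has the declared
// primitive type. This is not a full JSON Schema validator.
function matchesItemSchema(value: unknown): value is XTweetCandidateSketch {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return candidateItemSchema.required.every(
    (key) => typeof record[key] === candidateItemSchema.properties[key].type,
  );
}

// Illustrative values only, not real resolver output.
const sample = {
  tweet_id: '1',
  text: 'hello',
  created_at: '2026-05-12T00:00:00Z',
  score: 0.9,
  url: 'https://example.com/status/1',
};
console.log(matchesItemSchema(sample));
```

Keeping the schema as a `const` literal next to the interface makes drift between the two easier to spot in review, though it does not enforce agreement at compile time.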

Why not a typedef-driven generator?

Could be a future improvement (run the same TS interface through a to-json-schema step) — out of scope for this fix. Both schemas now match the TypeScript types that already exist alongside them.

Test plan

  • Reproduced the gbrain__extract_facts rejection on a live OpenClaw 2026.5.4 + Gemini Pro deployment (private OCI setup).
  • Applied the patch, restarted the OpenClaw stdio child, confirmed bundle-mcp no longer errors and all 60 gbrain__* tools are surfaced into the LLM tool inventory.
  • Confirmed gbrain__get_stats and gbrain__search round-trip successfully end-to-end (host → MCP → Supabase) after the patch.
  • Repo-level test suite — I trust the existing CI to catch regressions; both diffs are pure JSON Schema metadata, no code path changes.

Notes

  • I grepped for other type: 'array' declarations without items in src/; only these two showed up in master at the time of writing (commit 17b190e).
  • Other validators may be more lenient: Claude Sonnet 4.6 accepts the unfixed schema, which is likely why this slipped through. Gemini Pro's strict validator (Google) catches it.

… Schema compat)

Two array properties were declared with type: 'array' but no items field, which
Gemini Pro's strict JSON Schema validator rejects:

  Invalid schema for function 'gbrain__extract_facts': In context=('properties',
  'entity_hints'), array schema missing items.

Effect on real deployments: when OpenClaw 2026.5.4 (or any host that uses Gemini
Pro as the chat model) registers gbrain via stdio MCP, the daemon spawns
'gbrain serve', enumerates tools, and forwards their JSON Schemas to the LLM
API. The schema is rejected on every request, blocking the entire gbrain__*
tool surface (60 tools) for that session.

Fixes:
- src/core/operations.ts: entity_hints (extract_facts input) gets items: { type: 'string' }
  — matches the existing description ('canonical entity slugs') and runtime
  cast 'p.entity_hints as string[]'.
- src/core/resolvers/builtin/x-api/handle-to-tweet.ts: candidates output gets
  items matching XTweetCandidate interface (tweet_id, text, created_at, score, url).

No runtime behavior change. Schema metadata only.
…chemas

Adds invariant tests that walk every operation inputSchema and every builtin
resolver inputSchema/outputSchema, collecting paths where { type: 'array' }
lacks an items field. The arrays.length === 0 assertion is the regression
guard — it would have caught both schemas fixed in the previous commit, and
will catch any future drift on the same class of bug.

- test/mcp-tool-defs.test.ts: walks buildToolDefs(operations). Catches input
  schema arrays missing items (e.g. extract_facts.entity_hints).
- test/resolvers.test.ts: walks xHandleToTweetResolver and urlReachableResolver
  schemas. Catches output schema arrays missing items (e.g. handle-to-tweet
  outputSchema.candidates), which buildToolDefs doesn't cover.

Pure unit tests, no network/db required. Local run: 6 pass mcp-tool-defs, 55
pass resolvers.
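
The invariant described in this commit message can be sketched as a small recursive walker. This is a minimal illustration, not the code in test/mcp-tool-defs.test.ts or test/resolvers.test.ts: `findArraysMissingItems` and the `$`-rooted path format are hypothetical, and the walker descends only through `properties` and `items` (keywords such as `anyOf` or `additionalProperties` are ignored).

```typescript
// Hypothetical helper, not taken from the PR's test files.
type JsonSchema = { [key: string]: unknown };

function isSchemaObject(value: unknown): value is JsonSchema {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Recursively collect paths of array schemas that lack an `items` field.
function findArraysMissingItems(schema: unknown, path: string = '$'): string[] {
  if (!isSchemaObject(schema)) return [];
  const missing: string[] = [];

  if (schema['type'] === 'array' && schema['items'] === undefined) {
    missing.push(path);
  }

  // Descend into object properties.
  const properties = schema['properties'];
  if (isSchemaObject(properties)) {
    for (const [name, child] of Object.entries(properties)) {
      missing.push(...findArraysMissingItems(child, `${path}.properties.${name}`));
    }
  }

  // Descend into element schemas (nested arrays, arrays of objects).
  missing.push(...findArraysMissingItems(schema['items'], `${path}.items`));

  return missing;
}

// Shapes modeled on the entity_hints diff, before and after the fix.
const beforeFix = {
  type: 'object',
  properties: { entity_hints: { type: 'array', description: '...' } },
};
const afterFix = {
  type: 'object',
  properties: { entity_hints: { type: 'array', items: { type: 'string' } } },
};

console.log(findArraysMissingItems(beforeFix)); // [ '$.properties.entity_hints' ]
console.log(findArraysMissingItems(afterFix)); // []
```

A regression guard in this style asserts that the returned list is empty for every schema, so a failure message can name the exact offending path.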