
fix(http): bridge ChatGPT MCP connector + Claude confidential-client OAuth #746

Open
panda850819 wants to merge 3 commits into garrytan:master from panda850819:fix/chatgpt-mcp-bridge

Conversation


@panda850819 panda850819 commented May 8, 2026

ChatGPT's Custom MCP Connector and Claude (client_secret_post) OAuth both fail end-to-end against gbrain serve --http --enable-dcr today. Twelve targeted patches in serve-http.ts + oauth-provider.ts close the gaps.

Hit while wiring a self-hosted gbrain MCP server (v0.30.0) into ChatGPT's Custom Connector and verifying Claude.ai web + Claude Code stayed working. Each patch was triaged from real edge traces + Postgres oauth_clients/oauth_codes/oauth_tokens introspection rather than guessing from client-side error messages — those messages are systematically misleading (one root cause cycled through "doesn't support DCR" → "DCR endpoint 404" → "doesn't implement OAuth" → "invalid_mcp_response 405").

Verified end-to-end: ChatGPT lists search + fetch and successfully calls search against the brain; Claude.ai web + Claude Code DCR + token exchange + tools/list all green.

Happy to split this into three PRs if you'd prefer (oauth security / chatgpt compat / search-fetch shim) — flagged the natural split below.

What's actually landing

1. OAuth correctness (every confidential or PKCE-only DCR client)

/token pre-hash middleware. SDK clientAuth.js:45 strict-compares the client.client_secret returned by clientsStore.getClient() against the request body's client_secret. gbrain's oauth_clients.client_secret_hash column stores sha256 hex; the client holds the plaintext returned at DCR time. The literal string compare therefore always fails for client_secret_post clients (Claude.ai web, Claude Code) → every authorization_code / refresh_token exchange gets 400 invalid_client: "Invalid client_secret". SHA-256 the request's plaintext before SDK clientAuth runs so the comparison is hash-vs-hash. Skipped for the client_credentials grant — gbrain's own handler hashes the secret itself and would double-hash.
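The pre-hash step above can be sketched as a small helper (a hedged sketch: `prehashClientSecret` and the body shape are illustrative, not gbrain's actual serve-http.ts code):

```typescript
import { createHash } from "node:crypto";

const sha256Hex = (s: string): string =>
  createHash("sha256").update(s).digest("hex");

// Runs before the SDK's clientAuth middleware: replace the plaintext
// client_secret in the request body with its sha256 hex so the SDK's
// strict compare sees hash-vs-hash. client_credentials is skipped because
// gbrain's own handler hashes the secret itself and would double-hash.
function prehashClientSecret(body: Record<string, string>): void {
  if (body.grant_type === "client_credentials") return;
  if (typeof body.client_secret === "string") {
    body.client_secret = sha256Hex(body.client_secret);
  }
}
```

In Express terms this would sit as `app.post('/token', (req, _res, next) => { prehashClientSecret(req.body); next(); })` ahead of the SDK router.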

getClient() strips client_secret for none clients. Same SDK clientAuth path: it demands a secret whenever client.client_secret is truthy, regardless of token_endpoint_auth_method. PKCE-only public clients (ChatGPT, mcporter, Hermes Agent, codex-cli) registered with none therefore get rejected with "Client secret is required". Hide the stored hash so SDK falls through to the PKCE-only path. (Durable fix: registerClient should not generate or store a secret for none clients in the first place — left for a follow-up.)
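The strip can be sketched like this (`ClientInfo` is a stand-in shape, not the SDK's real type):

```typescript
interface ClientInfo {
  client_id: string;
  client_secret?: string;
  token_endpoint_auth_method?: string;
}

// Hide the stored secret hash for token_endpoint_auth_method=none clients
// so SDK clientAuth falls through to the PKCE-only path instead of
// demanding a secret that public clients never held.
function stripSecretForPublicClients(
  client: ClientInfo | undefined,
): ClientInfo | undefined {
  if (client?.token_endpoint_auth_method === "none" && "client_secret" in client) {
    const { client_secret: _omit, ...rest } = client;
    return rest;
  }
  return client;
}
```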

2. ChatGPT MCP connector compatibility

Catch-all .well-known rewrite middleware. ChatGPT exhaustively probes nine metadata URL variants before considering discovery complete:

oauth-authorization-server  × { root, /.well-known/<doc>/mcp, /mcp/.well-known/<doc> }
openid-configuration        × { same three forms }
oauth-protected-resource    × { same three forms }

Any 404 makes ChatGPT abort and surface a misleading "DCR endpoint 404". Single regex rewrites every variant onto the SDK's canonical paths so the same metadata body answers every probe — beats enumerated alias whack-a-mole.
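The nine-variant collapse can be sketched as a single rewrite (assuming the SDK's canonical paths are the root `/.well-known/<doc>` forms):

```typescript
// Matches all nine probe variants: optional /mcp prefix or suffix around
// any of the three metadata document names.
const WELL_KNOWN =
  /^(?:\/mcp)?\/\.well-known\/(oauth-authorization-server|openid-configuration|oauth-protected-resource)(?:\/mcp)?\/?$/;

// Collapse every probe variant onto the SDK's canonical metadata path;
// all other URLs pass through untouched.
function rewriteWellKnown(url: string): string {
  const m = WELL_KNOWN.exec(url);
  return m ? `/.well-known/${m[1]}` : url;
}
```

Mounted as middleware, this would mutate `req.url` before the SDK router sees it, so one metadata body answers every probe.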

resourceServerUrl: '/mcp' so PRM resource matches the URL users enter. Both confirmed-working open-source ChatGPT MCP connector references (tae0y/real-estate-mcp + Auth0, GetLarge fastify-mcp + Ory Hydra) publish PRM resource as the /mcp URL. With this set, SDK serves PRM body resource: "https://<host>/mcp".

UA-conditional OIDC stub fields. OpenAI's Apps SDK auth doc says ChatGPT accepts OAuth 2.0 metadata or OIDC metadata. Empirically, ChatGPT silently aborts DCR if the AS metadata document lacks subject_types_supported, id_token_signing_alg_values_supported, userinfo_endpoint, jwks_uri — both confirmed-working references are full OIDC providers, not coincidence. Inject these fields via res.json patch, gated on User-Agent matching /aiohttp|openai-mcp/i so non-ChatGPT clients keep clean OAuth 2.1 metadata. SDK's metadata document is a shared singleton, so we clone-before-mutate (otherwise one ChatGPT request would leak OIDC fields into every subsequent client's response).
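The clone-before-mutate merge might look like this (field values are stubs; only the four field names come from the observed ChatGPT requirement):

```typescript
const CHATGPT_UA = /aiohttp|openai-mcp/i;

// `metadata` is the SDK's shared singleton — never mutate it in place.
// `issuer` is assumed to carry its canonical trailing slash.
function withOidcStubs(
  metadata: Record<string, unknown>,
  userAgent: string,
  issuer: string,
): Record<string, unknown> {
  if (!CHATGPT_UA.test(userAgent)) return metadata;
  return {
    ...metadata, // shallow clone keeps the singleton pristine
    subject_types_supported: ["public"],
    id_token_signing_alg_values_supported: ["RS256"],
    userinfo_endpoint: `${issuer}userinfo`,
    jwks_uri: `${issuer}.well-known/jwks.json`,
  };
}
```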

/userinfo and /.well-known/jwks.json stub routes back the OIDC pointers without changing token semantics. Userinfo returns a soft 200 with { sub: "anonymous" } (401 reads as auth failure to ChatGPT and aborts the token exchange); jwks returns { keys: [] } since gbrain uses opaque tokens, not JWTs.
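Expressed as plain handlers (shapes assumed; the real routes are Express handlers in serve-http.ts):

```typescript
// Soft 200: a 401 here reads as an auth failure to ChatGPT and aborts
// the token exchange mid-flight.
function userinfoStub() {
  return { status: 200, body: { sub: "anonymous" } };
}

// gbrain issues opaque tokens, not JWTs, so an empty key set is honest.
function jwksStub() {
  return { status: 200, body: { keys: [] as unknown[] } };
}
```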

WWW-Authenticate on /mcp 401 carries resource_metadata= per RFC 9728 / MCP 2025-06-18 authorization spec.

Slash-collapse middleware strips leading // from req.url. gbrain publishes issuer with a trailing slash (URL canonical form), so naive issuer + \"/register\" concat in clients produces //register → Express 404.
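The collapse itself is a one-liner (hedged sketch of the middleware's core):

```typescript
// "//register" → "/register"; single-slash paths pass through unchanged.
function collapseLeadingSlashes(url: string): string {
  return url.replace(/^\/{2,}/, "/");
}
```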

GET /mcp returns 200 + idle SSE stream (15s heartbeat) instead of 405. MCP 2025-06-18 §StreamableHTTP permits 405 when no SSE is offered, but ChatGPT's openai-mcp/1.0.0 treats it as invalid_mcp_response fatal error. Bearer-gated so unauth probes still get the spec'd 401 challenge.

DELETE /mcp returns 405 + Allow header instead of Express's default 404.

3. Opt-in ChatGPT search/fetch shim

ChatGPT Connector mode only displays tools named exactly search and fetch with specific input schemas; everything else is silently filtered client-side. With gbrain's 30+ ops surfaced raw, ChatGPT shows zero tools.

agentName.startsWith('ChatGPT') triggers a two-tool mode:

  • tools/list returns only search + fetch with inputSchema matching OpenAI's connector spec.
  • tools/call rewrites fetch → get_page and projects results via toChatgptShape():
    • search → { results: [{ id, title, text, url }] }
    • fetch → { id, title, text, url, metadata }
  • Returns both structuredContent (machine-read) and content[].text (legacy JSON string) for max compat.

Other MCP clients see the full op surface unchanged.
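The search-side projection can be sketched as (`RawHit` is a stand-in for gbrain's internal result shape; only the output keys follow OpenAI's connector spec):

```typescript
interface RawHit {
  id: string;
  title: string;
  snippet: string;
  link: string;
}

// search branch of toChatgptShape(): project internal hits onto the
// { results: [{ id, title, text, url }] } shape ChatGPT expects.
function toChatgptSearchShape(hits: RawHit[]) {
  return {
    results: hits.map((h) => ({
      id: h.id,
      title: h.title,
      text: h.snippet,
      url: h.link,
    })),
  };
}
```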

Diagnostics

appendFileSync edge-trace logger writes every inbound request to ~/.gbrain/logs/edge-trace.log synchronously, bypassing bun's block-buffered stdout under launchd StandardOutPath. Without this, gbrain serve --http's log file lags real activity by minutes-to-hours and live debug is blind. Cheap (one fs call per request, no formatting). Happy to gate behind --debug-edge-trace if you'd prefer it not be always-on.

Codex review follow-ups (not in this PR)

External review of the diff flagged 5 items I'd address in follow-up PRs once the foundation lands:

  1. DCR registerClient should not generate or store a client_secret_hash for token_endpoint_auth_method='none' clients (durable fix for the PKCE secret-leak workaround).
  2. /mcp should enforce the RFC 8707 resource indicator as token audience (currently stored, not validated).
  3. Search result IDs should encode source_id:slug for cross-source dedup safety.
  4. ChatGPT shim trigger should not rely on client_name.startsWith('ChatGPT') (DCR client_name is client-controlled). Better gated by --enable-chatgpt-compat flag or explicit /mcp/chatgpt route.
  5. Tool definitions should include outputSchema.

Verification

End-to-end against a self-hosted production deploy:

  • DCR + OAuth flow. Fresh ChatGPT custom connector add → POST /register (201) → GET /authorize (302) → POST /token (200) → oauth_tokens row issued with both access + refresh.
  • MCP transport. GET /mcp opens SSE stream with bearer; POST /mcp JSON-RPC dispatches.
  • Tool surface. ChatGPT UI displays search + fetch; search("<query>") returns { results: [...] } populated from gbrain.
  • Claude.ai web + Claude Code. Both regained DCR + token exchange after the /token pre-hash + getClient strip patches (had been silently 401'ing on every confidential-client client_secret_post exchange).


…tive_date backfill

Two upstream races blocking 0.28.x → 0.30.0 upgrade on existing Postgres brains:

1. v0_29_1.ts orchestrator Phase B/C call createEngine but skip engine.connect(),
   so the first executeRaw throws "No database connection: connect() has not
   been called". Added explicit connect() in both phases.

2. backfill-effective-date.ts wraps per-batch UPDATEs with executeRaw('BEGIN')
   + executeRaw('SET LOCAL statement_timeout=...') + executeRaw('COMMIT').
   postgres.js routes each executeRaw to a separate pool connection, so the
   BEGIN never wraps anything and postgres.js refuses with UNSAFE_TRANSACTION.
   Removed the BEGIN/COMMIT (per-row UPDATEs are short enough that the
   statement_timeout protection wasn't load-bearing at typical brain scale).

Local fork patch — separate from the schema.sql ADD COLUMN race that was
worked around manually with a one-shot bun script.
…1.1.1-fixwave

# Conflicts:
#	src/commands/migrations/v0_29_1.ts
#	src/commands/serve-http.ts
#	src/core/backfill-effective-date.ts
