fix(serve-http): production hardening for PaaS proxies + DCR rate limiter + Supabase pool docs #759
Open
knee5 wants to merge 3 commits into garrytan:master
`trust proxy: 'loopback'` only trusts 127.0.0.1/::1, so PaaS proxies (Fly.io, Render, Heroku, Railway) whose internal hops are on private RFC-1918 addresses are never trusted. Every external client then appears to share the same IP (the PaaS internal hop's address), which makes ccRateLimiter, adminAuthRateLimiter, and registerRateLimiter (added in the next commit) share a single rate-limit bucket across all clients, effectively bypassing them.

Default changes to `1` (trust exactly one upstream hop), the canonical Fly.io / standard PaaS pattern. Self-hosted operators who sit behind no proxy should set GBRAIN_TRUST_PROXY=false; multi-hop CDN setups can set 2 or 3.

Ref: https://expressjs.com/en/guide/behind-proxies.html

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When --enable-dcr is active, the SDK's mcpAuthRouter mounts /register with no rate limiter. Combined with the open CORS policy on /register, any internet client can flood it. Each registration hashes a secret, INSERTs into oauth_clients, and writes to the database. At typical Supabase transaction-pooler sizes a sustained flood exhausts the pool and causes 503s for all other requests.

Adds a 10-req/min/IP limiter scoped to /register, inserted before app.use(authRouter) so it fires regardless of SDK routing internals. The limiter is a no-op when --enable-dcr is not passed (SDK never mounts /register).

Depends on the trust-proxy fix in the previous commit: without correct client IPs, all registrations share one rate-limit bucket and the limiter is trivially bypassed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ments

v0.30.1's dual-pool routing tries to connect to the Postgres direct host (only reachable inside Supabase's VPC). External deployments (Fly.io, Render, Railway, VPS) time out or fail at startup with no clear error. The fix (GBRAIN_DISABLE_DIRECT_POOL=1 + GBRAIN_POOL_SIZE=1 with the transaction pooler URL on port 6543) is undocumented and counterintuitive. Every Supabase user deploying externally hits this.

Adds a "Supabase Deployment Caveat" section to docs/mcp/DEPLOY.md with:
- required env vars and why
- correct connection string (port 6543, not 5432)
- minimal Fly.io fly.toml env block showing GBRAIN_TRUST_PROXY=1 alongside

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
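To make the settings described in this commit message concrete, here is a minimal TypeScript sketch of a pre-flight check. It is not code from the PR: `PoolSettings` and `checkSupabasePoolSettings` are hypothetical names, the connection string is passed in explicitly because the commit message does not name the variable that holds it, and pooler hostnames are assumed to end in `.pooler.supabase.com`.

```typescript
// Hypothetical pre-flight check; names and rules are illustrative assumptions.
interface PoolSettings {
  databaseUrl: string;        // connection string the server will use
  disableDirectPool?: string; // value of GBRAIN_DISABLE_DIRECT_POOL, if set
  poolSize?: string;          // value of GBRAIN_POOL_SIZE, if set
}

function checkSupabasePoolSettings(settings: PoolSettings): string[] {
  const warnings: string[] = [];
  const url = new URL(settings.databaseUrl);

  // Assumption: only pooler hosts need the workaround from the commit message.
  if (!url.hostname.endsWith(".pooler.supabase.com")) {
    return warnings;
  }
  if (url.port !== "6543") {
    warnings.push("Connection string should use the transaction pooler port 6543.");
  }
  if (settings.disableDirectPool !== "1") {
    warnings.push("GBRAIN_DISABLE_DIRECT_POOL=1 is not set.");
  }
  if (settings.poolSize !== "1") {
    warnings.push("GBRAIN_POOL_SIZE=1 is not set.");
  }
  return warnings;
}

// Example with a placeholder host and credentials.
const poolWarnings = checkSupabasePoolSettings({
  databaseUrl: "postgresql://user:secret@example.pooler.supabase.com:5432/postgres",
});
for (const warning of poolWarnings) {
  console.warn(warning);
}
```

A check like this only reports problems; whether startup should also abort is a design decision the commit message does not address.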
Summary
Three security findings from a production deployment audit of v0.30.1:
1. `trust proxy: 'loopback'` makes all rate limiters trivially bypassable on PaaS deployments (Fly.io, Render, Heroku, Railway)
2. `/register` (DCR) endpoint has no rate limiter, enabling database pool exhaustion when `--enable-dcr` is on
3. `GBRAIN_DISABLE_DIRECT_POOL` is undocumented, causing silent startup failures for all Supabase users deploying externally

Each fix is a separate commit so they can be cherry-picked independently.
Issue 1: `trust proxy: 'loopback'` is wrong for PaaS deployments

Before: `app.set('trust proxy', 'loopback')`, which only trusts 127.0.0.1/::1.

Problem: PaaS proxies (Fly.io, Render, Heroku, Railway) route requests through private RFC-1918 hops, not loopback. With `'loopback'`, Express never trusts the `X-Forwarded-For` header these providers inject, so `req.ip` resolves to the PaaS internal hop's address. Every external client appears to originate from the same IP address.

Downstream effects: `ccRateLimiter` on `/token`, `adminAuthRateLimiter` on `/admin/auth/:token`, and the new `registerRateLimiter` on `/register` all share a single bucket per process. An attacker can send 50 token requests in 15 minutes from any IP; because the bucket covers all clients globally, the per-client limit is effectively bypassed.

After: Defaults to `1` (trust exactly one upstream hop, the canonical Fly.io pattern and what Express recommends for single-proxy PaaS). Configurable via the `GBRAIN_TRUST_PROXY` env var. Accepted values: `1` (default); `2`, `3`, ...; `false`; `true`.

Operators running behind a local Caddy/Tailscale reverse proxy on the same host should set `GBRAIN_TRUST_PROXY=false`.

Ref: Express docs, behind proxies (https://expressjs.com/en/guide/behind-proxies.html)
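The effect of the hop count can be illustrated without a PaaS deploy. The TypeScript sketch below is a simplified model, not the PR's code and not Express itself: `parseTrustProxy` and `resolveClientIp` are hypothetical helpers, the parsing rules for `GBRAIN_TRUST_PROXY` are assumed, and `resolveClientIp` only imitates how a hop-count or boolean trust setting selects an address from the socket peer plus `X-Forwarded-For`. A `'loopback'` setting behaves like `false` here whenever the peer is a private, non-loopback address.

```typescript
type TrustProxySetting = boolean | number;

// Hypothetical parser: maps the env var string to a trust-proxy value.
function parseTrustProxy(raw: string | undefined): TrustProxySetting {
  if (raw === undefined || raw.trim() === "") return 1; // default described in the PR
  const value = raw.trim().toLowerCase();
  if (value === "true") return true;
  if (value === "false") return false;
  const hops = Number(value);
  if (Number.isInteger(hops) && hops >= 0) return hops;
  throw new Error(`Unsupported GBRAIN_TRUST_PROXY value: ${raw}`);
}

// Simplified model of client IP selection for boolean / hop-count settings.
function resolveClientIp(
  socketAddr: string,
  xForwardedFor: string | undefined,
  trust: TrustProxySetting,
): string {
  const forwarded = (xForwardedFor ?? "")
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  // Nearest hop first: the socket peer, then X-Forwarded-For read right to left.
  const chain = [socketAddr, ...forwarded.reverse()];
  const trustedHops = trust === true ? chain.length : trust === false ? 0 : trust;
  // Skip the trusted hops; the first untrusted address is the client.
  const index = Math.min(trustedHops, chain.length - 1);
  return chain[index] ?? socketAddr;
}

// Placeholder addresses: an RFC-1918 proxy hop and two documentation-range clients.
const proxyHop = "172.16.0.5";
const untrusted = parseTrustProxy("false");
const oneHop = parseTrustProxy(undefined);

// With no trusted hop, both clients collapse to the proxy address.
console.log(resolveClientIp(proxyHop, "203.0.113.7", untrusted));
console.log(resolveClientIp(proxyHop, "198.51.100.9", untrusted));
// With one trusted hop, each client keeps its own address.
console.log(resolveClientIp(proxyHop, "203.0.113.7", oneHop));
console.log(resolveClientIp(proxyHop, "198.51.100.9", oneHop));
```

Reading the header right to left matters: with one trusted hop, a client-supplied leftmost `X-Forwarded-For` entry is ignored, whereas `true` would accept it.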
Issue 2: `/register` (DCR) has no rate limiter

Problem: When `--enable-dcr` is on (required for ChatGPT and Claude.ai Custom Connectors), the SDK's `mcpAuthRouter` mounts `/register` with no rate limiting. Combined with the open CORS policy already present on `/register`, any internet client can flood it. Each registration hashes a secret, INSERTs into `oauth_clients`, and writes to the database. At Supabase transaction-pooler sizes (5–15 connections typical), a sustained flood exhausts the pool and causes 503s for all other requests.

Fix: Adds `registerRateLimiter` (10 registrations/minute/IP) mounted on `/register` before `app.use(authRouter)`, so it fires regardless of SDK routing internals. No-op when `--enable-dcr` is not passed (SDK never mounts `/register`).

Explicit dependency on Issue 1: Without correct client IPs from a fixed trust-proxy setting, all registrations share one rate-limit bucket and the limiter is trivially bypassed. Both commits must be deployed together to be effective.
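The per-IP limit and its dependency on correct client IPs can be sketched with a small in-memory limiter. This is an illustrative fixed-window implementation in TypeScript, not the PR's `registerRateLimiter`, whose underlying library and algorithm are not shown in this description; `FixedWindowLimiter` is a hypothetical name and the addresses are placeholders.

```typescript
interface Bucket {
  count: number;
  windowStart: number;
}

// Minimal fixed-window limiter keyed by client IP. Illustrative only:
// it never evicts old buckets, so it is unsuitable for production as-is.
class FixedWindowLimiter {
  private readonly hits = new Map<string, Bucket>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should get HTTP 429.
  allow(key: string, now: number = Date.now()): boolean {
    const bucket = this.hits.get(key);
    if (bucket === undefined || now - bucket.windowStart >= this.windowMs) {
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    bucket.count += 1;
    return bucket.count <= this.limit;
  }
}

// 10 registrations per minute per IP, as in the PR description.
const perClient = new FixedWindowLimiter(10, 60_000);
let allowedCount = 0;
for (let i = 0; i < 12; i++) {
  if (perClient.allow("203.0.113.7", 0)) allowedCount += 1;
}
console.log(allowedCount); // 10: requests 11 and 12 are rejected

// If every client resolves to the proxy hop, all clients share one bucket.
const sharedBucket = new FixedWindowLimiter(10, 60_000);
for (let i = 0; i < 10; i++) sharedBucket.allow("172.16.0.5", 0);
const secondClientAllowed = sharedBucket.allow("172.16.0.5", 0);
console.log(secondClientAllowed); // false: a different client is already blocked
```

The second half shows why the two fixes belong together: the limiter's key is only as good as the client IP the server resolves.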
Issue 3: `GBRAIN_DISABLE_DIRECT_POOL` undocumented for Supabase users

v0.30.1's dual-pool routing tries to open a second connection to the direct Postgres host (`db.<project-ref>.supabase.co:5432`), which is only reachable inside Supabase's VPC. External deployments (Fly.io, Render, Railway, VPS) time out or receive connection-refused at startup with no clear error message. The workaround (`GBRAIN_DISABLE_DIRECT_POOL=1` + `GBRAIN_POOL_SIZE=1` + transaction pooler URL on port 6543) was absent from all documentation.

Adds a "Supabase Deployment Caveat" section to `docs/mcp/DEPLOY.md` with the required env vars, correct connection string, and a minimal `fly.toml` env block showing `GBRAIN_TRUST_PROXY=1` alongside.

A future enhancement could auto-detect pooler hosts (`*.pooler.supabase.com`) and emit a startup warning when `GBRAIN_DISABLE_DIRECT_POOL` isn't set. That's left as follow-on work; the doc fix alone closes the immediate gap.

Testing
Issue 1: Deploy behind a single PaaS proxy (Fly.io or Render). Before this patch, `req.ip` on `/token` returns the PaaS internal address for all clients. After, it returns real client IPs. Automated unit testing requires a mock proxy chain; none currently exists in the test suite, so integration verification requires an actual PaaS deploy.

Issue 2: With `--enable-dcr`, send >10 POST /register requests/minute from one IP. Before this patch, all succeed. After, requests 11+ receive HTTP 429. Depends on Issue 1 for correct IP resolution.

Issue 3: Documentation-only change. Verify by deploying against Supabase without `GBRAIN_DISABLE_DIRECT_POOL` (fails at startup), then with it set (succeeds).

Type-check: `bun x tsc --noEmit` passes with zero errors on both patched files.

Discovered during
Audit of v0.22 → v0.30 migration and production deployment on Fly.io (2026-05-08). The pool-exhaustion failure mode was observed in the field; the rate-limiter bypass was confirmed analytically.
Branch note: Pushed from `knee5/gbrain` (a GitHub fork of this repo, public). Commit author metadata reflects the fork owner. The branch contains only the three upstream patches, with no private content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>