feat: per-key daily rate limits with whitelist tier by wms2537 · Pull Request #34 · Quantum3-Labs/ARBuilder

wms2537 · 2026-05-01T16:38:27Z

Summary

Adds enforced per-key daily rate limits across all paid endpoints (chat, REST tools, MCP), with a 3-tier system (free / pro / unlimited) configurable per-key from the admin dashboard. Counters reset at UTC midnight.

Closes the gap where the README claimed "100 calls/day" but no code enforced it.

Tiers

| Tier | Chat / day | Tool / day | Worst-case spend per key |
| --- | --- | --- | --- |
| `free` (default) | 30 | 100 | ~$1.50 |
| `pro` | 300 | 1000 | ~$15 |
| `unlimited` | 10000 | 10000 | effectively uncapped |

Numbers tuned against openai/gpt-oss-120b pricing. Tiers live in code (apps/web/src/lib/rateLimit.ts) so they can be tuned without a migration; the schema only stores the tier name.
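
As a rough illustration of why the schema only needs to store the tier name, a code-level tier table could look like the sketch below. The names `TIERS`, `TierLimits`, and `limitsFor` are hypothetical and not taken from `rateLimit.ts`; the numbers are the ones listed under Tiers. The fallback to `free` for unknown names is an assumption.

```typescript
// Hypothetical sketch; the real shape in rateLimit.ts may differ.
type TierName = "free" | "pro" | "unlimited";

interface TierLimits {
  chatPerDay: number;
  toolPerDay: number;
}

const TIERS: Record<TierName, TierLimits> = {
  free: { chatPerDay: 30, toolPerDay: 100 },
  pro: { chatPerDay: 300, toolPerDay: 1000 },
  unlimited: { chatPerDay: 10000, toolPerDay: 10000 },
};

// Resolve the tier name stored on the key row to concrete limits.
// Unknown names fall back to "free" (an assumption, not confirmed by the PR).
function limitsFor(tier: string): TierLimits {
  const known = Object.prototype.hasOwnProperty.call(TIERS, tier);
  return known ? TIERS[tier as TierName] : TIERS.free;
}
```

Changing a number in such a table only requires a deploy, since `api_keys.rate_limit_tier` holds nothing but the name.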

Storage

KV counters at rl:{subject}:{category}:{YYYY-MM-DD} with 48h TTL. Subject is key:{keyId} for API-key auth and user:{userId} for session auth (playground). Admin Bearer auth bypasses rate limiting entirely.
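
The key layout can be sketched as a pair of small helpers. This is a minimal sketch, not the code in `rateLimit.ts`: `subjectOf`, `counterKey`, the `Auth` type, and the category literals `"chat"` and `"tool"` are assumptions made for illustration.

```typescript
// Hypothetical auth shape; the real validateRequest.ts result may differ.
type Auth =
  | { kind: "apiKey"; keyId: string }
  | { kind: "session"; userId: string };

type Category = "chat" | "tool"; // assumed category names

// 48h TTL, so a day's counter outlives the day it belongs to.
const COUNTER_TTL_SECONDS = 48 * 60 * 60;

// key:{keyId} for API-key auth, user:{userId} for session auth.
function subjectOf(auth: Auth): string {
  return auth.kind === "apiKey" ? `key:${auth.keyId}` : `user:${auth.userId}`;
}

// rl:{subject}:{category}:{YYYY-MM-DD}, with the date taken in UTC.
function counterKey(subject: string, category: Category, now: Date): string {
  const day = now.toISOString().slice(0, 10); // toISOString() is always UTC
  return `rl:${subject}:${category}:${day}`;
}
```

Because the date is part of the key, a new UTC day starts from a fresh key and no explicit reset job is needed; the TTL only cleans up old keys.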

KV is eventually consistent across regions, so concurrent requests can read a stale counter and slightly exceed the limit under bursty traffic. That small overshoot is acceptable for a per-day quota.
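
To make the overshoot concrete, here is a toy model of two requests that both read the counter before either writes it back. `FakeKV` and `simulateRace` are hypothetical; a real KV binding is asynchronous, and this sketch only models the read-then-write gap, not replication.

```typescript
// Minimal in-memory stand-in for a KV namespace. Hypothetical and synchronous
// for brevity; it is not the Workers KV API.
class FakeKV {
  private store: Record<string, number> = {};
  get(key: string): number {
    return this.store[key] ?? 0;
  }
  put(key: string, value: number): void {
    this.store[key] = value;
  }
}

// Two requests that both read the counter before either writes it back.
// Returns how many of the two were admitted.
function simulateRace(kv: FakeKV, key: string, limit: number): number {
  const seenByA = kv.get(key);
  const seenByB = kv.get(key); // same stale value that A saw
  let admitted = 0;
  if (seenByA < limit) {
    kv.put(key, seenByA + 1);
    admitted++;
  }
  if (seenByB < limit) {
    kv.put(key, seenByB + 1); // overwrites A's write with the same value
    admitted++;
  }
  return admitted;
}
```

With a limit of 30 and a stored count of 29, both requests are admitted and the counter ends at 30, so 31 calls were served against a limit of 30. The error is bounded by the number of overlapping requests, which is why it matters little for a daily quota.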

Enforcement points

- POST /api/v1/chat/completions: 429 returns OpenAI-shape {error:{type:"rate_limit_exceeded"}}
- All 18 POST /api/v1/tools/* routes: 429 returns {error, type, limit, used, resetSeconds, tier}
- POST /mcp tools/call: 429 returns JSON-RPC error code -32002
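
The three denial shapes can be sketched side by side. Only the field names, the `rate_limit_exceeded` type on the chat route, and the `-32002` code come from this PR; the function names, the message wording, and the value of `type` on tool routes are assumptions.

```typescript
// Fields named in the PR description for a denied request.
interface Denial {
  limit: number;
  used: number;
  resetSeconds: number;
  tier: string;
}

// Message wording is an assumption made for this sketch.
function denialMessage(d: Denial): string {
  return `Daily limit of ${d.limit} reached for tier ${d.tier}; resets in ${d.resetSeconds}s`;
}

// Chat: OpenAI-shape error envelope.
function chatDeniedBody(d: Denial) {
  return { error: { message: denialMessage(d), type: "rate_limit_exceeded" } };
}

// Tool routes: flat JSON body. The value of `type` is assumed to match chat.
function toolDeniedBody(d: Denial) {
  return { error: denialMessage(d), type: "rate_limit_exceeded", ...d };
}

// MCP: JSON-RPC 2.0 error response carrying code -32002.
function mcpDeniedBody(id: number | string | null, d: Denial) {
  return { jsonrpc: "2.0", id, error: { code: -32002, message: denialMessage(d) } };
}
```

Keeping one `Denial` value as the input to all three builders means the numbers reported to chat, tool, and MCP clients cannot drift apart.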

Every response (200 and 429) carries X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-RateLimit-Tier. 429s additionally carry Retry-After: <seconds>.
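
A sketch of how these headers could be derived from a counter follows. `secondsUntilUtcMidnight` and `rateLimitHeaders` are hypothetical names, and expressing `X-RateLimit-Reset` as seconds until reset (rather than an epoch timestamp) is an assumption; the PR text does not say which form is used.

```typescript
// Seconds from `now` until the next UTC midnight, when daily counters reset.
function secondsUntilUtcMidnight(now: Date): number {
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1, // Date.UTC normalizes day overflow into the next month
  );
  return Math.ceil((nextMidnight - now.getTime()) / 1000);
}

// Build the rate-limit headers for a response. `used` is the count after
// the current request has been recorded.
function rateLimitHeaders(
  limit: number,
  used: number,
  tier: string,
  now: Date,
  denied: boolean,
): Record<string, string> {
  const reset = secondsUntilUtcMidnight(now);
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, limit - used)),
    "X-RateLimit-Reset": String(reset), // assumed to be a delta in seconds
    "X-RateLimit-Tier": tier,
  };
  if (denied) {
    headers["Retry-After"] = String(reset); // only on 429 responses
  }
  return headers;
}
```

For a free-tier key after its first chat call, this yields `X-RateLimit-Limit: 30` and `X-RateLimit-Remaining: 29`, matching the values reported under Deployment status.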

What's new

- Migration: apps/web/migrations/0005_rate_limits.sql, which runs ALTER TABLE api_keys ADD COLUMN rate_limit_tier TEXT NOT NULL DEFAULT 'free'
- Library: apps/web/src/lib/rateLimit.ts, providing enforceRateLimit(), checkToolRateLimit(), subjectFor(), and the tier table
- Auth: validateRequest.ts returns rateLimitTier for API-key auth; MCP validateApiKey() likewise
- Admin API: apps/web/src/app/api/admin/rate-limits/route.ts, with GET (list keys + 24h call counts) and PATCH (change a key's tier). Gated by X-Admin-Secret
- Admin UI: /dashboard/admin adds a "Rate Limits" tab with a searchable/tier-filterable table and per-key dropdown. Reuses the existing AUTH_SECRET flow
- User UI: /dashboard/keys shows a tier badge on each key (read-only)
- Docs: docs/api/chat-completions.md rate-limits section, README.md, CLAUDE.md

Deployment status

- Migration 0005_rate_limits.sql already applied to remote D1 arbbuilder
- Worker deployed (version 20f85e74-ea76-4e91-98ef-0ecea741b079)
- Verified live with a real arb_ key:
  - Chat 200 → X-RateLimit-Limit: 30, Remaining: 29, Tier: free
  - Tool 200 → X-RateLimit-Limit: 100, Remaining: 99, Tier: free

Test plan

- cd apps/web && npm test → 20/20 pass
- npx tsc --noEmit → clean
- Hit chat endpoint with valid key → 200, headers present, counter increments per call
- Hit any /api/v1/tools/* → 200, headers present, counter increments
- Bump a key to pro from /dashboard/admin (Rate Limits tab) → next call shows X-RateLimit-Limit: 300
- Confirm /dashboard/keys shows a tier badge

🤖 Generated with Claude Code

Adds tier-based daily quotas to /api/v1/chat/completions, every /api/v1/tools/* route, and /mcp tools/call. Counters live in KV (rl:{subject}:{category}:{YYYY-MM-DD}, 48h TTL); admin Bearer auth bypasses; session auth counts per user under the free tier.

Tiers (code-defined, tunable without migration):
- free -> 30 chat / 100 tool per day
- pro -> 300 chat / 1000 tool per day
- unlimited -> 10K / 10K (effectively uncapped)

Schema: 0005_rate_limits.sql adds api_keys.rate_limit_tier (default 'free').

Admin UX: new "Rate Limits" tab on /dashboard/admin lists all keys with 24h call counts and a tier dropdown. Backed by GET/PATCH /api/admin/rate-limits (X-Admin-Secret).

User-facing: /dashboard/keys shows a tier badge per key.

Headers on every response: X-RateLimit-Limit, -Remaining, -Reset, -Tier. 429 also carries Retry-After. Chat returns OpenAI-shape { error: { type: "rate_limit_exceeded" }}; tool routes return JSON 429; MCP returns JSON-RPC error code -32002.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…total)

Both windows must allow a request; whichever is exhausted first triggers 429. Per-minute catches abuse bursts; per-day caps total cost.

- free -> 100/min, 1000/day
- pro -> 500/min, 10K/day
- unlimited -> 10K/min, 1M/day (effectively uncapped)

KV keys:
- rl:{subject}:{category}:m:{YYYY-MM-DDTHH:MM} (TTL 120s)
- rl:{subject}:{category}:d:{YYYY-MM-DD} (TTL 48h)

Headers expose both windows (X-RateLimit-Limit-Minute / -Day) plus the canonical bottleneck triplet for clients that only check the standard names. Retry-After uses the denying window. Error messages name the window that denied. Tier names and DB schema unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
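
The two-window check can be sketched as follows. This is a minimal sketch under stated assumptions: `windowKey`, `decide`, and the `WindowState` type are hypothetical names, and checking the day window first when both are exhausted is an assumption chosen so that Retry-After reflects the longer wait.

```typescript
type WindowName = "minute" | "day";

interface WindowState {
  name: WindowName;
  limit: number;
  used: number;
  retryAfterSeconds: number; // seconds until this window rolls over
}

type Decision =
  | { allowed: true }
  | { allowed: false; deniedBy: WindowName; retryAfterSeconds: number };

// KV key for one window, following the key layout quoted above.
function windowKey(subject: string, category: string, name: WindowName, now: Date): string {
  const iso = now.toISOString(); // always UTC
  return name === "minute"
    ? `rl:${subject}:${category}:m:${iso.slice(0, 16)}` // YYYY-MM-DDTHH:MM
    : `rl:${subject}:${category}:d:${iso.slice(0, 10)}`; // YYYY-MM-DD
}

// Both windows must have room. The day window is checked first so that, if
// both are exhausted, the longer wait is reported (an assumption).
function decide(minute: WindowState, day: WindowState): Decision {
  for (const w of [day, minute]) {
    if (w.used >= w.limit) {
      return { allowed: false, deniedBy: w.name, retryAfterSeconds: w.retryAfterSeconds };
    }
  }
  return { allowed: true };
}
```

Returning the name of the denying window alongside its retry time gives the caller what it needs for both the Retry-After header and an error message that names the window.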

Endpoint reports current chat + tool counters (minute and day windows) plus 24h activity summary from usage_logs. Does not increment any counter, so clients can poll it freely to plan around the limits. For admin Bearer auth, returns tier='unlimited' with empty counters since admin requests bypass enforcement entirely. Session auth gets per-window counters but no recent summary (no key_id to filter usage_logs on).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

GET /api/keys/usage returns counters and 24h activity for every active key the session user owns; /dashboard/keys polls it every 15s and renders a two-row bar widget per key (chat + tool, minute + day windows) with 24h call count and success rate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wms2537 and others added 4 commits May 2, 2026 00:37

wms2537 mentioned this pull request May 8, 2026: feat: per-key CORS origin allowlist for browser embedding #36 (open, 7 tasks)

wms2537 wants to merge 4 commits into main from feat/rate-limits
