feat: per-key daily rate limits with whitelist tier #34
Open
wms2537 wants to merge 4 commits into
Adds tier-based daily quotas to /api/v1/chat/completions, every
/api/v1/tools/* route, and /mcp tools/call. Counters live in KV
(rl:{subject}:{category}:{YYYY-MM-DD}, 48h TTL); admin Bearer auth
bypasses the limits; session auth counts per user under the free tier.
Tiers (code-defined, tunable without migration):
free -> 30 chat / 100 tool per day
pro -> 300 chat / 1000 tool per day
unlimited -> 10K / 10K (effectively uncapped)
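A code-defined tier table along these lines would match the list above. This is a minimal sketch, not the contents of `rateLimit.ts`: the names `TIER_LIMITS`, `DailyLimits`, `isTier`, and `limitsFor` are hypothetical, and the fallback to `free` for unknown tier strings is an assumption.

```typescript
// Sketch only: all names here are illustrative, not taken from the PR's code.
type Tier = "free" | "pro" | "unlimited";

interface DailyLimits {
  chat: number; // chat completions allowed per day
  tool: number; // tool calls allowed per day
}

// Numbers copied from the tier list in the commit message.
const TIER_LIMITS: Record<Tier, DailyLimits> = {
  free: { chat: 30, tool: 100 },
  pro: { chat: 300, tool: 1000 },
  unlimited: { chat: 10000, tool: 10000 },
};

function isTier(value: string): value is Tier {
  return value === "free" || value === "pro" || value === "unlimited";
}

// Assumption: an unrecognized tier string is treated as "free".
function limitsFor(tier: string): DailyLimits {
  return TIER_LIMITS[isTier(tier) ? tier : "free"];
}
```

Because the database would only store the tier name, changing a number in a table like this needs a deploy but no migration.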
Schema: 0005_rate_limits.sql adds api_keys.rate_limit_tier (default 'free').
Admin UX: new "Rate Limits" tab on /dashboard/admin lists all keys with
24h call counts and a tier dropdown. Backed by GET/PATCH /api/admin/rate-limits
(X-Admin-Secret). User-facing: /dashboard/keys shows a tier badge per key.
Headers on every response: X-RateLimit-Limit, -Remaining, -Reset, -Tier.
429 also carries Retry-After. Chat returns OpenAI-shape
{ error: { type: "rate_limit_exceeded" }}; tool routes return JSON 429;
MCP returns JSON-RPC error code -32002.
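The header set and two of the 429 bodies described above could be built roughly as follows. This is a hedged sketch: the helper names are hypothetical, the `message` fields are assumptions (the commit message only shows `type` and the error code), and treating `X-RateLimit-Reset` as seconds remaining is also an assumption.

```typescript
// Sketch only: helper names and the `message` fields are assumptions.
interface LimitInfo {
  limit: number;
  remaining: number;
  resetSeconds: number; // assumed unit for X-RateLimit-Reset
  tier: string;
}

// Headers sent on every response; Retry-After is added only on a 429.
function rateLimitHeaders(info: LimitInfo, denied: boolean): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(info.limit),
    "X-RateLimit-Remaining": String(info.remaining),
    "X-RateLimit-Reset": String(info.resetSeconds),
    "X-RateLimit-Tier": info.tier,
  };
  if (denied) {
    headers["Retry-After"] = String(info.resetSeconds);
  }
  return headers;
}

// OpenAI-shape error body for the chat endpoint.
function chatLimitBody(message: string) {
  return { error: { type: "rate_limit_exceeded", message } };
}

// JSON-RPC 2.0 error envelope for MCP tools/call.
function mcpLimitBody(id: string | number | null, message: string) {
  return { jsonrpc: "2.0", id, error: { code: -32002, message } };
}
```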
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…total)
Both windows must allow a request; whichever is exhausted first triggers 429.
Per-minute catches abuse bursts; per-day caps total cost.
free -> 100/min, 1000/day
pro -> 500/min, 10K/day
unlimited -> 10K/min, 1M/day (effectively uncapped)
KV keys:
rl:{subject}:{category}:m:{YYYY-MM-DDTHH:MM} TTL 120s
rl:{subject}:{category}:d:{YYYY-MM-DD} TTL 48h
Headers expose both windows (X-RateLimit-Limit-Minute / -Day) plus the
canonical bottleneck triplet for clients that only check the standard
names. Retry-After uses the denying window. Error messages name the
window that denied.
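The two-window check could be sketched like this, assuming the counters have already been read from KV. The function and type names are hypothetical, and the tie-break when both windows are exhausted (report the day window) is an assumption the commit message does not settle.

```typescript
// Sketch only: names and the both-windows-exhausted tie-break are assumptions.
type WindowName = "minute" | "day";

interface WindowState {
  limit: number;        // max requests allowed in this window
  used: number;         // requests counted before the current one
  resetSeconds: number; // seconds until this window rolls over
}

interface Decision {
  allowed: boolean;
  window: WindowName;        // denying window, or the tighter one when allowed
  remaining: number;         // remaining in that window after this request
  retryAfter: number | null; // seconds to wait, only set when denied
}

// Builds both counter keys from the documented layout (UTC timestamps).
function windowKeys(subject: string, category: string, now: Date) {
  const iso = now.toISOString(); // toISOString always renders UTC
  return {
    minute: `rl:${subject}:${category}:m:${iso.slice(0, 16)}`,
    day: `rl:${subject}:${category}:d:${iso.slice(0, 10)}`,
  };
}

function decide(minute: WindowState, day: WindowState): Decision {
  const minuteFull = minute.used >= minute.limit;
  const dayFull = day.used >= day.limit;
  if (minuteFull || dayFull) {
    // Assumption: if both are exhausted, the day window is reported.
    const denying = dayFull ? day : minute;
    return {
      allowed: false,
      window: dayFull ? "day" : "minute",
      remaining: 0,
      retryAfter: denying.resetSeconds,
    };
  }
  const remMinute = minute.limit - minute.used - 1;
  const remDay = day.limit - day.used - 1;
  return {
    allowed: true,
    window: remMinute <= remDay ? "minute" : "day",
    remaining: Math.min(remMinute, remDay),
    retryAfter: null,
  };
}
```

With the free tier's 100/min and 1000/day, a client at 100 requests in the current minute is denied by the minute window even though most of its daily budget remains.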
Tier names and DB schema unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Endpoint reports current chat + tool counters (minute and day windows) plus 24h activity summary from usage_logs. Does not increment any counter, so clients can poll it freely to plan around the limits. For admin Bearer auth, returns tier='unlimited' with empty counters since admin requests bypass enforcement entirely. Session auth gets per-window counters but no recent summary (no key_id to filter usage_logs on).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GET /api/keys/usage returns counters and 24h activity for every active key the session user owns; /dashboard/keys polls it every 15s and renders a two-row bar widget per key (chat + tool, minute + day windows) with 24h call count and success rate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds enforced per-key daily rate limits across all paid endpoints (chat, REST tools, MCP), with a 3-tier system (`free`/`pro`/`unlimited`) configurable per key from the admin dashboard. Counters reset at UTC midnight.
Closes the gap where the README claimed "100 calls/day" but no code enforced it.
Tiers
`free` (default), `pro`, `unlimited`
Numbers tuned against `openai/gpt-oss-120b` pricing. Tiers live in code (`apps/web/src/lib/rateLimit.ts`) so they can be tuned without a migration; the schema only stores the tier name.
Storage
KV counters at `rl:{subject}:{category}:{YYYY-MM-DD}` with 48h TTL. Subject is `key:{keyId}` for API-key auth and `user:{userId}` for session auth (playground). Admin Bearer auth bypasses enforcement entirely.
KV is eventually consistent across regions, so a small overshoot under bursty traffic is acceptable for a per-day quota.
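As an illustration of the daily key layout and the UTC-midnight reset, a sketch might look like the following. The helper names are hypothetical (they are not the PR's `subjectFor()`), and the category strings `chat` and `tool` are assumed from the tier descriptions.

```typescript
// Sketch only: helper names and category strings are assumptions.
type Subject =
  | { kind: "key"; keyId: string }    // API-key auth
  | { kind: "user"; userId: string }; // session auth (playground)

type Category = "chat" | "tool";

function subjectString(s: Subject): string {
  return s.kind === "key" ? `key:${s.keyId}` : `user:${s.userId}`;
}

// rl:{subject}:{category}:{YYYY-MM-DD}, with the date taken in UTC.
function dailyCounterKey(s: Subject, category: Category, now: Date): string {
  const day = now.toISOString().slice(0, 10);
  return `rl:${subjectString(s)}:${category}:${day}`;
}

// Seconds until the next UTC midnight, when the daily counter rolls over.
function secondsUntilUtcMidnight(now: Date): number {
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1);
  return Math.ceil((next - now.getTime()) / 1000);
}
```

Because the date is part of the key, a new day starts a fresh counter without any explicit reset; the 48h TTL only cleans up old keys.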
Enforcement points
- `POST /api/v1/chat/completions`: 429 returns OpenAI-shape `{error:{type:"rate_limit_exceeded"}}`
- `POST /api/v1/tools/*` routes: 429 returns `{error, type, limit, used, resetSeconds, tier}`
- `POST /mcp` `tools/call`: 429 returns JSON-RPC error code `-32002`
Every response (200 and 429) carries `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`, `X-RateLimit-Tier`. 429s additionally carry `Retry-After: <seconds>`.
What's new
- `apps/web/migrations/0005_rate_limits.sql`: `ALTER TABLE api_keys ADD COLUMN rate_limit_tier TEXT NOT NULL DEFAULT 'free'`
- `apps/web/src/lib/rateLimit.ts`: `enforceRateLimit()`, `checkToolRateLimit()`, `subjectFor()`, tier table
- `validateRequest.ts` returns `rateLimitTier` for API-key auth; MCP `validateApiKey()` likewise
- `apps/web/src/app/api/admin/rate-limits/route.ts`: `GET` (list keys + 24h call counts), `PATCH` (bump tier). Gated by `X-Admin-Secret`
- `/dashboard/admin` adds a "Rate Limits" tab with a searchable/tier-filterable table and per-key dropdown. Reuses the existing `AUTH_SECRET` flow
- `/dashboard/keys` shows a tier badge on each key (read-only)
- `docs/api/chat-completions.md` rate-limits section, `README.md`, `CLAUDE.md`
Deployment status
- `0005_rate_limits.sql` already applied to remote D1 `arbbuilder` (`20f85e74-ea76-4e91-98ef-0ecea741b079`)
- `arb_key`: `X-RateLimit-Limit: 30, Remaining: 29, Tier: free`
- `X-RateLimit-Limit: 100, Remaining: 99, Tier: free`
Test plan
- `cd apps/web && npm test` → 20/20 pass
- `npx tsc --noEmit` clean
- `/api/v1/tools/*` → 200, headers present, counter increments
- `pro` from `/dashboard/admin` (Rate Limits tab) → next call shows `X-RateLimit-Limit: 300`
- `/dashboard/keys` shows a tier badge
🤖 Generated with Claude Code