Skip to content

fix(models, oauth): placeholder-key suppression, subscription badge, 401 hint, OAuth flow timeout#7

Merged
bjb2 merged 3 commits into
mainfrom
fix/api-key-vs-subscription-confusion
May 29, 2026
Merged

fix(models, oauth): placeholder-key suppression, subscription badge, 401 hint, OAuth flow timeout#7
bjb2 merged 3 commits into
mainfrom
fix/api-key-vs-subscription-confusion

Conversation

@bjb2
Copy link
Copy Markdown
Owner

@bjb2 bjb2 commented May 29, 2026

Closes #4 and #5.

Two related bug reports from the same demo session, both tied to provider/auth confusion. Bundled in one PR because the second fix (#5) was discovered while reviewing the surface area for #4.


#4gpt-5 routes through API key instead of ChatGPT Plus subscription

A viewer with a ChatGPT Plus subscription tried to send a prompt during a live demo. They picked gpt-5 in the model picker. OpenAI rejected the call with:

401 Incorrect API key provided: sk-your-****************here

sk-your-...here is a common .env.example placeholder. The viewer had it lying around in their env (typical tutorial leftover). The SDK saw a non-empty OPENAI_API_KEY, marked the openai-provider variant of gpt-5 as available, and that beat their actual openai-codex (ChatGPT Plus OAuth) subscription credential in the picker because nothing in the UI distinguished the two.

Root cause: SDK ships overlapping model namespaces. Same gpt-5 exists under openai (API key) AND openai-codex (subscription). hasConfiguredAuth only checks "env var non-empty"; UI didn't distinguish providers.

Fixes:

  • A. Placeholder-key suppression (apps/server/src/credential-quality.ts): looksLikePlaceholderKey matches sk-your-*, sk-XXXX, your-api-key, <your-key>, changeme, unsubstituted shell vars, all-X strings, and per-prefix-family length floors. Wired into modelInfoFromSdk. OAuth credentials in auth.db bypass the check.
  • B. Subscription badge (ModelPickerModal.tsx + ModelInfo.isSubscription): green subscription badge on rows for real consumer-subscription providers. Uses an explicit allowlist (anthropic, openai-codex, github-copilot, cursor, perplexity, coding-plan providers, etc.) — not the SDK's broader getOAuthProviders() which would have falsely badged Ollama, LM Studio, and gateway services.
  • C. 401-recovery hint: bridge listens for SDK notice events with auth-shaped error messages; when current model is API-key and a subscription provider carries the same model id with a real OAuth credential, fires a deck notificationService.notify telling the operator to switch.

#5 — OAuth flow gets stuck "already in progress" forever

User closed the Ollama OAuth modal without typing in an endpoint URL. Subsequent attempts to sign in to Ollama returned 409 {"error":"already-in-flight"} until the deck process was restarted.

Root cause: Ollama's OAuth flow uses onPrompt to ask for the local endpoint URL — not the loopback callback listener that the SDK's DEFAULT_TIMEOUT (5 min) covers. Closing the modal without responding leaves the SDK's login promise pending forever, the flow stays in the flows map, and every subsequent /start 409s.

The pre-fix cancel handler was also incomplete — only rejected manualCode, never iterated promptResolvers. So even hitting Cancel wouldn't have unstuck this case.

Fixes (in apps/server/src/routes-auth-oauth.ts):

  1. 5-min server-side timeout per flow. Force-cancels if not naturally completed. Cleared on natural completion.
  2. Stale-flow eviction on /start. If a duplicate start arrives and the held flow is past 5 min, evict it inline (defensive belt-and-suspenders if the timer somehow lost).
  3. abortFlow helper that drains promptResolvers too (resolves with empty string — rejecting would surface as uncaught inside the SDK). Used by all three teardown paths (cancel, timeout, stale eviction). Idempotent.

Verification

  • 12 new tests: 6 for looksLikePlaceholderKey (44 assertions), 6 for abortFlow cleanup (9 assertions)
  • Full server suite: 173 / 173 pass (was 161 before this PR)
  • Typecheck across all 4 packages: clean
  • Manual sanity: OPENAI_API_KEY=sk-your-XXXXhere correctly hides openai/gpt-5 from default picker view while leaving openai-codex/gpt-5 visible with the subscription badge

Files touched

apps/server/src/credential-quality.ts            (new — #4)
apps/server/src/credential-quality.test.ts       (new — #4)
apps/server/src/bridge/in-process.ts             (#4: placeholder + badge + 401 hint)
apps/server/src/routes-auth-oauth.ts             (#5: timeout + eviction + cleanup)
apps/server/src/routes-auth-oauth.test.ts        (new — #5)
apps/web/src/components/chat/ModelPickerModal.tsx (#4: subscription badge)
packages/protocol/src/index.ts                   (#4: +isSubscription)

Out of scope

  • SDK upstream change to dedupe overlapping model namespaces (would need a pi-ai change)
  • Auto-switching model on 401 (notification-only for now; mid-turn auto-switch risks surprise behavior)
  • Settings → Env input validation (validate-on-paste for placeholder values — separate UX work)
  • Per-flow timeout tuning UI (5 min hardcoded; if users hit this regularly we can expose a knob)

bjb2 added 3 commits May 29, 2026 15:45
…ers, hint on 401 (#4)

Closes the loop on the 'demoed deck rejected a viewer's prompt with
401' bug from issue #4.

Root cause: the SDK ships overlapping model namespaces — `openai`
(uses `OPENAI_API_KEY` against platform.openai.com) and `openai-codex`
(uses the ChatGPT Plus/Pro OAuth subscription). Both expose `gpt-5`,
`gpt-5.1`, `gpt-5.2`, etc. A common `.env.example` placeholder like
`OPENAI_API_KEY=sk-your-...here` in the viewer's environment was
enough for the SDK's `hasConfiguredAuth` to mark the openai-provider
variant as 'available' (it only checks 'env var non-empty'). The
picker then surfaced `openai/gpt-5` next to `openai-codex/gpt-5` with
nothing distinguishing them; the click routed through the API-key
path; OpenAI rejected the placeholder with 401.

Three deck-side fixes:

A. Suppress placeholder API keys when computing model availability.
New `apps/server/src/credential-quality.ts` with
`looksLikePlaceholderKey` — matches `sk-your-*`, `sk-XXXX`,
`your-api-key`, `<your-key>`, `changeme`, unsubstituted shell vars,
all-X strings, and length-too-short-to-be-real keys per provider prefix
family. Wired into `modelInfoFromSdk` in bridge/in-process.ts. OAuth
credentials in auth.db (`isUsingOAuth`) always bypass the check —
they're orthogonal to env values.

B. Badge subscription providers in the model picker. Added
`isSubscription?: boolean` to the ModelInfo protocol type, computed
server-side from the SDK's `getOAuthProviders()` list, rendered as a
green 'subscription' badge in ModelPickerModal. Users can now see at a
glance which `gpt-5` row uses their ChatGPT Plus subscription vs which
uses an API key.

C. 401-recovery hint. Bridge listens for SDK `notice` events with
`level: 'error'` and an auth-shaped message (401, 'incorrect api key',
'unauthorized', etc.). When the current session model is on an API-key
provider AND a subscription provider carries the same model id with a
real OAuth credential, fires a deck notification telling the operator
to switch.

Tests: 6 new credential-quality tests (44 assertions). Full server
suite still green (167/167). Typecheck across 4 packages clean.
Initial PR used the SDK's `getOAuthProviders()` to decide which models
get the 'subscription' badge. That list is broader than 'consumer
subscription'; it includes:

  - Local runtimes:    ollama, lm-studio, vllm
  - Gateway services:  litellm, kilo, cloudflare-ai-gateway, zenmux
  - API-tier providers: cerebras, fireworks, together, huggingface,
                        nanogpt, nvidia, venice, moonshot, ...

None of these are 'subscriptions' in the user-facing sense. Labeling
Ollama as a subscription is actively misleading (it runs on your
laptop).

Replace with an explicit allowlist of providers that genuinely require
a paid consumer subscription:

  anthropic, openai-codex, github-copilot, cursor, perplexity,
  alibaba-coding-plan, zai, minimax-code, minimax-code-cn, kimi-code,
  google-antigravity

Allowlist is intentional — false negatives (missing a real
subscription) are graceful (no badge), false positives are confusing.
When the SDK adds a new subscription-style provider upstream, add it
here.

Also drops the `getOAuthProviders` import that's no longer used and
the broken memoization layer (set is now a module-level const).

Caught by: user noticed 'ollama' was tagged subscription in the picker.
…l cleanup (#5)

Closes #5.

Three related fixes to the OAuth route's flow lifecycle:

1. Server-side 5-minute timeout per flow.

The SDK's own DEFAULT_TIMEOUT (5 min) only fires on the loopback
callback listener — it never fires on flows driven by `onPrompt` (e.g.
ollama's 'where's your local endpoint?' prompt). Result: closing the
ollama OAuth modal without typing a URL leaves the SDK's login promise
pending forever, the flow stays in the deck's `flows` map, and every
subsequent `POST /:provider/start` returns 409 'already in progress'
until the deck process restarts.

The deck now schedules its own `setTimeout(... OAUTH_FLOW_MAX_MS)` per
flow that force-cancels and broadcasts `oauth_failed` if the flow
hasn't naturally completed in time. Cleared on natural completion via
`.finally()`.

2. Stale-flow eviction on `/start`.

Defensive belt-and-suspenders: if a duplicate `start` arrives and the
held flow is past OAUTH_FLOW_MAX_MS, the timer should already have
fired — but evict it inline here too so a wedged flow (e.g. timer
somehow lost, event loop starved) can't block new attempts
indefinitely.

3. `cancel` cleans up promptResolvers too (latent bug).

The pre-fix `cancel` handler only aborted the AbortController and
rejected `manualCode`. It never touched `promptResolvers`, so
cancelling a flow blocked on an `onPrompt` (ollama's endpoint input)
left the SDK's promise pending and the flow effectively un-cleaned.

Factored the teardown into a single `abortFlow(flow, reason)` helper
used by all three paths (cancel, timeout, stale eviction). The helper:
- Aborts the SDK AbortController
- Clears the lifetime timer
- Rejects `consentReady` and `manualCode` deferreds
- Resolves all pending `promptResolvers` with empty string (rejecting
  would surface as an uncaught error inside the SDK's onPrompt caller)
- Removes from both `flows` and `flowsById` maps
- Is idempotent — second call is a no-op

Tests: 6 new tests for the cleanup helper covering pending-deferred
rejection, promptResolvers drain, AbortController abort, idempotency,
and survival across a throwing resolver. Full server suite still green
(173/173).
@bjb2 bjb2 changed the title fix(models): suppress placeholder API keys, badge subscription providers, hint on 401 fix(models, oauth): placeholder-key suppression, subscription badge, 401 hint, OAuth flow timeout May 29, 2026
@bjb2 bjb2 merged commit 676168a into main May 29, 2026
4 checks passed
@bjb2 bjb2 deleted the fix/api-key-vs-subscription-confusion branch May 29, 2026 21:21
bjb2 added a commit that referenced this pull request May 29, 2026
…fixes

See CHANGELOG.md for the full release notes. Bumps all 4 package.json
versions to 0.6.0 in lockstep (root, server, web, protocol). Adds
docs/upgrading.md as the canonical place for version-by-version
migration notes — explicitly documents that the new onboarding wizard
silently settles for existing users (any session OR moved welcome task
short-circuits the gate), so this is a non-breaking upgrade.

Headline changes since v0.5.0:
- Onboarding wizard at /onboarding for fresh installs (#9)
- Model picker subscription badge + placeholder-key suppression +
  401-recovery hint (#7, closes #4)
- OAuth flow 5-min timeout + stale eviction + drained cancel (#7,
  closes #5)
- Bun executable PATH fallback (#8, closes #6)
- README install section rewritten + LF normalization + private
  template stash on pack

177 -> 181 server tests (4 new from user's parallel Windows
rename-fix WIP that landed; my onboarding work adds 4 of those).
Typecheck clean across 4 packages.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

omp-deck or omp codex provider is trying to pull an API key rather than leveraging subscription

1 participant