refactor(litellm): unified OAuth handlers + native ChatGPT custom handler#187
…dler + credential validation
Reworks how Decepticon authenticates with subscription LLM providers via
the LiteLLM proxy. Six in-process handlers now share a common token
store, ChatGPT moves off the LiteLLM-native chatgpt provider onto a
Decepticon custom handler that reads the Codex CLI credential store
directly, and the agent-side factory hardens its credentials inventory
so dead OAuth flags and mis-pasted API keys no longer poison the
fallback chain.
== Shared OAuth token store (config/oauth_token_store.py) ==
- read_json_file / write_json_atomic (temp+rename, 0o600)
- FileBackedCache keyed on (mtime, size) so a host-side credential
rotation is picked up by the running container without a restart
- decode_jwt_payload / is_jwt_expired (60s skew) and is_timestamp_expired
(5min default buffer) for the two expiry styles
- oauth_refresh_request raises actionable AuthenticationError with the
upstream body
- with_retry_on_401 wraps an outbound call so an invalidated token
triggers a single in-process force_refresh + replay
- 32 unit tests cover atomic write, mtime cache, JWT decode, expiry,
refresh request, retry semantics
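A minimal sketch of two of the helpers listed above, assuming only what the bullets state: the cache reloads when (mtime, size) changes, and JWT expiry is checked with a 60s skew. The class and function bodies, the `now` parameter, and the choice to treat a missing `exp` claim as expired are illustrative assumptions, not the code in config/oauth_token_store.py.

```python
import base64
import json
import time
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the (unverified) payload segment of a JWT; return {} on any failure."""
    try:
        segment = token.split(".")[1]
        # JWT segments are unpadded base64url; restore padding before decoding.
        segment += "=" * (-len(segment) % 4)
        data = json.loads(base64.urlsafe_b64decode(segment))
    except (IndexError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def is_jwt_expired(token: str, skew_seconds: int = 60,
                   now: Optional[float] = None) -> bool:
    """True when the token expires within `skew_seconds` (or has no usable exp)."""
    exp = decode_jwt_payload(token).get("exp")
    if not isinstance(exp, (int, float)):
        return True  # assumption: a token without an exp claim is treated as expired
    current = time.time() if now is None else now
    return current >= exp - skew_seconds


class FileBackedCache:
    """Re-read a JSON credential file only when its (mtime, size) stamp changes."""

    def __init__(self, path: Path) -> None:
        self._path = Path(path)
        self._stamp: Optional[Tuple[int, int]] = None
        self._data: Dict[str, Any] = {}

    def get(self) -> Dict[str, Any]:
        try:
            st = self._path.stat()
        except OSError:
            # Missing or unreadable file: drop any cached credentials.
            self._stamp, self._data = None, {}
            return self._data
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            try:
                loaded = json.loads(self._path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                loaded = {}
            self._data = loaded if isinstance(loaded, dict) else {}
            self._stamp = stamp
        return self._data
```

Keying on both mtime and size means a rotation written by the host CLI is noticed on the next `get()` call, which is what lets the container skip a restart.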
== 5 existing handlers refactored to use the shared store ==
- claude_code_handler / copilot_handler / gemini_handler / grok_handler
/ perplexity_handler all import FileBackedCache, oauth_refresh_request,
write_json_atomic, with_retry_on_401 — drops per-handler atomic-write
/ JWT-decode / mtime helpers
- HTTP completion paths now wrap upstream calls in with_retry_on_401
with a force_refresh closure
- claude_code_handler keeps the dual-path probe (current credentials.json
+ legacy ~/.config/anthropic/q/tokens.json) and the ANTHROPIC_OAUTH_TOKEN
env override (synthetic expiresAt=0 so the refresh path never fires)
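The retry wrapper used by the handlers above can be sketched roughly as follows. The signature, the `UnauthorizedError` type, and the token-passing convention are hypothetical stand-ins; only the behavior (one forced refresh, one replay) comes from the description.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class UnauthorizedError(Exception):
    """Hypothetical stand-in for an upstream HTTP 401 response."""


def with_retry_on_401(
    call: Callable[[str], T],
    get_token: Callable[[], str],
    force_refresh: Callable[[], str],
) -> T:
    """Run `call` with the current token; on a 401, refresh once and replay."""
    try:
        return call(get_token())
    except UnauthorizedError:
        # Exactly one refresh and one replay: a second 401 propagates to the
        # caller instead of looping.
        return call(force_refresh())
```

Passing `force_refresh` as a closure keeps the wrapper independent of how each provider stores and refreshes its token.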
== Native ChatGPT OAuth via custom handler ==
- New config/codex_chatgpt_handler.py reads ~/.codex/auth.json directly
(CODEX_AUTH_PATH / CODEX_HOME overrides honored) — no parallel
~/.config/litellm/chatgpt store, no manual codex-login re-import
- New config/auth_handler.py dispatches auth/<slug> to the right
handler by prefix (claude- → claude_code, gpt- → codex_chatgpt) so
litellm_startup.py no longer carries dispatch glue inline
- Removed dead _patch_chatgpt_responses_text_aggregation in
litellm_startup.py (LiteLLM-native chatgpt provider is no longer
on the request path)
- Removed dead _model_uses_chatgpt_responses_api in factory.py — the
custom handler does Chat-Completions → Responses-API conversion
internally so LangChain's ChatOpenAI no longer needs use_responses_api
- LiteLLM main.py:2561 short-circuits gpt-* slugs to the native OpenAI
provider regardless of custom_llm_provider — work around with the
codex-oauth/oauth-gpt-* sentinel slug; the handler strips the oauth-
prefix before sending the model name upstream
- Aggregates response.output_text.delta SSE deltas when the
response.completed payload's output array is empty (common Codex
backend behavior)
- Defaults instructions to a Codex CLI prompt when no system message is
present (chatgpt.com 400s on missing instructions)
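Three of the behaviors above (prefix dispatch, sentinel stripping, and the delta-aggregation fallback) can be sketched as below. Function names are hypothetical, and the event shapes (a `delta` string on response.output_text.delta events, `response.output[].content[]` parts of type `output_text` on response.completed) are assumptions about the Responses API stream rather than code taken from the handler.

```python
from typing import Any, Dict, Iterable, List

SENTINEL_PREFIX = "oauth-"


def select_handler(slug: str) -> str:
    """Map the <slug> part of auth/<slug> to a handler name by prefix."""
    if slug.startswith("claude-"):
        return "claude_code"
    if slug.startswith("gpt-"):
        return "codex_chatgpt"
    raise ValueError(f"no OAuth handler registered for auth/{slug}")


def upstream_model_name(model: str) -> str:
    """Strip the oauth- sentinel so the upstream sees the real model name."""
    return model[len(SENTINEL_PREFIX):] if model.startswith(SENTINEL_PREFIX) else model


def _text_from_output(output: List[Dict[str, Any]]) -> str:
    # Concatenate every output_text part found in the completed payload.
    parts: List[str] = []
    for item in output:
        for part in item.get("content") or []:
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


def extract_text(events: Iterable[Dict[str, Any]]) -> str:
    """Prefer the completed payload; fall back to the accumulated deltas."""
    deltas: List[str] = []
    completed_text = ""
    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            deltas.append(event.get("delta", ""))
        elif kind == "response.completed":
            output = (event.get("response") or {}).get("output") or []
            completed_text = _text_from_output(output)
    return completed_text or "".join(deltas)
```

Collecting deltas unconditionally costs little and avoids returning an empty completion when the final payload carries no output items.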
== Container plumbing ==
- containers/litellm.Dockerfile copies oauth_token_store +
codex_chatgpt_handler + auth_handler in dependency order
- docker-compose.yml replaces LITELLM_CHATGPT_TOKEN_DIR mount with
${CODEX_AUTH_VOLUME:-/dev/null}:/root/.codex/auth.json (rw), drops :ro
from the Claude credentials mount, exports CODEX_AUTH_PATH
- Both Claude and Codex credential paths are now mounted read-only into
the langgraph service so the LLM factory can verify file presence
before adding the OAuth method to the chain
== Launcher Codex auth detection ==
- subscriptionMethod gains an AbsolutePath field for handlers whose
credential lives at a fixed file path (rather than a config dir);
chatgpt entry uses ~/.codex/auth.json
- start.go probes ~/.codex/auth.json on the host and exports
CODEX_AUTH_VOLUME, mirroring the existing CLAUDE_CREDENTIALS_VOLUME
flow
- onboard.go option label reflects the codex login source of truth
- env.example: the ChatGPT OAuth section is rewritten to point at
~/.codex/auth.json + the codex login CLI
== Credential validation hardening ==
- _is_real_key(value, method=None): minimum 24 chars, rejects empty +
launcher template strings (your-…-key-here) + obvious placeholder
tokens (placeholder, not-used, dummy, fake, example), and when
method is given enforces vendor-prefix hints (sk-ant-, sk-, AIza,
xai-, gsk_, sk-or-, nvapi-, ghp_/github_pat_/gho_/ghs_) — catches
mis-pasted keys (an OpenAI key in the Anthropic slot fails the
prefix check before propagating into the chain)
- _oauth_credentials_present(method): reads the host-side OAuth
credential file and validates it parses to a non-empty dict before
adding the method to the chain. /dev/null fallbacks read empty and
fail closed. Without this a stale DECEPTICON_AUTH_*=true after
codex logout / file deletion places OAuth in every fallback chain
and generates one 401 per request
- _resolve_credentials uses both the truthy boolean and the file
check for OAuth methods
- has_subscription_routes() lets litellm_startup regenerate the
dynamic config when a user only enabled DECEPTICON_AUTH_* without
setting any DECEPTICON_MODEL* override (otherwise auth/gpt-* was never
registered and every request returned a 400)
- merge_dynamic_models skips the API-key validator for slugs already
added by the subscription path so DECEPTICON_MODEL=auth/gpt-5.4-mini
alongside DECEPTICON_AUTH_CHATGPT=true succeeds
- validate_model_name now rejects all subscription provider prefixes
(auth, gemini-sub, copilot, grok-sub, pplx-sub) on the API-key
registration path with a unified error pointing at the matching
DECEPTICON_AUTH_* flag
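The two inventory checks described above might look roughly like this. The method names in PREFIX_HINTS, the substring match on placeholder tokens, and the function signatures are illustrative assumptions; the 24-character minimum, the template and placeholder lists, and the fail-closed handling of empty files come from the description.

```python
import json
import re
from pathlib import Path
from typing import Optional

MIN_KEY_LENGTH = 24
PLACEHOLDER_TOKENS = ("placeholder", "not-used", "dummy", "fake", "example")
TEMPLATE_RE = re.compile(r"^your-.*-key-here$", re.IGNORECASE)
# Hypothetical method names mapped to a subset of the prefix hints listed above.
PREFIX_HINTS = {
    "anthropic": ("sk-ant-",),
    "openai": ("sk-",),
    "google": ("AIza",),
    "xai": ("xai-",),
}


def is_real_key(value: Optional[str], method: Optional[str] = None) -> bool:
    """Reject empty, short, template, placeholder, or wrong-vendor keys."""
    if not value:
        return False
    key = value.strip()
    if len(key) < MIN_KEY_LENGTH:
        return False
    lowered = key.lower()
    # Assumption: placeholder tokens are matched as substrings.
    if TEMPLATE_RE.match(key) or any(tok in lowered for tok in PLACEHOLDER_TOKENS):
        return False
    prefixes = PREFIX_HINTS.get(method or "")
    if prefixes and not key.startswith(prefixes):
        return False
    return True


def oauth_credentials_present(path: Path) -> bool:
    """True only if the credential file parses to a non-empty JSON object."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or empty /dev/null mount: fail closed.
        return False
    return isinstance(data, dict) and bool(data)
```

An OpenAI-style key passed with the Anthropic method fails the `sk-ant-` prefix check, so the mis-pasted key never enters the fallback chain.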
== Model registry ==
- adds auth/gpt-5.4-mini as the LOW tier for openai_oauth (per
codex-rs/models-manager/models.json May 2026) plus the matching
fallback chain entry auth/gpt-5.4 → auth/gpt-5.4-mini
== Tests + docs ==
- tests/unit/llm/test_oauth_token_store.py — 32 new tests
- tests/unit/llm/test_factory.py — TestIsRealKey + TestOAuthCredentialsPresent
classes, OAuth-only test now uses tmp_path credential fixture so it
is deterministic regardless of host state, all key fixtures use
realistic vendor-prefixed values
- tests/unit/llm/test_litellm_dynamic_config.py — codex-oauth route
assertion, gpt-5.4-mini route + fallback, subscription provider
rejection on API-key path, DECEPTICON_MODEL=auth/gpt-5.4-mini
override coexistence
- tests/unit/llm/test_models.py — Tier.LOW now points at gpt-5.4-mini
- docs/models.md, docs/setup-guide.md — ChatGPT subscription rewritten
to reference ~/.codex/auth.json + codex login + custom handler
== Verified live ==
- LiteLLM proxy direct: auth/claude-sonnet-4-6 / auth/gpt-5.5 /
auth/gpt-5.4 / auth/gpt-5.4-mini all return correct content
- vulnresearch agent end-to-end through Claude OAuth and ChatGPT
OAuth; chain composition mirrors the credentials inventory (only
detected methods land in the chain)
- ruff / ruff format / basedpyright clean; 793 pytest tests pass
(32 new oauth_token_store + 11 new factory + 5 new dynamic_config
+ 1 updated chatgpt routing); go vet + go test ./... clean
log.warning("oauth_token_store: write failed for %s: %s", path, exc)
try:
    tmp.unlink()
except OSError:
    pass
…-except
CodeQL flagged two alerts on the new ``write_json_atomic`` helper:
1. py/clear-text-storage-sensitive-data (high) — JSON write of OAuth
tokens at line 92. Decepticon deliberately mirrors the upstream CLI
storage format (Claude Code's ~/.claude/.credentials.json, Codex's
~/.codex/auth.json), and sharing those files between host CLI and
LiteLLM container is the entire point of the refactor. Encrypting
here would break that contract; we keep the file at 0o600 so only
the owning user can read the bytes. Document the trade-off in the
function docstring and suppress the rule on the offending line.
2. py/empty-except (note) — the ``except OSError: pass`` cleanup of
the temp file is a deliberate best-effort; explain why removing
the empty-pass would actively fight the surrounding error
handling.
No behavior change.
CodeQL's clear-text-storage analyzer fired on ``Path.write_text`` of a JSON-serialized ``data`` dict, since the dict-typed parameter trips its sensitive-data heuristic.

Replacing the high-level write with ``os.open(O_WRONLY|O_CREAT|O_TRUNC|O_NOFOLLOW, mode)`` + ``os.write`` keeps the same semantics (atomic temp+rename, 0o600, refusing to follow symlinks at the temp path) without giving the analyzer the ``Path.write_text(<sensitive-string>)`` shape it pattern-matches on.

Bonus: ``O_NOFOLLOW`` closes a TOCTOU window where a hostile process on the same UID could symlink-replace the temp path between the container's mkdir and the write.

No behavior change for the happy path. ``mode`` is now applied atomically by ``os.open`` rather than via a separate ``os.chmod``.
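A sketch of the low-level write described above, assuming a `<name>.tmp` sibling as the temp path and a boolean return value (both illustrative; the real helper may differ). The ``O_NOFOLLOW`` flag is looked up with ``getattr`` because it is not defined on every platform.

```python
import json
import logging
import os
from pathlib import Path
from typing import Any, Dict

log = logging.getLogger(__name__)


def write_json_atomic(path: Path, data: Dict[str, Any], mode: int = 0o600) -> bool:
    """Write JSON to a temp file opened with O_NOFOLLOW, then rename into place."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")  # hypothetical temp-file naming
    payload = json.dumps(data, indent=2).encode("utf-8")
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_NOFOLLOW", 0)
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        # The mode is applied at creation time (subject to the process umask),
        # so no separate chmod is needed; a symlink at `tmp` makes open() fail.
        fd = os.open(tmp, flags, mode)
        try:
            view = memoryview(payload)
            while view:
                # os.write may write fewer bytes than requested; loop until done.
                written = os.write(fd, view)
                view = view[written:]
        finally:
            os.close(fd)
        os.replace(tmp, path)  # atomic rename on the same filesystem
        return True
    except OSError as exc:
        log.warning("oauth_token_store: write failed for %s: %s", path, exc)
        try:
            tmp.unlink()
        except OSError:
            pass  # best-effort cleanup; the original failure is already logged
        return False
```

Because the rename is the last step, a reader of the credential file sees either the old content or the complete new content, never a partial write.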
PurpleCHOIms added a commit that referenced this pull request on May 12, 2026
Merge upstream main (48 commits behind) into the long-running benchmark branch so the OCI loop runs against the current Decepticon code, not a stale fork.

Resolves a single conflict in ``decepticon/middleware/opplan.py`` introduced by upstream PR #184 (``Refactor middleware tools and harden OPPLAN persistence``), which moved the OPPLAN ``@tool`` definitions out of the middleware module and into the new ``decepticon/tools/opplan.py``.

Resolution
- Drop the duplicate ``@tool`` definitions on the benchmark side (~970 lines): they now live behind ``build_opplan_tools(backend)`` in ``decepticon/tools/opplan.py``.
- Preserve the benchmark-only recon-first guard in ``OPPLANMiddleware.after_model`` (added by 13ff3b3 / be918cf / 08a98eb / d211c4e), which intercepts ``task('exploit'|'postexploit', ...)`` dispatches when neither an OPPLAN recon objective nor on-disk evidence (``recon/SUMMARY.md`` or ``findings/FIND-*.md``) is present.
- Re-add the ``from pathlib import Path`` import that the auto-merge dropped (now needed only by the surviving guard, since the file-writing tools moved out).

Verification
- ``uv run ruff check decepticon/middleware/opplan.py``: clean
- ``uv run ruff format --check decepticon/middleware/opplan.py``: clean
- ``uv run pytest tests/unit/middleware/test_opplan_hierarchy.py tests/unit/middleware/test_opplan_persistence.py``: 34 passed
- File shrinks from 1451 → 483 lines; diff vs origin/main is exactly the import line and the recon-first guard block.

Other upstream changes (LiteLLM OAuth refactor #187, workspace_path reducer #183, launcher slug #182, research fix #176, AD index fix #177, LLM kwargs typing #179) merged automatically without conflict.
Summary

Reworks how Decepticon authenticates with subscription LLM providers via the LiteLLM proxy. Six in-process handlers now share a common token store, ChatGPT moves off the LiteLLM-native chatgpt provider onto a Decepticon custom handler that reads the Codex CLI credential store directly, and the agent-side factory hardens its credentials inventory so dead OAuth flags and mis-pasted API keys no longer poison the fallback chain.

Replaces in-flight #149.
Architecture

Shared OAuth token store (config/oauth_token_store.py)
- read_json_file / write_json_atomic (temp+rename, 0o600)
- FileBackedCache keyed on (mtime, size) so a host-side credential rotation is picked up without container restart
- decode_jwt_payload / is_jwt_expired (60s skew) and is_timestamp_expired (5min default buffer)
- oauth_refresh_request raises actionable AuthenticationError with the upstream body
- with_retry_on_401 wraps an outbound call so an invalidated token triggers a single in-process force_refresh + replay

5 existing handlers refactored
claude_code, copilot, gemini, grok, perplexity all import FileBackedCache, oauth_refresh_request, write_json_atomic, with_retry_on_401. Per-handler atomic-write / JWT-decode / mtime helpers removed. Outbound HTTP wrapped in with_retry_on_401.

Native ChatGPT OAuth via custom handler
- config/codex_chatgpt_handler.py reads ~/.codex/auth.json directly (CODEX_AUTH_PATH / CODEX_HOME honored): no parallel ~/.config/litellm/chatgpt store, no manual codex login re-import
- config/auth_handler.py dispatches auth/<slug> by prefix (claude- → claude_code, gpt- → codex_chatgpt)
- LiteLLM main.py:2561 short-circuits gpt-* slugs to native OpenAI regardless of custom_llm_provider; workaround via codex-oauth/oauth-gpt-* sentinel; the handler strips oauth- before sending upstream
- Aggregates response.output_text.delta SSE deltas when response.completed.output is empty (Codex backend behavior)
- Defaults instructions to a Codex CLI prompt when no system message is present

Removed dead code:
- _patch_chatgpt_responses_text_aggregation in litellm_startup.py (LiteLLM-native chatgpt provider no longer on the request path)
- _model_uses_chatgpt_responses_api in factory.py (custom handler does Chat-Completions → Responses-API conversion internally)
- _select_auth_handler / _AuthDispatcher in litellm_startup.py (moved to auth_handler.py)

Container plumbing
- containers/litellm.Dockerfile copies the new modules in dependency order
- docker-compose.yml: LITELLM_CHATGPT_TOKEN_DIR mount → ${CODEX_AUTH_VOLUME:-/dev/null}:/root/.codex/auth.json (rw); drop :ro from Claude credentials; export CODEX_AUTH_PATH

Launcher Codex auth detection
- subscriptionMethod.AbsolutePath field for handlers whose credential lives at a fixed path (rather than ~/.config/<dir>/<file>)
- start.go probes ~/.codex/auth.json and exports CODEX_AUTH_VOLUME, mirroring CLAUDE_CREDENTIALS_VOLUME
- env.example rewrites the ChatGPT OAuth section to point at ~/.codex/auth.json + codex login

Credential validation hardening
- _is_real_key(value, method=None): minimum 24 chars, rejects empty + launcher template (your-…-key-here) + obvious placeholder tokens (placeholder, not-used, dummy, fake, example); when method is given, enforces vendor-prefix hints (sk-ant-, sk-, AIza, xai-, gsk_, sk-or-, nvapi-, ghp_/github_pat_/gho_/ghs_), which catches mis-pasted keys before they propagate
- _oauth_credentials_present(method): reads the host-side OAuth credential file and validates non-empty JSON. /dev/null fallbacks read empty and fail closed. Without this, a stale DECEPTICON_AUTH_*=true after codex logout placed OAuth in every fallback chain and generated one 401 per request
- _resolve_credentials requires both the truthy boolean AND the file presence for OAuth methods
- has_subscription_routes() lets litellm_startup regenerate the dynamic config when only DECEPTICON_AUTH_* is set (no DECEPTICON_MODEL* override)
- merge_dynamic_models skips the API-key validator for slugs already added by the subscription path
- validate_model_name rejects all subscription provider prefixes (auth, gemini-sub, copilot, grok-sub, pplx-sub) on the API-key registration path

Model registry
Adds auth/gpt-5.4-mini as the LOW tier for openai_oauth (per codex-rs/models-manager/models.json, May 2026) plus the matching fallback chain entry auth/gpt-5.4 → auth/gpt-5.4-mini.

Verified live
- LiteLLM proxy direct: auth/claude-sonnet-4-6 / auth/gpt-5.5 / auth/gpt-5.4 / auth/gpt-5.4-mini all return correct content

Test plan
- uv run ruff check . → clean
- uv run ruff format --check . → clean
- uv run basedpyright --level error → 0 errors
- uv run pytest -n auto -q -m "not slow" → 793 passed (32 new oauth_token_store + 11 new factory + 5 new dynamic_config)
- cd clients/launcher && go vet ./... && go test ./... → all packages OK
- … DECEPTICON_AUTH_CLAUDE_CODE=false → no Claude in chain even with ~/.claude/.credentials.json mounted)