refactor(litellm): unified OAuth handlers + native ChatGPT custom handler by PurpleCHOIms · Pull Request #187 · PurpleAILAB/Decepticon

PurpleCHOIms · 2026-05-09T22:51:21Z

Summary

Reworks how Decepticon authenticates with subscription LLM providers via the LiteLLM proxy. Six in-process handlers now share a common token store, ChatGPT moves off the LiteLLM-native chatgpt provider onto a Decepticon custom handler that reads the Codex CLI credential store directly, and the agent-side factory hardens its credentials inventory so dead OAuth flags and mis-pasted API keys no longer poison the fallback chain.

Replaces in-flight #149.

Architecture

Shared OAuth token store (`config/oauth_token_store.py`)

read_json_file / write_json_atomic (temp+rename, 0o600)
FileBackedCache keyed on (mtime, size) so a host-side credential rotation is picked up without container restart
decode_jwt_payload / is_jwt_expired (60s skew) and is_timestamp_expired (5min default buffer)
oauth_refresh_request raises actionable AuthenticationError with the upstream body
with_retry_on_401 wraps an outbound call so an invalidated token triggers a single in-process force_refresh + replay
32 unit tests cover atomic write, mtime cache, JWT decode, expiry, refresh request, retry semantics
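For reviewers, a minimal sketch of the two ideas behind the cache and expiry helpers listed above: reload the credential file only when its (mtime, size) signature changes, and treat a token as expired slightly before its real deadline. The class and function bodies here are illustrative assumptions, not the code in config/oauth_token_store.py. Timestamps are assumed to be in seconds, and a JWT without an `exp` claim is assumed to count as expired.

```python
import base64
import json
import os
import time
from typing import Any, Optional


class FileBackedCache:
    """Cache a parsed JSON file and reload it when (mtime, size) changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._signature: Optional[tuple] = None
        self._value: Optional[dict] = None

    def get(self) -> Optional[dict]:
        try:
            st = os.stat(self.path)
            signature = (st.st_mtime_ns, st.st_size)
            if signature != self._signature:
                # File changed on disk (e.g. host-side rotation): re-read it.
                with open(self.path, "r", encoding="utf-8") as fh:
                    self._value = json.load(fh)
                self._signature = signature
        except (OSError, ValueError):
            # Missing, unreadable, or malformed file: drop the cached value.
            self._signature, self._value = None, None
        return self._value


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_jwt_expired(token: str, skew_seconds: int = 60,
                   now: Optional[float] = None) -> bool:
    """True when the `exp` claim is within `skew_seconds` of now, or missing."""
    now = time.time() if now is None else now
    exp = decode_jwt_payload(token).get("exp")
    return exp is None or now >= exp - skew_seconds


def is_timestamp_expired(expires_at: float, buffer_seconds: int = 300,
                         now: Optional[float] = None) -> bool:
    """True when a plain expiry timestamp is within the buffer window."""
    now = time.time() if now is None else now
    return now >= expires_at - buffer_seconds
```

Including the file size in the signature means a rewrite that lands within the same mtime tick is still noticed whenever the content length differs.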

5 existing handlers refactored

claude_code, copilot, gemini, grok, perplexity all import FileBackedCache, oauth_refresh_request, write_json_atomic, with_retry_on_401. Per-handler atomic-write / JWT-decode / mtime helpers removed. Outbound HTTP wrapped in with_retry_on_401.
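The retry wrapper the handlers now share can be pictured roughly as follows. This is a sketch under assumptions: the real with_retry_on_401 signature is not shown in this PR description, `UnauthorizedError` is a hypothetical stand-in for whatever the handlers raise on an upstream 401, and the callables are passed explicitly to keep the example self-contained.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class UnauthorizedError(Exception):
    """Hypothetical stand-in for an upstream HTTP 401."""


def with_retry_on_401(
    call: Callable[[str], T],
    get_token: Callable[[], str],
    force_refresh: Callable[[], str],
) -> T:
    """Run `call` with the cached token; on a 401, refresh once and replay."""
    try:
        return call(get_token())
    except UnauthorizedError:
        # The cached token was invalidated upstream. Refresh exactly once;
        # a second 401 propagates to the caller instead of looping.
        return call(force_refresh())
```

Limiting the replay to a single attempt keeps a revoked refresh token from turning into a retry loop.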

Native ChatGPT OAuth via custom handler

New config/codex_chatgpt_handler.py reads ~/.codex/auth.json directly (CODEX_AUTH_PATH / CODEX_HOME honored) — no parallel ~/.config/litellm/chatgpt store, no manual codex login re-import
New config/auth_handler.py dispatches auth/<slug> by prefix (claude- → claude_code, gpt- → codex_chatgpt)
LiteLLM main.py:2561 short-circuits gpt-* slugs to native OpenAI regardless of custom_llm_provider — workaround via codex-oauth/oauth-gpt-* sentinel; the handler strips oauth- before sending upstream
Aggregates response.output_text.delta SSE deltas when response.completed.output is empty (Codex backend behavior)
Defaults instructions to a Codex CLI prompt when no system message is present
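The last two behaviors can be sketched as below. The event shapes are assumed to follow the Responses API streaming format (a `delta` string on response.output_text.delta events, and the final payload under `response.output` on response.completed). `extract_text`, `split_instructions`, and the `DEFAULT_INSTRUCTIONS` placeholder are hypothetical names for illustration, not the handler's actual code or the real Codex CLI prompt.

```python
from typing import Any, Iterable

# Placeholder only; the real handler uses a Codex CLI prompt.
DEFAULT_INSTRUCTIONS = "You are a coding assistant."


def extract_text(events: Iterable[dict]) -> str:
    """Return the final text, falling back to streamed deltas when needed."""
    deltas = []
    completed_output: list = []
    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            deltas.append(event.get("delta", ""))
        elif kind == "response.completed":
            completed_output = (event.get("response") or {}).get("output") or []
    final_parts = [
        part.get("text", "")
        for item in completed_output
        for part in (item.get("content") or [])
        if part.get("type") == "output_text"
    ]
    # Prefer the completed payload; use the accumulated deltas when it is empty.
    return "".join(final_parts) if final_parts else "".join(deltas)


def split_instructions(messages: list) -> tuple:
    """Lift system messages into `instructions`, defaulting when none exist."""
    system = [m["content"] for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    instructions = "\n\n".join(system) if system else DEFAULT_INSTRUCTIONS
    return instructions, rest
```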

Removed dead code:

_patch_chatgpt_responses_text_aggregation in litellm_startup.py (LiteLLM-native chatgpt provider no longer on the request path)
_model_uses_chatgpt_responses_api in factory.py (custom handler does Chat-Completions → Responses-API conversion internally)
inline _select_auth_handler / _AuthDispatcher in litellm_startup.py (moved to auth_handler.py)
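For context on what moved into auth_handler.py, prefix dispatch plus the sentinel-slug rewrite amounts to something like the sketch below. The prefix table and the oauth- stripping come from this description; the function names and the error handling are assumptions.

```python
# Prefix -> handler name, as described in this PR.
HANDLER_BY_PREFIX = {
    "claude-": "claude_code",
    "gpt-": "codex_chatgpt",
}


def select_handler(model: str) -> str:
    """Pick a handler for an auth/<slug> model by the slug's prefix."""
    slug = model.removeprefix("auth/")
    for prefix, handler in HANDLER_BY_PREFIX.items():
        if slug.startswith(prefix):
            return handler
    raise ValueError(f"no OAuth handler registered for {model!r}")


def upstream_model_name(model: str) -> str:
    """Map a codex-oauth/oauth-gpt-* sentinel slug to the upstream model name."""
    slug = model.split("/", 1)[-1]    # drop the provider segment
    return slug.removeprefix("oauth-")  # drop the sentinel prefix
```

The oauth- sentinel exists only so the slug does not start with gpt- while it passes through LiteLLM routing; the upstream backend sees the plain model name.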

Container plumbing

containers/litellm.Dockerfile copies the new modules in dependency order
docker-compose.yml: replaces the LITELLM_CHATGPT_TOKEN_DIR mount with ${CODEX_AUTH_VOLUME:-/dev/null}:/root/.codex/auth.json (rw), drops :ro from the Claude credentials mount, and exports CODEX_AUTH_PATH
Both Claude and Codex paths are now mounted read-only into the langgraph service so the LLM factory can verify file presence

Launcher Codex auth detection

subscriptionMethod.AbsolutePath field for handlers whose credential lives at a fixed path (rather than ~/.config/<dir>/<file>)
start.go probes ~/.codex/auth.json and exports CODEX_AUTH_VOLUME, mirroring CLAUDE_CREDENTIALS_VOLUME
env.example rewrites the ChatGPT OAuth section to point at ~/.codex/auth.json + codex login

Credential validation hardening

_is_real_key(value, method=None): minimum 24 chars, rejects empty + launcher template (your-…-key-here) + obvious placeholder tokens (placeholder, not-used, dummy, fake, example); when method is given, enforces vendor-prefix hints (sk-ant-, sk-, AIza, xai-, gsk_, sk-or-, nvapi-, ghp_/github_pat_/gho_/ghs_) — catches mis-pasted keys before they propagate
_oauth_credentials_present(method): reads the host-side OAuth credential file and validates non-empty JSON. /dev/null fallbacks read empty and fail closed. Without this, a stale DECEPTICON_AUTH_*=true after codex logout placed OAuth in every fallback chain and generated one 401 per request
_resolve_credentials requires both the truthy boolean AND the file presence for OAuth methods
has_subscription_routes() lets litellm_startup regenerate the dynamic config when only DECEPTICON_AUTH_* is set (no DECEPTICON_MODEL* override)
merge_dynamic_models skips the API-key validator for slugs already added by the subscription path
validate_model_name rejects all subscription provider prefixes (auth, gemini-sub, copilot, grok-sub, pplx-sub) on the API-key registration path
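A rough sketch of the key check described for _is_real_key. The thresholds, placeholder tokens, and prefixes are taken from the list above; the method names used as dictionary keys and the pairing of each prefix with a vendor are assumptions for illustration, since the factory's real method identifiers are not shown here.

```python
from typing import Optional

PLACEHOLDER_TOKENS = ("placeholder", "not-used", "dummy", "fake", "example")

# Hypothetical method names mapped to the vendor-prefix hints from the PR.
PREFIX_HINTS = {
    "anthropic": ("sk-ant-",),
    "openai": ("sk-",),
    "google": ("AIza",),
    "xai": ("xai-",),
    "groq": ("gsk_",),
    "openrouter": ("sk-or-",),
    "nvidia": ("nvapi-",),
    "github": ("ghp_", "github_pat_", "gho_", "ghs_"),
}


def is_real_key(value: Optional[str], method: Optional[str] = None) -> bool:
    """Heuristically reject empty, templated, placeholder, or mis-pasted keys."""
    value = (value or "").strip()
    if len(value) < 24:
        return False
    lowered = value.lower()
    # Launcher template strings look like "your-<vendor>-key-here".
    if lowered.startswith("your-") and lowered.endswith("-key-here"):
        return False
    if any(token in lowered for token in PLACEHOLDER_TOKENS):
        return False
    if method in PREFIX_HINTS:
        # str.startswith accepts a tuple of candidate prefixes.
        return value.startswith(PREFIX_HINTS[method])
    return True
```

With this shape, an OpenAI-style key pasted into the Anthropic slot fails the prefix check before it can enter the fallback chain.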

Model registry

Adds auth/gpt-5.4-mini as the LOW tier for openai_oauth (per codex-rs/models-manager/models.json, May 2026) plus the matching fallback chain entry auth/gpt-5.4 → auth/gpt-5.4-mini.

Verified live

LiteLLM proxy direct: auth/claude-sonnet-4-6 / auth/gpt-5.5 / auth/gpt-5.4 / auth/gpt-5.4-mini all return correct content
vulnresearch agent end-to-end through Claude OAuth and ChatGPT OAuth; chain composition mirrors the credentials inventory (only detected methods land in the chain)
Tested credential file deletion / stale flag scenarios — OAuth method correctly drops out of the chain
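The stale-flag scenario comes down to requiring both the flag and a usable credential file. A minimal sketch, assuming the flag is parsed as a conventional truthy string and that the function names are illustrative rather than the factory's own:

```python
import json
from pathlib import Path
from typing import Optional, Union


def oauth_credentials_present(path: Union[str, Path]) -> bool:
    """True only when the file exists and parses to a non-empty JSON object."""
    try:
        raw = Path(path).read_text(encoding="utf-8")
    except OSError:
        return False
    if not raw.strip():
        # A /dev/null fallback mount reads as empty: fail closed.
        return False
    try:
        data = json.loads(raw)
    except ValueError:
        return False
    return isinstance(data, dict) and bool(data)


def oauth_method_enabled(flag_value: Optional[str], path: Union[str, Path]) -> bool:
    """Require the DECEPTICON_AUTH_* style flag AND a usable credential file."""
    flag = (flag_value or "").strip().lower() in {"1", "true", "yes", "on"}
    return flag and oauth_credentials_present(path)
```

A flag left at true after the credential file is deleted then evaluates to False, so the OAuth method never reaches the chain.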

Test plan

`uv run ruff check .`: clean
`uv run ruff format --check .`: clean
`uv run basedpyright --level error`: 0 errors
`uv run pytest -n auto -q -m "not slow"`: 793 passed (32 new oauth_token_store + 11 new factory + 5 new dynamic_config)
`cd clients/launcher && go vet ./... && go test ./...`: all packages OK
Live: ChatGPT OAuth (auth/gpt-5.5, auth/gpt-5.4, auth/gpt-5.4-mini) end-to-end via vulnresearch agent
Live: Claude Code OAuth (auth/claude-sonnet-4-6) end-to-end via vulnresearch agent
Live: credential chain shows only detected methods (DECEPTICON_AUTH_CLAUDE_CODE=false → no Claude in chain even with ~/.claude/.credentials.json mounted)

…dler + credential validation

Reworks how Decepticon authenticates with subscription LLM providers via the LiteLLM proxy. Six in-process handlers now share a common token store, ChatGPT moves off the LiteLLM-native chatgpt provider onto a Decepticon custom handler that reads the Codex CLI credential store directly, and the agent-side factory hardens its credentials inventory so dead OAuth flags and mis-pasted API keys no longer poison the fallback chain.

== Shared OAuth token store (config/oauth_token_store.py) ==
- read_json_file / write_json_atomic (temp+rename, 0o600)
- FileBackedCache keyed on (mtime, size) so a host-side credential rotation is picked up by the running container without a restart
- decode_jwt_payload / is_jwt_expired (60s skew) and is_timestamp_expired (5min default buffer) for the two expiry styles
- oauth_refresh_request raises actionable AuthenticationError with the upstream body
- with_retry_on_401 wraps an outbound call so an invalidated token triggers a single in-process force_refresh + replay
- 32 unit tests cover atomic write, mtime cache, JWT decode, expiry, refresh request, retry semantics

== 5 existing handlers refactored to use the shared store ==
- claude_code_handler / copilot_handler / gemini_handler / grok_handler / perplexity_handler all import FileBackedCache, oauth_refresh_request, write_json_atomic, with_retry_on_401; drops per-handler atomic-write / JWT-decode / mtime helpers
- HTTP completion paths now wrap upstream calls in with_retry_on_401 with a force_refresh closure
- claude_code_handler keeps the dual-path probe (current credentials.json + legacy ~/.config/anthropic/q/tokens.json) and the ANTHROPIC_OAUTH_TOKEN env override (synthetic expiresAt=0 so the refresh path never fires)

== Native ChatGPT OAuth via custom handler ==
- New config/codex_chatgpt_handler.py reads ~/.codex/auth.json directly (CODEX_AUTH_PATH / CODEX_HOME overrides honored): no parallel ~/.config/litellm/chatgpt store, no manual codex-login re-import
- New config/auth_handler.py dispatches auth/<slug> to the right handler by prefix (claude- → claude_code, gpt- → codex_chatgpt) so litellm_startup.py no longer carries dispatch glue inline
- Removed dead _patch_chatgpt_responses_text_aggregation in litellm_startup.py (LiteLLM-native chatgpt provider is no longer on the request path)
- Removed dead _model_uses_chatgpt_responses_api in factory.py; the custom handler does Chat-Completions → Responses-API conversion internally so LangChain's ChatOpenAI no longer needs use_responses_api
- LiteLLM main.py:2561 short-circuits gpt-* slugs to the native OpenAI provider regardless of custom_llm_provider; work around with the codex-oauth/oauth-gpt-* sentinel slug; the handler strips the oauth- prefix before sending the model name upstream
- Aggregates response.output_text.delta SSE deltas when the response.completed payload's output array is empty (common Codex backend behavior)
- Defaults instructions to a Codex CLI prompt when no system message is present (chatgpt.com 400s on missing instructions)

== Container plumbing ==
- containers/litellm.Dockerfile copies oauth_token_store + codex_chatgpt_handler + auth_handler in dependency order
- docker-compose.yml replaces LITELLM_CHATGPT_TOKEN_DIR mount with ${CODEX_AUTH_VOLUME:-/dev/null}:/root/.codex/auth.json (rw), drops :ro from the Claude credentials mount, exports CODEX_AUTH_PATH
- Both Claude and Codex credential paths are now mounted read-only into the langgraph service so the LLM factory can verify file presence before adding the OAuth method to the chain

== Launcher Codex auth detection ==
- subscriptionMethod gains an AbsolutePath field for handlers whose credential lives at a fixed file path (rather than a config dir); chatgpt entry uses ~/.codex/auth.json
- start.go probes ~/.codex/auth.json on the host and exports CODEX_AUTH_VOLUME, mirroring the existing CLAUDE_CREDENTIALS_VOLUME flow
- onboard.go option label reflects the codex login source of truth
- env.example rewrote the ChatGPT OAuth section to point at ~/.codex/auth.json + the codex login CLI

== Credential validation hardening ==
- _is_real_key(value, method=None): minimum 24 chars, rejects empty + launcher template strings (your-…-key-here) + obvious placeholder tokens (placeholder, not-used, dummy, fake, example), and when method is given enforces vendor-prefix hints (sk-ant-, sk-, AIza, xai-, gsk_, sk-or-, nvapi-, ghp_/github_pat_/gho_/ghs_); catches mis-pasted keys (an OpenAI key in the Anthropic slot fails the prefix check before propagating into the chain)
- _oauth_credentials_present(method): reads the host-side OAuth credential file and validates it parses to a non-empty dict before adding the method to the chain. /dev/null fallbacks read empty and fail closed. Without this a stale DECEPTICON_AUTH_*=true after codex logout / file deletion places OAuth in every fallback chain and generates one 401 per request
- _resolve_credentials uses both the truthy boolean and the file check for OAuth methods
- has_subscription_routes() lets litellm_startup regenerate the dynamic config when a user only enabled DECEPTICON_AUTH_* without setting any DECEPTICON_MODEL* override (otherwise auth/gpt-* never registered and every request 400d)
- merge_dynamic_models skips the API-key validator for slugs already added by the subscription path so DECEPTICON_MODEL=auth/gpt-5.4-mini alongside DECEPTICON_AUTH_CHATGPT=true succeeds
- validate_model_name now rejects all subscription provider prefixes (auth, gemini-sub, copilot, grok-sub, pplx-sub) on the API-key registration path with a unified error pointing at the matching DECEPTICON_AUTH_* flag

== Model registry ==
- adds auth/gpt-5.4-mini as the LOW tier for openai_oauth (per codex-rs/models-manager/models.json May 2026) plus the matching fallback chain entry auth/gpt-5.4 → auth/gpt-5.4-mini

== Tests + docs ==
- tests/unit/llm/test_oauth_token_store.py: 32 new tests
- tests/unit/llm/test_factory.py: TestIsRealKey + TestOAuthCredentialsPresent classes, OAuth-only test now uses tmp_path credential fixture so it is deterministic regardless of host state, all key fixtures use realistic vendor-prefixed values
- tests/unit/llm/test_litellm_dynamic_config.py: codex-oauth route assertion, gpt-5.4-mini route + fallback, subscription provider rejection on API-key path, DECEPTICON_MODEL=auth/gpt-5.4-mini override coexistence
- tests/unit/llm/test_models.py: Tier.LOW now points at gpt-5.4-mini
- docs/models.md, docs/setup-guide.md: ChatGPT subscription rewritten to reference ~/.codex/auth.json + codex login + custom handler

== Verified live ==
- LiteLLM proxy direct: auth/claude-sonnet-4-6 / auth/gpt-5.5 / auth/gpt-5.4 / auth/gpt-5.4-mini all return correct content
- vulnresearch agent end-to-end through Claude OAuth and ChatGPT OAuth; chain composition mirrors the credentials inventory (only detected methods land in the chain)
- ruff / ruff format / basedpyright clean; 793 pytest pass (32 new oauth_token_store + 11 new factory + 5 new dynamic_config + 1 updated chatgpt routing); go vet + go test ./... clean

…-except

CodeQL flagged two alerts on the new ``write_json_atomic`` helper:

1. py/clear-text-storage-sensitive-data (high): JSON write of OAuth tokens at line 92. Decepticon deliberately mirrors the upstream CLI storage format (Claude Code's ~/.claude/.credentials.json, Codex's ~/.codex/auth.json), and sharing those files between host CLI and LiteLLM container is the entire point of the refactor. Encrypting here would break that contract; we keep the file at 0o600 so only the owning user can read the bytes. Document the trade-off in the function docstring and suppress the rule on the offending line.
2. py/empty-except (note): the ``except OSError: pass`` cleanup of the temp file is a deliberate best-effort; explain why removing the empty-pass would actively fight the surrounding error handling.

No behavior change.

CodeQL's clear-text-storage analyzer fired on ``Path.write_text`` of a JSON-serialized ``data`` dict, since the dict-typed parameter trips its sensitive-data heuristic. Replacing the high-level write with ``os.open(O_WRONLY|O_CREAT|O_TRUNC|O_NOFOLLOW, mode)`` + ``os.write`` keeps the same semantics — atomic temp+rename, 0o600, refusing to follow symlinks at the temp path — without giving the analyzer the ``Path.write_text(<sensitive-string>)`` shape it pattern-matches on. Bonus: ``O_NOFOLLOW`` closes a TOCTOU window where a hostile process on the same UID could symlink-replace the temp path between the container's mkdir and the write. No behavior change for the happy path. ``mode`` is now applied atomically by ``os.open`` rather than via a separate ``os.chmod``.
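A sketch of the write path this commit describes, for readers who want to see the pieces together: a temp file opened with ``os.open`` (mode applied at creation, ``O_NOFOLLOW`` where the platform has it), then an atomic rename. The temp-file naming and the boolean return value are assumptions; the warning message and best-effort cleanup follow the lines quoted in the review thread.

```python
import json
import logging
import os
from pathlib import Path

log = logging.getLogger(__name__)


def write_json_atomic(path: Path, data: dict, mode: int = 0o600) -> bool:
    """Write JSON via temp file + rename; return False on failure (assumed)."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")  # assumed temp naming
    payload = json.dumps(data).encode("utf-8")
    # O_NOFOLLOW refuses to open a symlink planted at the temp path.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_NOFOLLOW", 0)
    try:
        # The mode applies when the file is created (subject to the umask).
        fd = os.open(tmp, flags, mode)
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
        os.replace(tmp, path)  # atomic within one filesystem
        return True
    except OSError as exc:
        log.warning("oauth_token_store: write failed for %s: %s", path, exc)
        try:
            tmp.unlink()
        except OSError:
            pass  # best-effort cleanup; the original failure is already logged
        return False
```

Note that ``os.open`` only sets the mode on a newly created file, so a stale temp file left over with looser permissions would keep them.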

Merge upstream main (48 commits behind) into the long-running benchmark branch so the OCI loop runs against the current Decepticon code, not a stale fork.

Resolves a single conflict in ``decepticon/middleware/opplan.py`` introduced by upstream PR #184 (``Refactor middleware tools and harden OPPLAN persistence``), which moved the OPPLAN ``@tool`` definitions out of the middleware module and into the new ``decepticon/tools/opplan.py``.

Resolution
- Drop the duplicate ``@tool`` definitions on the benchmark side (~970 lines); they now live behind ``build_opplan_tools(backend)`` in ``decepticon/tools/opplan.py``.
- Preserve the benchmark-only recon-first guard in ``OPPLANMiddleware.after_model`` (added by 13ff3b3 / be918cf / 08a98eb / d211c4e), which intercepts ``task('exploit'|'postexploit', ...)`` dispatches when neither an OPPLAN recon objective nor on-disk evidence (``recon/SUMMARY.md`` or ``findings/FIND-*.md``) is present.
- Re-add the ``from pathlib import Path`` import that the auto-merge dropped (now needed only by the surviving guard, since the file-writing tools moved out).

Verification
- ``uv run ruff check decepticon/middleware/opplan.py``: clean
- ``uv run ruff format --check decepticon/middleware/opplan.py``: clean
- ``uv run pytest tests/unit/middleware/test_opplan_hierarchy.py tests/unit/middleware/test_opplan_persistence.py``: 34 passed
- File shrinks from 1451 → 483 lines; diff vs origin/main is exactly the import line and the recon-first guard block.

Other upstream changes (LiteLLM OAuth refactor #187, workspace_path reducer #183, launcher slug #182, research fix #176, AD index fix #177, LLM kwargs typing #179) merged automatically without conflict.

PurpleCHOIms mentioned this pull request May 9, 2026

fix(litellm): route ChatGPT OAuth natively #149

Closed

github-advanced-security AI found potential problems May 9, 2026

Comment thread config/oauth_token_store.py Fixed

Comment thread config/oauth_token_store.py

        log.warning("oauth_token_store: write failed for %s: %s", path, exc)
        try:
            tmp.unlink()
        except OSError:
            pass

PurpleCHOIms added 2 commits May 10, 2026 08:01

PurpleCHOIms merged commit 860eb84 into main May 9, 2026
12 checks passed

PurpleCHOIms merged 3 commits into main from fix/litellm-auth-refactor

2 participants
