fix(llm): authenticate Ollama Cloud via OLLAMA_CLOUD_API_KEY + add cloud probe by VoidChecksum · Pull Request #441 · PurpleAILAB/Decepticon

VoidChecksum · 2026-05-30T15:15:38Z

Problem

A Discord user on Ollama Cloud (WSL2) is stuck in the Soundwave engagement-planning interview loop on 1.1.3, even after reinstalling. The 1.1.3 changelog fix for the "Soundwave interview loop" only addressed the " (Recommended)" suffix leak (#328) — a different root cause.

The actual bug for Ollama Cloud is an environment-variable name mismatch:

- The onboard wizard (clients/launcher/cmd/onboard.go) and setup docs (docs/setup-guide.md) write the key as OLLAMA_CLOUD_API_KEY.
- The runtime read only OLLAMA_API_KEY, in config/litellm_dynamic_config.py and the cloud model resolver.

Nothing connected the two: grep confirms that OLLAMA_CLOUD_API_KEY was only ever written, never read. A user who completes decepticon onboard for Ollama Cloud therefore sends an empty Bearer token to https://ollama.com/v1 and gets a 401 on every turn, so the agent never advances past the Soundwave interview and appears "stuck in the loop."

Fix

1. Key mismatch (the loop) — read OLLAMA_CLOUD_API_KEY first, fall back to OLLAMA_API_KEY (the official Ollama convention), in:

- config/litellm_dynamic_config.py (Bearer auth wiring)
- packages/decepticon-core/.../types/llm.py (_resolve_ollama_cloud_model)
- packages/decepticon/.../llm/factory.py (_ollama_cloud_configured opt-in, now consistent with the resolver, which already treated the key as an opt-in signal)

No existing setup breaks; the wizard/docs already use the _CLOUD_ form, which now works.
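The precedence described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name resolve_ollama_cloud_key is hypothetical, and treating a whitespace-only value as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_ollama_cloud_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Ollama Cloud key, preferring OLLAMA_CLOUD_API_KEY.

    Hypothetical helper: falls back to OLLAMA_API_KEY and returns an
    empty string when neither variable holds a non-blank value.
    """
    env = os.environ if env is None else env
    for name in ("OLLAMA_CLOUD_API_KEY", "OLLAMA_API_KEY"):
        # Assumption: a value that is only whitespace counts as unset.
        value = env.get(name, "").strip()
        if value:
            return value
    return ""
```

Passing the environment as a mapping keeps the precedence rule testable without mutating os.environ; an empty return value is the case that previously turned into an empty Bearer token.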

2. Cloud tool-capability probe — the startup probe (config/ollama_probe.py / litellm_startup.py) only ever ran against local Ollama (OLLAMA_API_BASE, host.docker.internal). A cloud-only user (empty local base) short-circuited on "OLLAMA_API_BASE is empty" and got no cloud diagnostic. Added:

- probe_cloud(): Bearer-authenticated reachability + /api/show capability check, targeting the native API root (strips the OpenAI-compat /v1 path).
- Local/cloud prefix separation so litellm_startup probes each family independently.
- Clear remediation lines for a missing key (401) or a non-tool-capable cloud model at boot.
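As a rough sketch of the pieces such a probe needs, the snippet below normalizes the base URL to the native API root, builds a Bearer-authenticated POST /api/show request, and classifies the result. It is not the probe_cloud() from this PR: the function names, the remediation strings, and the reliance on a "capabilities" list in the /api/show response are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping, Tuple


def native_api_root(base_url: str) -> str:
    """Strip a trailing OpenAI-compat /v1 segment to get the native API root."""
    root = base_url.strip().rstrip("/")
    if root.endswith("/v1"):
        root = root[: -len("/v1")]
    return root


def build_show_request(
    base_url: str, model: str, api_key: str
) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Return (url, headers, json_body) for a POST to the native /api/show."""
    url = native_api_root(base_url) + "/api/show"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers, {"model": model}


def diagnose(status: int, payload: Mapping[str, Any]) -> str:
    """Map an HTTP status and parsed JSON body to a short diagnostic line."""
    if status == 401:
        return "unauthorized: set OLLAMA_CLOUD_API_KEY (or OLLAMA_API_KEY)"
    if status == 404:
        return "not-found: check OLLAMA_CLOUD_MODEL"
    if status != 200:
        return f"unexpected: HTTP {status}"
    # Assumption: tool-capable models list "tools" under "capabilities".
    if "tools" not in (payload.get("capabilities") or []):
        return "no-tools: model does not advertise tool calling"
    return "ok"
```

Separating request construction from response classification means both halves can be unit tested without network access; the HTTP call itself can use any client.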

Out of scope (flagged, not changed)

The default cloud model _OLLAMA_CLOUD_DEFAULT_MODEL = "llama3.2" (llm.py) isn't a real Ollama Cloud model — a user who sets base/key but omits OLLAMA_CLOUD_MODEL will 404. The new probe now surfaces this, but the default itself is left unchanged pending a maintainer decision.

Immediate workaround (for affected users, no rebuild)

Edit ~/.decepticon/.env:

OLLAMA_API_KEY=<same value as OLLAMA_CLOUD_API_KEY>
OLLAMA_CLOUD_API_BASE=https://ollama.com/v1
OLLAMA_CLOUD_MODEL=qwen3-coder:480b   # a real tool-capable cloud model, not the llama3.2 default

then decepticon restart.
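Before restarting, the edited file can be sanity-checked with a small script along these lines. This is only a sketch: the parsing rules (KEY=VALUE lines, inline comments introduced by " #", optional quotes) are assumptions and may not match how Decepticon actually loads the file, and the helper names are hypothetical.

```python
from typing import Dict, List

# The three variables the workaround above sets.
REQUIRED = ("OLLAMA_API_KEY", "OLLAMA_CLOUD_API_BASE", "OLLAMA_CLOUD_MODEL")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and full-line comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: " #" starts an inline comment; quotes are optional.
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def missing_keys(text: str) -> List[str]:
    """Return the required variables that are absent or empty."""
    env = parse_env(text)
    return [name for name in REQUIRED if not env.get(name)]
```

Reading the file with pathlib.Path.home() / ".decepticon" / ".env" and passing its text to missing_keys() should yield an empty list once the workaround is in place.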

Verification

- 8 new unit tests (key precedence, prefix filtering, native-base normalization, probe_cloud auth/401/capability).
- Full tests/unit/llm/ + test_auth.py suites: 234 passed, 2 skipped (Windows-only).
- ruff check clean; basedpyright 0 errors/0 warnings.

🤖 Generated with Claude Code

…al red-team references

A disciplined harvest of the provided reference repos: each was gap-checked against the existing 227-skill KB; only genuine, non-duplicative capability gaps were authored as operator-grade, ATT&CK-mapped playbooks (with authorized-use caveats). Duplicative/meta candidates were dropped, and narrower techniques deferred with rationale (see PR body).

Added (source → skill):
- PayloadsAllTheThings → exploit/web/php-type-juggling
- Strix → exploit/web/header-injection (CRLF + Host-header poisoning / password-reset / X-Forwarded), exploit/web/bfla
- hackingBuddyGPT → post-exploit/linux-privesc-enum
- 13o-bbr ML-Security → analyst/adversarial-ml-evasion, analyst/ml-model-extraction
- HexStrike-AI → reverser/ctf-triage
- PentestGPT → decepticon/pentest-task-tree (PTT reasoning)
- MCP-Kali-Server → decepticon/kali-mcp-bridge (drives the new MCP client)
- CAI (alias robotics) → iot/ros2-dds-attack, post-exploit/network-replay

Dropped as duplicative: crlf-injection (subsumed by header-injection), reva-ghidra-mcp + reva-pyghidra-scripting (reverser/ghidra already covers the Ghidra MCP bridge + scripting), autonomous-pentest-agent-ops + ai-reactive-loop (meta; overlap decepticon/orchestration).

Verified: all 11 parse via both frontmatter parsers + discovered by iter_skill_records; ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…probe

The onboarding wizard and setup docs write the Ollama Cloud key as OLLAMA_CLOUD_API_KEY, but the runtime only ever read OLLAMA_API_KEY. Nothing connected the two, so a user who completed `decepticon onboard` sent an empty Bearer token to https://ollama.com/v1 and 401'd on every turn, never advancing past the Soundwave interview (reported on Discord; distinct from the " (Recommended)" suffix loop fixed in 1.1.3 / #328).

- Read OLLAMA_CLOUD_API_KEY first, fall back to OLLAMA_API_KEY (the official Ollama var) in litellm_dynamic_config, the cloud model resolver, and the cloud opt-in detector. No existing setup breaks; wizard/docs already use the _CLOUD_ form.
- Add probe_cloud(): Bearer-authenticated reachability + tool-capability check against the native /api/show endpoint (strips the OpenAI-compat /v1 path). litellm_startup now probes local and cloud separately, so a cloud-only user (empty OLLAMA_API_BASE) gets a real diagnostic instead of the silent "OLLAMA_API_BASE is empty" short-circuit. Missing key or non-tool-capable cloud model prints clear remediation at boot.

Tests: +8 (key precedence, prefix filtering, native-base, probe_cloud auth/401/capability). Full llm + auth suites green; ruff + basedpyright clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

VoidChecksum and others added 2 commits May 30, 2026 03:29

VoidChecksum requested a review from PurpleCHOIms as a code owner May 30, 2026 15:15

VoidChecksum mentioned this pull request May 30, 2026

Integrate all 80 open PRs (#352-#441): conflict-free, fully tested #442

Open

VoidChecksum wants to merge 2 commits into main from fix/ollama-cloud-api-key
