fix(llm): authenticate Ollama Cloud via OLLAMA_CLOUD_API_KEY + add cloud probe #441
Open
VoidChecksum wants to merge 2 commits into
…al red-team references
A disciplined harvest of the provided reference repos: each was gap-checked
against the existing 227-skill KB; only genuine, non-duplicative capability
gaps were authored as operator-grade, ATT&CK-mapped playbooks (with
authorized-use caveats). Duplicative/meta candidates were dropped, and
narrower techniques deferred with rationale (see PR body).
Added (source → skill):
- PayloadsAllTheThings → exploit/web/php-type-juggling
- Strix → exploit/web/header-injection (CRLF + Host-header
poisoning / password-reset / X-Forwarded), exploit/web/bfla
- hackingBuddyGPT → post-exploit/linux-privesc-enum
- 13o-bbr ML-Security → analyst/adversarial-ml-evasion, analyst/ml-model-extraction
- HexStrike-AI → reverser/ctf-triage
- PentestGPT → decepticon/pentest-task-tree (PTT reasoning)
- MCP-Kali-Server → decepticon/kali-mcp-bridge (drives the new MCP client)
- CAI (alias robotics)→ iot/ros2-dds-attack, post-exploit/network-replay
Dropped as duplicative: crlf-injection (subsumed by header-injection),
reva-ghidra-mcp + reva-pyghidra-scripting (reverser/ghidra already covers the
Ghidra MCP bridge + scripting), autonomous-pentest-agent-ops + ai-reactive-loop
(meta; overlap decepticon/orchestration).
Verified: all 11 parse via both frontmatter parsers + discovered by
iter_skill_records; ruff clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…probe

The onboarding wizard and setup docs write the Ollama Cloud key as OLLAMA_CLOUD_API_KEY, but the runtime only ever read OLLAMA_API_KEY. Nothing connected the two, so a user who completed `decepticon onboard` sent an empty Bearer token to https://ollama.com/v1 and 401'd on every turn, never advancing past the Soundwave interview (reported on Discord; distinct from the " (Recommended)" suffix loop fixed in 1.1.3 / #328).

- Read OLLAMA_CLOUD_API_KEY first, fall back to OLLAMA_API_KEY (the official Ollama var) in litellm_dynamic_config, the cloud model resolver, and the cloud opt-in detector. No existing setup breaks; wizard/docs already use the _CLOUD_ form.
- Add probe_cloud(): Bearer-authenticated reachability + tool-capability check against the native /api/show endpoint (strips the OpenAI-compat /v1 path). litellm_startup now probes local and cloud separately, so a cloud-only user (empty OLLAMA_API_BASE) gets a real diagnostic instead of the silent "OLLAMA_API_BASE is empty" short-circuit. Missing key or non-tool-capable cloud model prints clear remediation at boot.

Tests: +8 (key precedence, prefix filtering, native-base, probe_cloud auth/401/capability). Full llm + auth suites green; ruff + basedpyright clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
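The key precedence described in this commit can be sketched as follows. This is a minimal illustration, not the code in the PR: the helper name `resolve_ollama_cloud_key` is hypothetical, and treating a blank value as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Lookup order from the commit message: the wizard/docs name first,
# then the official Ollama variable as a fallback.
_KEY_VARS = ("OLLAMA_CLOUD_API_KEY", "OLLAMA_API_KEY")


def resolve_ollama_cloud_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-blank key, or "" when neither variable is set.

    Hypothetical helper; `env` defaults to os.environ and is injectable
    so the precedence can be exercised without touching the real environment.
    """
    source = os.environ if env is None else env
    for name in _KEY_VARS:
        value = source.get(name, "").strip()
        if value:  # assumption: whitespace-only values count as unset
            return value
    return ""
```

Centralising the lookup in one place is one way to keep the three call sites (Bearer wiring, model resolver, opt-in detector) from drifting apart again.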
Problem
A Discord user on Ollama Cloud (WSL2) is stuck in the Soundwave engagement-planning interview loop on 1.1.3, even after reinstalling. The 1.1.3 changelog fix for the "Soundwave interview loop" only addressed the `" (Recommended)"` suffix leak (#328), which is a different root cause.

The actual bug for Ollama Cloud is an environment-variable name mismatch:
- The onboarding wizard (`clients/launcher/cmd/onboard.go`) and setup docs (`docs/setup-guide.md`) write the key as `OLLAMA_CLOUD_API_KEY`.
- The runtime reads only `OLLAMA_API_KEY`, in `config/litellm_dynamic_config.py` and the cloud model resolver.

Nothing connected the two (grep confirms `OLLAMA_CLOUD_API_KEY` was only ever written, never read). So a user who completes `decepticon onboard` for Ollama Cloud sends an empty Bearer token to `https://ollama.com/v1` and gets a 401 on every turn. The agent never advances past the Soundwave interview, which is the reported "stuck in the loop."

Fix
1. Key mismatch (the loop): read `OLLAMA_CLOUD_API_KEY` first, fall back to `OLLAMA_API_KEY` (the official Ollama convention), in:
- `config/litellm_dynamic_config.py` (Bearer auth wiring)
- `packages/decepticon-core/.../types/llm.py` (`_resolve_ollama_cloud_model`)
- `packages/decepticon/.../llm/factory.py` (`_ollama_cloud_configured` opt-in, now consistent with the resolver, which already treated the key as an opt-in signal)

No existing setup breaks; the wizard/docs already use the `_CLOUD_` form, which now works.

2. Cloud tool-capability probe: the startup probe (`config/ollama_probe.py` / `litellm_startup.py`) only ever ran against local Ollama (`OLLAMA_API_BASE`, `host.docker.internal`). A cloud-only user (empty local base) short-circuited on "OLLAMA_API_BASE is empty" and got no cloud diagnostic. Added:
- `probe_cloud()`: Bearer-authenticated reachability + `/api/show` capability check, targeting the native API root (strips the OpenAI-compat `/v1` path).
- `litellm_startup` probes each family independently.

Out of scope (flagged, not changed)
The default cloud model `_OLLAMA_CLOUD_DEFAULT_MODEL = "llama3.2"` (`llm.py`) isn't a real Ollama Cloud model, so a user who sets base/key but omits `OLLAMA_CLOUD_MODEL` will 404. The new probe now surfaces this, but the default itself is left unchanged pending a maintainer decision.

Immediate workaround (for affected users, no rebuild)
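Edit `~/.decepticon/.env` so that the Ollama Cloud key is also set under `OLLAMA_API_KEY`, the name the 1.1.3 runtime reads, then run `decepticon restart`.

Verification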
- 8 new tests (key precedence, prefix filtering, native-base, `probe_cloud` auth/401/capability).
- Full `tests/unit/llm/` + `test_auth.py` suites: 234 passed, 2 skipped (Windows-only).
- `ruff check` clean; `basedpyright` 0 errors/0 warnings.

🤖 Generated with Claude Code
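For reviewers, the behaviour described for `probe_cloud()` under Fix can be sketched roughly as below. This is an illustration under stated assumptions rather than the PR's implementation: `native_base`, `probe_cloud_sketch`, and the injectable `post` parameter are hypothetical names, the `{"model": ...}` request body and a `capabilities` array containing `"tools"` are assumed to match Ollama's native `/api/show` response, and the remediation strings are invented.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Tuple

# (url, headers, body) -> (status, response text); injectable so tests skip the network.
PostFn = Callable[[str, dict, bytes], Tuple[int, str]]


def native_base(api_base: str) -> str:
    """Strip a trailing OpenAI-compat /v1 segment to reach the native API root."""
    base = api_base.rstrip("/")
    return base[: -len("/v1")] if base.endswith("/v1") else base


def _http_post(url: str, headers: dict, body: bytes) -> Tuple[int, str]:
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as exc:  # non-2xx responses arrive as exceptions
        return exc.code, exc.read().decode("utf-8", "replace")


def probe_cloud_sketch(api_base: str, api_key: str, model: str,
                       post: PostFn = _http_post) -> Tuple[bool, str]:
    """Return (ok, message) for a Bearer-authenticated /api/show check."""
    if not api_key:
        return False, "no API key: set OLLAMA_CLOUD_API_KEY (or OLLAMA_API_KEY)"
    url = native_base(api_base) + "/api/show"
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    try:
        status, text = post(url, headers, json.dumps({"model": model}).encode())
    except OSError as exc:  # URLError and socket timeouts subclass OSError
        return False, f"unreachable: {exc}"
    if status == 401:
        return False, "401 Unauthorized: the key was rejected"
    if status == 404:
        return False, f"model {model!r} not found"
    if status != 200:
        return False, f"unexpected HTTP {status}"
    try:
        capabilities = json.loads(text).get("capabilities", [])
    except (ValueError, AttributeError):
        return False, "unparseable /api/show response"
    if "tools" not in capabilities:  # assumed capability name
        return False, f"model {model!r} does not advertise tool support"
    return True, "ok"
```

Passing `post` as a parameter mirrors the auth/401/capability cases listed above: each can be exercised with a stub that returns a canned status and body.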