Integrate all 80 open PRs (#352-#441): conflict-free, fully tested #442

Merged
VoidChecksum merged 186 commits into main from integration/merge-all-open-prs
Jun 1, 2026
@VoidChecksum
Collaborator

Integrates all 80 open PRs (#352-#441) into a single, conflict-free, fully-tested tree.

Each PR was individually green, but the repo ruleset uses strict_required_status_checks_policy: false, so per-PR checks cannot detect cross-PR breakage. This branch merges every PR locally, resolves conflicts, and reconciles the cross-PR interactions so the combined tree is green.

Textual conflicts resolved

Cross-PR behavior/isolation reconciliations (stale test expectations updated to match intended new source; no tests removed/gamed)

Verification (mirrors the CI Python gate)

  • ruff check . — pass
  • ruff format --check . — pass
  • basedpyright — 0 errors
  • pytest -n auto -m "not slow" -> 3510 passed, 28 skipped, 0 failed

Merging this with a merge commit makes every PR's head commit reachable from main, so all 80 PRs will be auto-marked Merged.

VoidChecksum and others added 30 commits May 29, 2026 18:33
…_UPDATE)

Two halves of the release/update lifecycle, both built to leave the proven
release pipeline and self-update flow intact.

Auto-release on merge
  - .github/workflows/auto-tag.yml: on push to main, computes the next semver
    from conventional commits since the last tag (scripts/next_version.py:
    feat->minor, fix/perf/revert->patch, !/BREAKING CHANGE->major, docs/chore/
    ci/test->no release) and pushes vX.Y.Z. That tag drives the EXISTING
    release.yml (GoReleaser draft -> image build -> verify -> publish)
    unchanged, so release atomicity is preserved — auto-tag never builds.
  - One-time PAT (RELEASE_PLEASE_TOKEN, contents:write) is required for the
    auto-tag's tag to trigger release.yml (GitHub suppresses workflow triggers
    from GITHUB_TOKEN-pushed tags). Without it the tag is still created; build
    via release-recover.yml. Documented in RELEASE.md.
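
The bump rules listed above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of scripts/next_version.py: the names `bump_for` and `next_version`, the header regex, and the choice to treat any `BREAKING CHANGE` text in a conventional commit as major are all assumptions made for the example.

```python
import re
from typing import Optional

# Conventional-commit header: type, optional "(scope)", optional "!", then ": ".
_HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?(?P<bang>!)?: ")

_PATCH_TYPES = {"fix", "perf", "revert"}
_RANK = {"none": 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(message: str) -> str:
    """Classify one commit message as "major", "minor", "patch", or "none"."""
    lines = message.splitlines()
    match = _HEADER.match(lines[0]) if lines else None
    if match is None:
        return "none"  # not a conventional commit, so it never triggers a release
    if match.group("bang") or "BREAKING CHANGE" in message:
        return "major"
    if match.group("type") == "feat":
        return "minor"
    if match.group("type") in _PATCH_TYPES:
        return "patch"
    return "none"  # docs/chore/ci/test, and lookalike types such as "feature"


def next_version(last_tag: str, messages: list[str]) -> Optional[str]:
    """Return the next vX.Y.Z tag, or None when no commit warrants a release."""
    major, minor, patch = (int(part) for part in last_tag.lstrip("v").split("."))
    bump = max((bump_for(m) for m in messages), key=_RANK.__getitem__, default="none")
    if bump == "major":
        return f"v{major + 1}.0.0"
    if bump == "minor":
        return f"v{major}.{minor + 1}.0"
    if bump == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    return None
```

Taking the highest-ranked bump across all commits since the last tag means a single `feat` outranks any number of `fix` commits, and a batch of only docs/chore commits yields no tag at all.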

Unattended auto-update
  - The launcher had a full updater (FetchLatestRelease/ApplyUpdate/SelfUpdate/
    re-exec) and a startup PromptIfUpdateAvailable, but the documented
    AUTO_UPDATE flag was wired NOWHERE. start.go now honors it:
      unset  -> prompt (today's behavior)   true -> unattended apply+restart
      false  -> skip entirely (air-gapped / pinned)
  - updater.AutoUpdateIfAvailable + shared applyAndReexec/resolveUpdateRef
    (refactored out of PromptIfUpdateAvailable so both paths are identical).

Tests: scripts/next_version.py (9 cases incl. lookalike-type guards) and the
Go dev-build skip gate. actionlint clean across all workflows.
…tag)

release-recover.yml only verifies already-built images and finalizes the
release; it cannot build one. When RELEASE_PLEASE_TOKEN is absent, a
GITHUB_TOKEN-pushed tag never triggers release.yml, so no images exist and
release-recover fails at its verify step. Document the real recovery:
re-push the tag manually (delete remote + push) to trigger release.yml.
…dening

Wires modules that were built+tested but never registered, then fixes 15
defects found in a focused security review of the result + adjacent code.

P0 wiring
- EventLogMiddleware: emit engagement events to <workspace>/events.jsonl
  (tool/llm/finding), registered as a slot; adds EventLog.for_workspace.
- Register budget, prompt-injection-shield, HITL, and event-log as
  middleware slots (every role; HITL opt-in via DECEPTICON_HITL__ENABLED).
- HITL web bridge: GET/POST /api/engagements/[id]/approvals + ApprovalGate
  UI (polling), reading/writing the SDK's FileBackedApprovalTransport JSONL.
- CART live replay: SubAgentTaskSpec contract + make_replay_dispatcher;
  ReplayRunner.execute now dispatches real specs (was a 'live_unwired' stub).
- Skillogy cutover hook in build_middleware (no-op unless DECEPTICON_USE_SKILLOGY).
- PromptInjectionShield: registry-driven trusted-tool lookup (drops the
  load_skill/list_skills hardcode) + dedup vs UntrustedOutput (no double-wrap).

Security / correctness fixes
- event log no longer persists tool-arg VALUE contents (was leaking bash
  commands, Authorization/Cookie headers, session tokens to events.jsonl);
  finding.created emitted only on a successful tool result, after tool.result.
- web: close authenticated path traversal via engagement name — PATCH now
  slug-validates name (root cause); approvals route re-validates (defense).
- config/oauth_token_store: unique-temp (mkstemp) atomic write — fixes a
  concurrent-refresh race that corrupted/lost tokens.
- config/codex_chatgpt_handler: stop embedding raw token data in refresh errors.
- config/copilot_handler + grok_handler: preserve streamed tool_calls and
  upstream finish_reason (were dropping every streamed tool call).
- config/gemini_handler: map candidates[].finishReason (was hardcoded 'stop').
- llm/factory: detect env-backed OAuth credentials as configured.
- PROMPT_INJECTION_SHIELD added to SAFETY_CRITICAL_SLOTS (gate disable/replace).
- HITL: resolve approval transport per-request from workspace_path (was frozen
  at graph import) and offload the blocking wait via asyncio.to_thread.
- CART make_replay_dispatcher uses ReplayMiddleware(strict=False) (partial
  replay; misses fall through to live) instead of strict default.
- skillogy swap-only (never resurrect an intentionally-disabled SKILLS slot).
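
The unique-temp atomic write mentioned for config/oauth_token_store can be sketched as follows. This is a generic illustration of the mkstemp-plus-rename pattern, not the store's actual code; `atomic_write_json` is a hypothetical helper name.

```python
import contextlib
import json
import os
import tempfile


def atomic_write_json(path: str, payload: dict) -> None:
    """Write payload to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp gives each writer its own temp file, so two concurrent
    # refreshes cannot clobber a shared "<path>.tmp".
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace is atomic when source and destination share a filesystem,
        # which is why the temp file is created next to the target.
        os.replace(tmp_path, path)
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

With a fixed temp name, two refreshes racing each other can interleave writes into the same temp file; a per-writer temp name means the last `os.replace` wins with a complete payload.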

Verification: py_compile (all changed .py), a functional harness exercising the
real logic with heavy deps stubbed, and Bun for the web slug test; unit tests
added/updated. The project's full pytest/tsc/eslint/pre-commit were not run in
this environment (no installed deps) and must run in CI.
…iring branch

The Python CI job failed at the `ruff check` step, which masked two
further breakages that the job never reached (basedpyright/pytest run
after lint).

- ruff: 7 lint errors (import ordering, missing EOF newline) plus 10
  files needing `ruff format`. All autofixed; no logic changes.
- test_prompt_injection_shield: three tests asserted the shield wraps
  `bash` output, but this branch adds the anti-double-wrap dedup that
  routes `bash`/`read_file`/`kg_*` to UntrustedOutputMiddleware. The
  shield correctly skips them now, so the tests point at a genuinely
  shield-owned external tool (`http_fetch`) instead. The dedup itself is
  already covered by test_shield_skips_untrusted_output_tools_no_double_wrap.
- test_build._build_exploit_stack: the exploit role includes the
  SANDBOX_NOTIFICATION slot, whose factory now requires a non-None
  `sandbox` kwarg (it forwards the real HTTPSandbox the agent builds).
  Default a mock in the helper so the stack assembles.

Local gate green: ruff check + ruff format --check clean, basedpyright 0
errors, pytest -n auto -m "not slow" -> 1781 passed, 26 skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`_build_engagement_injection` accepted a `workspace` argument but
hardcoded `/workspace` in three places: the "Workspace root" line, the
"Treat … as the only engagement directory" line, and the
`/workspace/plan/` guidance. Multi-tenant / SaaS launchers mount each
engagement under a distinct root (via `config.configurable.workspace_path`,
resolved by `_resolve_workspace_path`), so the agent was pointed at the
wrong directory for any non-default workspace.

Template all three occurrences with the passed root (trailing slash
trimmed so `{root}/plan/` never doubles up). Default single-tenant
behavior is unchanged — `workspace_path` still defaults to `/workspace`.

Adds a regression test asserting a custom `workspace_path` round-trips
into the injection text and that the stale `/workspace` default does not
leak.
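
A minimal sketch of the templating described above, assuming a hypothetical helper `render_workspace_guidance`; the wording of the three lines is invented for illustration and is not the injection text itself.

```python
def render_workspace_guidance(workspace: str = "/workspace") -> str:
    """Render the three workspace-dependent lines from the passed root."""
    # Trim the trailing slash so "{root}/plan/" never becomes "//plan/".
    root = workspace.rstrip("/")
    return "\n".join(
        [
            f"Workspace root: {root}",
            f"Treat {root} as the only engagement directory.",
            f"Write planning documents under {root}/plan/.",
        ]
    )
```

Calling `render_workspace_guidance("/tenants/acme/")` yields text that names `/tenants/acme` throughout, while the no-argument call keeps the single-tenant `/workspace` default.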

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two unreachable-code defects surfaced by a code audit; no behavior change:

- tools/reversing/tools.py: `bin_r2_script` had a duplicate, unreachable
  second `return _json({"source": r2_recon_script(binary)})` (copy-paste).
  Removed.
- blue_cell/rule_match.py: in `_evaluate_condition`, the
  `elif token in {"(", ")"}` branch is unreachable — the preceding
  `if token in {"and","or","not","(",")"}` already matches "(" and ")".
  Removed; parenthesis handling is unchanged (the first branch appends
  them and they reach the whitelisted eval as before).

Verified: ruff clean, basedpyright 0 errors, blue_cell + reversing unit
tests pass (41), full fast-lane pytest green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gnostic + add tests

`deriveSubAgentSessions` (shared by the Web + CLI clients) determined a
finished session's terminal status by reading an *undeclared* `status`
field off the event via a cast. That works for the CLI, which normalizes
the backend's `error` boolean into `status: "error" | "success"` before
events reach the shared utilities — but a consumer that forwards the raw
backend event (whose contract is `error: boolean`, per
`SubagentCustomEvent`) would have its errored sessions silently rendered
as "completed".

- Model both signals on the shared `StreamEvent` type (`status?`,
  `error?`), removing the unsafe cast.
- Detect failure from either `status === "error"` or `error === true`, so
  the result is correct regardless of which client shape feeds it.
- Add the first unit tests for the (previously untested) shared session
  derivation: error via both shapes, running vs completed, tool counting,
  orphan-end handling, interleaved subagents, default description.

No behavior change for the CLI (its events already carry the normalized
`status`). Verified: cli typecheck clean, vitest 29 passed (8 new), web
eslint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ublic surface

`decepticon_core.types` re-exported `engagement`/`llm`/`kg` but not
`roe`, even though `types/roe.py` is a fourth submodule of the contract
layer and is consumed across the package boundary (the framework's
RoE-enforcement middleware imports `decepticon_core.types.roe`). The
package docstring also claimed "Three submodules".

- Re-export `roe` alongside its siblings (import + `__all__`); fix the
  docstring to describe all four and disambiguate the enforcement schema
  from `engagement.RoE` (the planning document).
- Lock roe's six public symbols (EnforcementMode, ScopeRule,
  MachineEnforcement, Decision, evaluate_target, evaluate_command) in the
  public-API stability manifest and bump its count 69 -> 75, per the
  manifest's documented update process.

CHANGELOG is release-curated (no Unreleased section), so it's left to the
release flow. Verified: ruff clean, basedpyright 0 errors,
decepticon-core suite green (88 passed) incl. the langchain-free guarantee.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…vent loop)

`http_request` was a *sync* `@tool` that drove its async `HTTPSession`
via `asyncio.get_event_loop().run_until_complete(_do())`. Inside
LangGraph's running event loop that raises `RuntimeError: ... cannot be
called from a running event loop`, so the tool would fail the moment a
web agent is wired to it. It is latent today only because no standard
agent imports `WEB_TOOLS`. Every other network tool (`bash_*`) is already
an async `@tool`.

Convert `http_request` to `async def` and `await` the session directly,
dropping the `_do` wrapper, the `run_until_complete` call, and the local
`asyncio` import. Behavior is otherwise identical.
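
The failure mode and the fix can be reproduced with the standard library alone. This is a self-contained sketch, not the tool's code: `_fetch` stands in for the async `HTTPSession` call, and `sync_tool` / `async_tool` are hypothetical names for the old and new shapes.

```python
import asyncio


async def _fetch() -> str:
    """Stand-in for the async HTTP session call."""
    await asyncio.sleep(0)
    return "ok"


def sync_tool() -> str:
    """Old shape: a sync wrapper that drives the coroutine on the current loop."""
    coro = _fetch()
    try:
        return asyncio.get_event_loop().run_until_complete(coro)
    except RuntimeError:
        coro.close()  # avoid a "never awaited" warning in this demo
        raise


async def async_tool() -> str:
    """New shape: an async function that simply awaits the session."""
    return await _fetch()


async def demo() -> tuple[bool, str]:
    """Call both shapes from inside a running event loop."""
    try:
        sync_tool()
        sync_failed = False
    except RuntimeError:
        sync_failed = True
    return sync_failed, await async_tool()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Inside `asyncio.run`, the sync wrapper raises `RuntimeError` because the loop is already running, while the async variant returns normally.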

Adds the first tests for the web tool layer: invoking the tool from
*within* a running loop (the exact failure condition) with a stubbed
session, plus the invalid-headers path.

Verified: ruff clean, basedpyright 0 errors, new tests pass (2).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds a self-contained, additive MCP (Model Context Protocol) client so operators can connect external MCP tool servers (Kali MCP, HexStrike, etc.) and expose their tools to Decepticon agents.

- Config via DECEPTICON_MCP__SERVERS (JSON), matching the existing DECEPTICON_<SECTION>__<KEY> convention.

- langchain-mcp-adapters imported lazily inside load_mcp_tools(); absent package => one-line warning + empty list (agents keep working).

- Per-server isolation: one bad endpoint logs+skips, others still load.

- No dependency/lockfile changes (optional package documented as manual install; locked extra to follow). 17 mocked tests, no network.
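
The three behaviors in this list (JSON config, lazy optional import, per-server isolation) can be sketched as below. Everything here is an assumption for illustration: the helper names, the `name -> config` shape of the JSON, and the `connect` callable standing in for whatever the adapter package provides. It does not call the real langchain-mcp-adapters API.

```python
import importlib
import json
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def parse_servers(raw: str) -> dict:
    """Parse a DECEPTICON_MCP__SERVERS-style JSON value (assumed: name -> config)."""
    if not raw.strip():
        return {}
    parsed = json.loads(raw)
    return parsed if isinstance(parsed, dict) else {}


def import_optional(module_name: str) -> Optional[Any]:
    """Import lazily; a missing package is a one-line warning, not a crash."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        logger.warning("%s is not installed; MCP tools disabled", module_name)
        return None


def load_tools(servers: dict, connect: Callable[[dict], list]) -> list:
    """Collect tools from every server, skipping any endpoint that fails."""
    tools: list = []
    for name, config in servers.items():
        try:
            tools.extend(connect(config))
        except Exception as exc:  # one bad endpoint must not block the rest
            logger.warning("skipping MCP server %s: %s", name, exc)
    return tools
```

Keeping the import inside a function means agents that never configure MCP servers pay no import cost and need no extra dependency.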
…reds

Covers analyze_gpo_abuse, analyze_delegation, analyze_shadow_credentials
and all pure helpers (_is_sensitive_ou, _is_dc, _spn_targets_dc) with
happy paths, edge cases, and dangling-node robustness tests (66 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rs and tool builder

59 tests covering _strip_frontmatter, _read_via_backend, _list_dir_via_backend,
_validate_skill_path, _format_skill_body, and build_load_skill_tool end-to-end
using a fake backend — no external services required.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ilent dead port

`build_grpc_server` constructed a `grpc.Server`, bound `0.0.0.0:<port>`,
and returned an UNREGISTERED servicer — it never called
`add_*Servicer_to_server` (there are no protoc-generated bindings for
`skillogy.proto`) and the hand-rolled servicer returned plain Python
objects with no message serializers. So when grpcio was installed the
launcher opened a port that accepted connections but answered every RPC
(ListSkills / LoadSkill / IngestSkill / Health) with `UNIMPLEMENTED` — a
silent dead port that *looks* healthy. REST is the only wired transport
(`RestSkillogyClient` → :9100; the skillogy middleware connects over REST).

Make `build_grpc_server` raise `RuntimeError` with a clear message.
`__main__._start_grpc` already catches `RuntimeError` and degrades to
REST, so the service boots correctly with no dead port. Removed the dead
servicer + `_RawResponse`/`_HealthResp` scaffolding and the now-unused
`SkillMeta` import; updated the module docstring.

Adds tests: `build_grpc_server` raises (message names gRPC + REST), and
`_start_grpc` returns None (graceful REST degrade).

Verified: ruff clean, basedpyright 0 errors, skillogy suite 36 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d) and BlueCellTap transforms

Covers JSON output shapes and error paths for solidity_scan, solidity_scan_file,
slither_ingest, foundry_* wrappers, export_session_asciicast, list_session_recordings,
iam_policy_audit, s3_buckets_from_text, user_data_secrets, k8s_audit, tfstate_audit,
metadata_endpoints, plus _strip_ansi, _parse_line_to_event, TapEvent.to_dict, and
BlueCellTap.read_batch/follow. All IO mocked; passes fully offline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds test_defense_push.py covering HTTP push functions (mocked requests)
for splunk, elastic, sentinel, and EDR (Defender XDR + CrowdStrike), plus
converter edge cases, severity/logsource mappings, and ConOps helper paths
not covered by the existing test_defense.py — 104 tests total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ing internals

Adds 130 new tests covering binary.py, strings.py, symbols.py, rop.py,
packer.py, and scripts.py — ELF32/big-endian/Mach-O/WASM formats, all
four RET opcodes, every packer signature, crypto/email/version/import
string categories, UTF-16LE extraction, symbol buckets and risk scoring,
and to_dict serialization for every dataclass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ty modules

Adds 111 new tests across three files:
- test_chain.py: full coverage of chain.py (ChainStep, Chain dataclass,
  compute_edge_cost, plan_chains, promote_chain, critical_path_score,
  impact_analysis, unexplored_surface, credential_reachability) — all
  Neo4j calls mocked via _FakeStore.
- test_cve_extended.py: _Cache (TTL, LRU eviction, persistence, flush),
  _rehydrate, async lookup_cve/lookup_cves/lookup_package with mocked httpx,
  plus NVD/EPSS parser edge cases.
- test_bounty_extended.py: bounty_scope_check (domain wildcards, exclusions,
  normalization) and format_bounty_report (validated/unvalidated findings,
  severity labels, title stripping) with mocked KG store and filesystem.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…and executive.py helpers

63 deterministic tests covering all LangChain @tool wrappers (report_hackerone,
report_bugcrowd_csv, report_executive, report_timeline) with mocked _load, plus
deep edge-case coverage of _count_by_severity, _top_chains, _top_cves, and
render_executive_summary (validated-findings cap, severity ordering, default
fallbacks, graph stats, empty-graph branches). All pass offline with no services.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Covers _default_cache_root env-var override, ReferenceCache.to_dict,
_entry, _dir_size (OSError paths), _run_git hardened-env contract,
ensure_cached clone/pull/URL-mismatch/symlink/non-git-hint paths,
_which, _parse_grep_line edge cases, _pyfind pure-Python fallback,
and search_cache grep/timeout/FileNotFoundError/max_results paths.
All network and subprocess calls mocked; 55 pass, 2 platform-skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add packages/decepticon/tests/unit/cli/test_scan_extra.py covering
_git_diff_files (subprocess success/failure/timeout/OSError), _emit_jsonl_event,
_dispatch_scan_via_sdk (missing SDK, happy-path streaming, asyncio timeout),
_load_findings_graph (missing file, env-var workspace, load exception), and
main() end-to-end paths (no-target, instruction-file error, diff-scope fallback,
SDK RuntimeError/generic exception, happy-path, SARIF output, engagement-name/timeout
overrides, non-interactive forwarding).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…y ad wrappers

Covers ImportStats, _node_kind_for_bh, _BH_EDGE_MAP, _upsert_bh_object,
_build_bh_index, _ingest_aces, _ingest_memberships, merge_bloodhound_json
(dict/str/list/error paths), ingest_bloodhound_zip, and the bh_ingest_zip /
bh_ingest_json @tool wrappers (JSON output shapes + error paths). All 81
tests are offline and deterministic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…G CRUD, tier-2 ingesters, and fuzz tools

Adds test_tools_extended.py with comprehensive coverage of uncovered
surfaces in tools/research/tools.py: _parse_props, _severity_from_score/
string, _is_web_port, _jwt/_cookie_finding_severity, all four dependency-
file parsers, kg_add_node/edge/query/neighbors/stats, kg_ingest_subfinder/
dnsx/katana/masscan/ffuf/testssl/crackmapexec/asrep_hashes, fuzz_harness,
fuzz_record_crash, and error paths throughout. All mocked offline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…task-tree formatter

tools/opplan.py had zero direct unit coverage despite holding the OPPLAN
status state machine and the agent-facing renderer. Adds 23 tests for:

- _is_valid_transition / _valid_next / _VALID_TRANSITIONS: the full
  pending -> in-progress -> completed/blocked/cancelled matrix, terminal
  states, unknown sources, no self-transitions, sorted next-state hints,
  and a guard that the table only references real ObjectiveStatus values.
- _build_opplan_payload: per-status summary counts, id-stable ordering
  (clean git diffs), zeroed empty-plan summary, and round-trip back into
  the OPPLAN model (wrapper fields dropped).
- _format_opplan_for_agent: header/progress, priority-sorted table,
  blocked_by joining + owner fallback, hierarchical task tree, status
  markers, the lowest-priority 'Next' recommendation, all-complete and
  no-actionable branches, and a cycle-guard regression test proving a
  malformed/injected duplicate-id tree cannot drive unbounded recursion.

Pure-logic only; no network/docker/LLM.
…hecks

A trailing dot is DNS-equivalent ("metadata.google.internal." resolves
identically to "metadata.google.internal"), but `_matches_rule` compared
hosts with only `.lower()` — no trailing-dot normalization. In ENFORCE
mode this let the FQDN form slip past the forbidden-destination guard AND
any operator `out_of_scope` host rule:

    evaluate_target("metadata.google.internal.", enforce) -> allow=True
    evaluate_target("169.254.169.254.", enforce)          -> allow=True

i.e. the agent could reach the cloud metadata service and exfiltrate
service-account credentials despite the built-in IMDS deny list, with the
audit ledger recording the call as "allow". The IMDS-IP form additionally
failed `ip_address()` parsing and fell through to default-allow.

Strip a trailing dot (keeping the existing case-fold) on BOTH the rule
pattern and the target in `_matches_rule`, across host, domain-glob, and
CIDR matching — so forbidden_destinations, out_of_scope, and in_scope are
all normalized consistently. Legitimate in-scope FQDN-form hosts still
match; the raw target is preserved in the decision detail for the audit.
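
The normalization can be sketched as follows. This is an illustrative stand-in, not `_matches_rule` itself: the function name `matches_rule` and the way rule kinds are detected (`/` for CIDR, `*` for a domain glob) are assumptions.

```python
import fnmatch
import ipaddress


def _normalize(host: str) -> str:
    """Case-fold and drop one trailing dot (the DNS root label)."""
    host = host.strip().lower()
    return host[:-1] if host.endswith(".") else host


def matches_rule(pattern: str, target: str) -> bool:
    """Match target against a host, domain-glob, or CIDR pattern."""
    pattern, target = _normalize(pattern), _normalize(target)
    if "/" in pattern:  # CIDR rule
        try:
            network = ipaddress.ip_network(pattern, strict=False)
            return ipaddress.ip_address(target) in network
        except ValueError:
            return False  # target (or pattern) is not a valid IP form
    if "*" in pattern:  # domain glob
        return fnmatch.fnmatchcase(target, pattern)
    return pattern == target
```

Because the dot is stripped before `ip_address()` parsing, `169.254.169.254.` is evaluated as an IP instead of falling through as an unparseable string.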

Adds regression tests (TestFqdnTrailingDotNormalization) for the
forbidden-destination, out-of-scope, and in-scope paths in both host and
IMDS-IP forms.

Verified: ruff clean, basedpyright 0 errors, test_roe + decepticon-core
suite 130 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`SandboxBase._normalize_workspace_path` validated each path component
against `[A-Za-z0-9_.-]{1,128}`, which accepts "." and "..". So
`_normalize_workspace_path("/workspace/../../etc")` returned the string
verbatim, and that traversable path flowed into the sandbox tmux/file
operations meant to be confined to `/workspace/<engagement>` — escaping
the per-engagement subtree. The sibling EngagementFilesystem layer
already guards this (`middleware/filesystem.py` uses `posixpath.normpath`
plus a documented test), but the sandbox_kernel + bash callers invoke
`_normalize_workspace_path` directly with no such guard.

Add the same fail-closed guard: if `posixpath.normpath(path) != path`,
return the safe `/workspace` default. Catches "..", ".", and "//"
traversal while preserving legitimate (incl. dotted) directory names.
posixpath (not os.path) so Windows hosts don't get "/"->"\\" rewrites.
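
A sketch of the fail-closed guard, reusing the component pattern quoted above. The function name and the exact order of checks are assumptions; the real method may apply further constraints not described here.

```python
import posixpath
import re

_COMPONENT = re.compile(r"[A-Za-z0-9_.-]{1,128}")
SAFE_DEFAULT = "/workspace"


def normalize_workspace_path(path: str) -> str:
    """Return path unchanged if it is already canonical, else the safe default."""
    # normpath collapses "..", "." and repeated slashes; any difference from
    # the input means the caller passed a non-canonical (traversable) path.
    if posixpath.normpath(path) != path:
        return SAFE_DEFAULT
    parts = [part for part in path.split("/") if part]
    if not path.startswith("/") or not all(_COMPONENT.fullmatch(p) for p in parts):
        return SAFE_DEFAULT
    return path
```

Comparing against the normalized form rejects traversal without banning dots outright, so a directory such as `/workspace/v1.2-final` still passes.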

Adds tests/unit/sandbox_kernel/test_workspace_path.py: legit paths
preserved + traversal/escape forms fail closed.

Verified: ruff clean, basedpyright 0 errors, sandbox_kernel suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ion eval, loaders

test_rule_match.py exercised happy-path matching but left the
silent-failure-prone internals at 77% coverage. Adds 27 tests lifting
rule_match.py coverage to 99%:

- _event_field: nested dotted paths, missing keys, non-dict mid-path,
  None and list values (a wrong return makes a rule silently never fire).
- _compile_pattern: literal escaping vs re: regex mode, case-insensitivity.
- _evaluate_condition: empty-condition all()/no-selection branches,
  unknown selection names, and malformed expressions — proving the
  sandboxed eval path fails closed (False) rather than raising.
- load_rules / _load_from_jsonl / _load_from_json / _rule_from_dict:
  JSONL + directory loading, blank/malformed/no-id line skipping,
  list/single/scalar JSON shapes, match-shorthand, missing-file and
  nonexistent-path non-fatal handling — the untrusted-rule-file parser.

Pure-logic + tmp_path only; no network/docker/LLM.
`_refresh_tokens` raised `litellm.AuthenticationError` with the raw token
endpoint response interpolated into the message:

    message=f"Codex ChatGPT refresh response missing fields: {data}"

On a partial-but-successful refresh (e.g. `access_token` present but
`id_token` missing) `data` carries the freshly-minted access token and
the rotated refresh token verbatim — which then land in logs and the
caller-visible error. Interpolate only the (non-sensitive) field NAMES
that were present, never their values.
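
One way to build such a message is shown below. It is a sketch only: `missing_fields_message` is a hypothetical helper, and the set of required fields is an assumption based on the fields named above.

```python
REQUIRED_FIELDS = ("access_token", "refresh_token", "id_token")  # assumed set


def missing_fields_message(data: dict) -> str:
    """Describe a bad refresh response using field names only, never values."""
    present = sorted(str(key) for key in data)
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    return (
        "Codex ChatGPT refresh response missing fields: "
        f"missing={missing}, present={present}"
    )
```

The message still tells an operator which field was absent, which is the only diagnostic value the raw payload offered.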

The handler runs inside the litellm container (litellm is not a dev/test
dependency), so it has no standard-suite unit test; this is a one-line
redaction verified by inspection. ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…idates)

`UntrustedOutputMiddleware` wraps the output of every tool in
`UNTRUSTED_TOOL_NAMES` in an `<UNTRUSTED_TOOL_OUTPUT>` envelope with a
heuristic risk score, so attacker-influenceable bytes reach the model
marked as data (and high-risk hits are logged to the quarantine ledger).

The allowlist omitted the scanner prefilter tools: `scan_shard` walks
`/workspace/target` and returns raw code snippets, and `rank_candidates`
re-emits those hits. An injection payload planted in a scanned target
file therefore reached the scanner agent's model UNwrapped — never
enveloped, risk-scored, or recorded.

Add `scan_shard` + `rank_candidates` to `UNTRUSTED_TOOL_NAMES`, and a
regression test asserting both are enveloped with the right `origin`.
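
The effect of the allowlist can be illustrated with a toy wrapper. The envelope attributes, the marker list, and the scoring formula below are invented for the example; only the tag name and the tool names come from the description above.

```python
# Subset of tool names for illustration; the real set is larger.
UNTRUSTED_TOOL_NAMES = frozenset({"bash", "read_file", "scan_shard", "rank_candidates"})

# Hypothetical injection markers used by this toy heuristic.
_SUSPICIOUS = ("ignore previous instructions", "system prompt")


def risk_score(text: str) -> float:
    """Fraction of suspicious markers found in the text (toy heuristic)."""
    lowered = text.lower()
    hits = sum(marker in lowered for marker in _SUSPICIOUS)
    return hits / len(_SUSPICIOUS)


def wrap_tool_output(tool_name: str, output: str) -> str:
    """Envelope output from allowlisted tools; pass everything else through."""
    if tool_name not in UNTRUSTED_TOOL_NAMES:
        return output
    score = risk_score(output)
    return (
        f'<UNTRUSTED_TOOL_OUTPUT origin="{tool_name}" risk="{score:.2f}">\n'
        f"{output}\n"
        "</UNTRUSTED_TOOL_OUTPUT>"
    )
```

A tool missing from the set takes the pass-through branch, which is exactly how `scan_shard` output reached the model unmarked before this change.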

Verified: ruff clean, basedpyright 0 errors, test_untrusted_output 42 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Integration Bot added 22 commits May 30, 2026 21:40
# Conflicts:
#	config/codex_chatgpt_handler.py
# Conflicts:
#	clients/web/src/app/api/engagements/[id]/route.ts
…ng 80 PRs

- defense (#364/#416 tests vs #417 source): quote/escape SPL expectations;
  add sha256 indicator to YARA fixtures + KQL queryText assertion for the
  new Defender push behavior
- recording (#385 test vs #430 source): _tool_request uses tool_call.args
- backends (#418 test vs #400 source): assert workspace-path normalization
- llm oauth/gemini (#353/#401): force-load real oauth_token_store under the
  bare name so a sibling test's partial stub can't shadow it under -n auto
- ollama (#441): ruff format
Comment thread packages/decepticon/tests/unit/ad/test_ad.py Fixed
VoidChecksum and others added 5 commits June 1, 2026 14:28
Adds 9 credentials-aware AuthMethods for OpenAI-compatible LLM gateways /
aggregators, bringing Decepticon closer to oh-my-pi's provider breadth.
Each routes through LiteLLM's openai/ provider with a fixed api_base
override (the proven xiaomi_mimo / custom pattern), now table-driven via
OPENAI_COMPAT_GATEWAYS so the batch shares one code path:

  opencode   OpenCode Zen          https://opencode.ai/zen/v1
  vercel     Vercel AI Gateway     https://ai-gateway.vercel.sh/v1
  hf         Hugging Face Router   https://router.huggingface.co/v1
  venice     Venice AI             https://api.venice.ai/api/v1
  nanogpt    NanoGPT               https://nano-gpt.com/api/v1
  synthetic  Synthetic             https://api.synthetic.new/openai/v1
  zenmux     ZenMux                https://zenmux.ai/api/v1
  qianfan    Baidu Qianfan (ERNIE) https://qianfan.baidubce.com/v2
  cfgateway  Cloudflare AI Gateway per-account base URL

The model alias keeps the gateway prefix (opencode/claude-opus-4-6) so
two gateways exposing the same upstream slug never collide in the LiteLLM
model_list.
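
A table-driven builder along these lines might look like the sketch below. The table name `GATEWAY_BASES`, the function signature, and the `<GATEWAY>_API_KEY` environment-variable naming are assumptions for illustration; the entry shape follows LiteLLM's `model_name` / `litellm_params` config format.

```python
# Two rows copied from the table above; the real table covers all nine gateways.
GATEWAY_BASES = {
    "opencode": "https://opencode.ai/zen/v1",
    "vercel": "https://ai-gateway.vercel.sh/v1",
}


def build_model_entry(gateway: str, upstream_model: str) -> dict:
    """Build one model_list entry routed through LiteLLM's openai/ provider."""
    api_base = GATEWAY_BASES[gateway]  # KeyError for an unknown gateway
    return {
        # The alias keeps the gateway prefix so identical upstream slugs
        # served by different gateways stay distinct.
        "model_name": f"{gateway}/{upstream_model}",
        "litellm_params": {
            "model": f"openai/{upstream_model}",
            "api_base": api_base,
            # Hypothetical env-var naming; LiteLLM resolves "os.environ/<NAME>".
            "api_key": f"os.environ/{gateway.upper()}_API_KEY",
        },
    }
```

Building entries for `claude-opus-4-6` on both gateways produces two aliases, `opencode/claude-opus-4-6` and `vercel/claude-opus-4-6`, that share the same upstream `model` value but different `api_base` URLs.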

Wired end-to-end exactly like the existing providers:
- AuthMethod enum + METHOD_MODELS HIGH/MID/LOW matrix (types/llm.py)
- OPENAI_COMPAT_GATEWAYS table + build_model_entry branch
  (litellm_dynamic_config.py)
- static model_list entries + within-gateway fallback chains (litellm.yaml)
- env-var credential auto-detection, default priority, CLI labels
  (factory.py) — auth_inventory + `decepticon-cli auth` pick them up
  automatically
- .env.example keys, /model catalog (model.ts), setup-guide.md table

Base URLs + model ids verified against each provider's current public
docs and oh-my-pi's maintained provider catalog. Kimi-for-Coding was
evaluated and dropped: its coding/v1 endpoint enforces a coding-agent
client whitelist and requires the kimi-for-coding model id, so it is not
usable through a generic LiteLLM proxy (Moonshot is already covered by
moonshot_api). Cursor / GitLab Duo / Qwen Portal are deferred — they need
proprietary or OAuth-device protocols that don't fit the
OpenAI-compatible-via-LiteLLM path. Like the existing cerebras /
xiaomi_mimo additions, these are configured via .env, not the Go onboard
wizard.

Tests: gateway routing / alias-collision / validation
(test_litellm_dynamic_config.py), tier resolution + prefix invariants
(test_models.py), credential detection (test_auth.py).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ith 'import' and 'import from''

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Brings the integration tree up to date with main (v1.1.4-v1.1.6 releases:
skillogy publish fix, native arm64 runners, CHANGELOG; #443 make-as-CI-
source-of-truth; #444-#448). Conflict: ci.yml coverage line — kept main's
'make ci-test-coverage' dispatch (#443) and carried #380's 35->60 coverage
floor into the Makefile target so both intents hold.
Adds OpenCode Zen + 8 OpenAI-compatible provider gateways. Clean auto-merge
with the integrated llm factory/types/dynamic-config changes (#353, #435,
#441); reconciliation verified by the test suite.
@VoidChecksum VoidChecksum merged commit 34bd0e9 into main Jun 1, 2026
26 checks passed