fix(roe): close target-extraction scope bypasses (userinfo, IP-encoding, compound args) by VoidChecksum · Pull Request #405 · PurpleAILAB/Decepticon

VoidChecksum · 2026-05-30T11:05:45Z

What

Closes three target-extraction gaps in RoEEnforcementMiddleware that let an offensive command reach an out-of-scope or forbidden host while the RoE gate saw nothing (fail-open) or an in-scope decoy. Extraction is the sole input to the scope evaluator, so each gap is a scope bypass, not a cosmetic miss.

Bypass	Example	Before	After
URL userinfo confusion	`curl http://in-scope.acme.com@evil.com/`	extracts in-scope decoy / nothing → ALLOW	extracts `evil.com` (real host)
Encoded-IP IMDS	`curl http://2852039166/` · `http://0xa9fea9fe/` · `http://[fd00:ec2::254]/`	not matched → ALLOW	normalized to `169.254.169.254` / de-bracketed IPv6
Compound option	`curl --resolve host:80:169.254.169.254 …`	junk `host:80:169.254.169.254` target	split → real IP validated

The authority is now parsed RFC-3986-correctly (urlsplit().hostname) so userinfo + port are dropped; packed integer/hex hosts are canonicalized to dotted-quad; compound host:port:ip tokens are split.

Tests

tests/unit/middleware/test_command_targets_scope_bypass.py — 11 cases: each bypass + regressions (plain URL, explicit port, IPv4, CIDR, and a small integer that must not be mangled into an IP).

Scope / merge notes

Confined to middleware/_command_targets.py; the core matcher (decepticon_core/types/roe.py) is untouched → no collision with fix(roe): normalize FQDN trailing dot so it can't bypass scope deny checks #374 (FQDN trailing-dot).
test(middleware): cover RoE command-target extraction (_command_targets) — 88%→99% #383 adds a test_command_targets.py asserting the current extraction output. Merge it after this PR and re-baseline those assertions against the corrected host extraction.

Verification (local, Python 3.13, same uv.lock as CI)

ruff check ✅ · ruff format --check ✅
basedpyright ✅ 0 errors (152 pre-existing warnings, unchanged)
pytest -n auto -m "not slow" ✅ 1717 passed, 26 skipped

…ng, compound args) RoEEnforcementMiddleware gates offensive commands by the targets that extract_targets() returns; an empty or incomplete extraction means NO scope check at all (fail-open). Three extraction gaps let a command reach an out-of-scope or forbidden host while the gate saw nothing or an in-scope decoy: - URL userinfo confusion: the old `://([^\s/:]+)` slice captured the authority up to the first `:`/`/`, i.e. the userinfo, not the host. `curl http://in-scope.acme.com@evil.com/` extracted the in-scope decoy (or nothing) while curl connects to evil.com. The authority is now parsed RFC-3986-correctly (urlsplit .hostname), so userinfo and port are dropped and the real connect host is evaluated. - Encoded-IP IMDS bypass: `http://2852039166/`, `http://0xa9fea9fe/` and `http://[fd00:ec2::254]/` never matched the dotted-quad / forbidden rules. Packed integer/hex hosts are normalized to dotted-quad and IPv6 literals de-bracketed, so cloud-metadata (169.254.169.254) and other forbidden destinations can no longer be reached via an alternate encoding. - Compound option mis-extraction: `--resolve host:port:ip` was emitted as a single junk `host:port:ip` target; it is now split so each piece (the real IP) is validated independently. Adds tests/unit/middleware/test_command_targets_scope_bypass.py (11 cases) covering each bypass plus regressions (plain URL, explicit port, IPv4, CIDR, and a small integer NOT mangled into an IP). Scoped to middleware/_command_targets.py only; the core matcher (decepticon_core/types/roe.py) is untouched, so this does not collide with #374 (FQDN trailing-dot normalization). Note: #383 adds a test_command_targets.py asserting current extraction output -- merge it after this PR and re-baseline those assertions against the corrected host extraction. Fast-lane: 1717 passed, 26 skipped. basedpyright 0 errors. ruff clean.

…tion false positive The userinfo-decoy cases asserted `"evil.com" in targets`, which CodeQL's py/incomplete-url-substring-sanitization flags as a HIGH alert (it cannot tell `targets` is a set, not a URL). Asserting exact set equality is both stronger (proves the in-scope decoy is absent) and free of the flagged pattern.

Folds in the second RoE scope-bypass fix from #392: .sh/.md/.py/.pl/.pub/.zip are delegated DNS TLDs, so a host like evil.zip was silently dropped from scope enforcement by the file-extension denylist. Removes those entries (file-only extensions retained) and corrects the stale comment. Adds regression tests. Consolidates #392 into this PR.

VoidChecksum requested a review from PurpleCHOIms as a code owner May 30, 2026 11:05

github-advanced-security AI found potential problems May 30, 2026

View reviewed changes

Comment thread packages/decepticon/tests/unit/middleware/test_command_targets_scope_bypass.py Fixed

Comment thread packages/decepticon/tests/unit/middleware/test_command_targets_scope_bypass.py Fixed

VoidChecksum added 3 commits May 30, 2026 13:27

chore: re-trigger CI (transient default CodeQL setup)

50d434c

VoidChecksum mentioned this pull request May 30, 2026

fix(roe): close two target-extraction scope bypasses (URL userinfo + TLD/extension collision) #392

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(roe): close target-extraction scope bypasses (userinfo, IP-encoding, compound args)#405

fix(roe): close target-extraction scope bypasses (userinfo, IP-encoding, compound args)#405
VoidChecksum wants to merge 4 commits into
mainfrom
fix/roe-target-extraction-bypass

VoidChecksum commented May 30, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

VoidChecksum commented May 30, 2026

What

Tests

Scope / merge notes

Verification (local, Python 3.13, same uv.lock as CI)

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants