fix(roe): close two target-extraction scope bypasses (URL userinfo + TLD/extension collision) by VoidChecksum · Pull Request #392 · PurpleAILAB/Decepticon

VoidChecksum · 2026-05-30T10:59:58Z

Summary

The RoE evaluator refuses out-of-scope hosts only for targets returned by extract_targets() (middleware/roe.py:205). Two extraction gaps let a host evade scope enforcement entirely.

1. URL userinfo bypass (security)

_URL_RE captured [^\s/:]+ immediately after ://, so curl https://user:pass@evil.example/ captured user (dropped as invalid) and the real host evil.example was never evaluated by RoE. The regex now skips an optional userinfo@ segment and captures the host, excluding @.

2. TLD / file-extension collision (security)

_NON_TARGET_EXTENSIONS dropped any token whose last label matched a file extension — but .sh .md .py .pl .pub .zip are delegated DNS TLDs, so curl https://evil.zip was silently dropped from scope enforcement. Those real-TLD entries are removed (file-only extensions retained); the stale comment claiming no TLD collides is corrected with a security rationale to prevent regression.

Both fixes err toward over-extraction, matching this module's documented "false positives are safer than false negatives" design — a spurious target is operator-overridable; a dropped real one is a silent scope breach.

Tests

New tests/unit/middleware/test_command_targets_scope_bypass.py: userinfo stripping (with port handling), TLD-colliding hosts now extracted, and confirmation that genuine local-file arguments (key.pem, scan.txt, …) stay excluded. Full existing RoE suite passes (64 tests green locally).

🤖 Generated with Claude Code

The RoE evaluator refuses out-of-scope hosts only for targets returned by `extract_targets()` (roe.py:205). Two extraction gaps let a host evade scope enforcement entirely: 1. URL userinfo. `_URL_RE` captured `[^\s/:]+` right after `://`, so `curl https://user:pass@evil.example/` captured `user` (dropped as invalid) and the real host `evil.example` was never evaluated. The regex now skips an optional `userinfo@` and captures the host, excluding `@`. 2. TLD/extension collision. `_NON_TARGET_EXTENSIONS` dropped any token whose last label matched a file extension — but `.sh .md .py .pl .pub .zip` are delegated DNS TLDs, so `curl https://evil.zip` was silently dropped. Those real-TLD entries are removed (file-only extensions are retained). The stale comment claiming no TLD collides is corrected. Both fixes err toward over-extraction, matching this module's documented "false positives are safer than false negatives" design. Adds regression tests for userinfo stripping (with port handling), TLD- colliding hosts, and confirms genuine local-file arguments stay excluded.

CodeQL py/incomplete-url-substring-sanitization flagged the 'host in extract_targets(...)' membership asserts. extract_targets returns a set; rewrite the asserts as explicit element equality (any/all) so the scanner no longer mistakes them for URL sanitization. Behaviour unchanged.

Folds in the second RoE scope-bypass fix from #392: .sh/.md/.py/.pl/.pub/.zip are delegated DNS TLDs, so a host like evil.zip was silently dropped from scope enforcement by the file-extension denylist. Removes those entries (file-only extensions retained) and corrects the stale comment. Adds regression tests. Consolidates #392 into this PR.

VoidChecksum · 2026-05-30T12:59:45Z

Closing as a duplicate. The two RoE scope-extraction fixes here (URL userinfo + real-TLD/extension collision) are now consolidated into #405, which uses a more robust RFC-3986 urlsplit-based host extraction and additionally normalizes packed-integer/hex IPv4 IMDS encodings. The TLD-collision fix + regression tests from this PR have been folded into #405. Merge #405 instead.

VoidChecksum requested a review from PurpleCHOIms as a code owner May 30, 2026 10:59

github-advanced-security AI found potential problems May 30, 2026

View reviewed changes

Comment thread packages/decepticon/tests/unit/middleware/test_command_targets_scope_bypass.py Fixed

Comment thread packages/decepticon/tests/unit/middleware/test_command_targets_scope_bypass.py Fixed

VoidChecksum added 2 commits May 30, 2026 13:34

style(roe): ruff-format the scope-bypass test assertions

13eb17f

VoidChecksum closed this May 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(roe): close two target-extraction scope bypasses (URL userinfo + TLD/extension collision)#392

fix(roe): close two target-extraction scope bypasses (URL userinfo + TLD/extension collision)#392
VoidChecksum wants to merge 3 commits into
mainfrom
fix/roe-userinfo-and-tld-bypass

VoidChecksum commented May 30, 2026

Uh oh!

Uh oh!

Uh oh!

VoidChecksum commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

VoidChecksum commented May 30, 2026

Summary

1. URL userinfo bypass (security)

2. TLD / file-extension collision (security)

Tests

Uh oh!

Uh oh!

Uh oh!

VoidChecksum commented May 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants