fix(roe): close target-extraction scope bypasses (userinfo, IP-encoding, compound args)#405

Open
VoidChecksum wants to merge 4 commits into main from fix/roe-target-extraction-bypass

@VoidChecksum
Collaborator

What

Closes three target-extraction gaps in RoEEnforcementMiddleware that let an offensive command reach an out-of-scope or forbidden host while the RoE gate saw nothing (fail-open) or an in-scope decoy. Extraction is the sole input to the scope evaluator, so each gap is a scope bypass, not a cosmetic miss.

| Bypass | Example | Before | After |
| --- | --- | --- | --- |
| URL userinfo confusion | `curl http://in-scope.acme.com@evil.com/` | extracts in-scope decoy / nothing → ALLOW | extracts evil.com (real host) |
| Encoded-IP IMDS | `curl http://2852039166/` · `http://0xa9fea9fe/` · `http://[fd00:ec2::254]/` | not matched → ALLOW | normalized to 169.254.169.254 / de-bracketed IPv6 |
| Compound option | `curl --resolve host:80:169.254.169.254 …` | junk `host:80:169.254.169.254` target | split → real IP validated |

The authority is now parsed RFC-3986-correctly (urlsplit().hostname) so userinfo + port are dropped; packed integer/hex hosts are canonicalized to dotted-quad; compound host:port:ip tokens are split.
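
As a minimal sketch of the authority parsing described above (not the PR's actual code; `connect_host` is a hypothetical helper name), Python's standard `urllib.parse.urlsplit` exposes the host with userinfo, port, and IPv6 brackets already removed:

```python
from typing import Optional
from urllib.parse import urlsplit


def connect_host(url: str) -> Optional[str]:
    """Return the host a client would connect to (hypothetical helper).

    urlsplit().hostname drops any userinfo before '@', drops the port,
    lowercases the host, and strips the brackets around IPv6 literals.
    """
    return urlsplit(url).hostname


print(connect_host("http://in-scope.acme.com@evil.com/"))  # evil.com
print(connect_host("http://example.com:8443/path"))        # example.com
print(connect_host("http://[fd00:ec2::254]/"))             # fd00:ec2::254
```

The first call is the userinfo case from the table: everything before `@` is treated as credentials, so the value handed to the scope evaluator is the real connect host.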

Tests

tests/unit/middleware/test_command_targets_scope_bypass.py — 11 cases: each bypass + regressions (plain URL, explicit port, IPv4, CIDR, and a small integer that must not be mangled into an IP).

Scope / merge notes

Verification (local, Python 3.13, same uv.lock as CI)

  • ruff check ✅ · ruff format --check ✅
  • basedpyright ✅ 0 errors (152 pre-existing warnings, unchanged)
  • pytest -n auto -m "not slow" ✅ 1717 passed, 26 skipped

…ng, compound args)

RoEEnforcementMiddleware gates offensive commands by the targets that
extract_targets() returns; an empty or incomplete extraction means NO
scope check at all (fail-open). Three extraction gaps let a command reach
an out-of-scope or forbidden host while the gate saw nothing or an
in-scope decoy:

- URL userinfo confusion: the old `://([^\s/:]+)` slice captured the
  authority up to the first `:`/`/`, i.e. the userinfo, not the host.
  `curl http://in-scope.acme.com@evil.com/` extracted the in-scope decoy
  (or nothing) while curl connects to evil.com. The authority is now
  parsed RFC-3986-correctly (urlsplit .hostname), so userinfo and port
  are dropped and the real connect host is evaluated.
- Encoded-IP IMDS bypass: `http://2852039166/`, `http://0xa9fea9fe/` and
  `http://[fd00:ec2::254]/` never matched the dotted-quad / forbidden
  rules. Packed integer/hex hosts are normalized to dotted-quad and IPv6
  literals de-bracketed, so cloud-metadata (169.254.169.254) and other
  forbidden destinations can no longer be reached via an alternate
  encoding.
- Compound option mis-extraction: `--resolve host:port:ip` was emitted as
  a single junk `host:port:ip` target; it is now split so each piece (the
  real IP) is validated independently.
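
The encoded-IP and compound-option fixes can be sketched as follows. This is an illustration under stated assumptions, not the code in middleware/_command_targets.py: the helper names and the `MIN_PACKED` threshold (used here so that small integers are left alone) are hypothetical, and only the plain `host:port:ip` form of `--resolve` is handled.

```python
import ipaddress
from typing import List, Optional

# Assumption: integers below this value are not treated as packed IPv4
# addresses. The real cut-off used by the PR is not stated in the source.
MIN_PACKED = 1 << 24
MAX_PACKED = 0xFFFFFFFF


def normalize_packed_ipv4(host: str) -> Optional[str]:
    """Return dotted-quad for a packed decimal or 0x-hex host, else None."""
    text = host.lower()
    try:
        if text.startswith("0x"):
            value = int(text, 16)
        elif text.isascii() and text.isdigit():
            value = int(text, 10)
        else:
            return None  # ordinary hostname or dotted-quad: leave untouched
    except ValueError:
        return None  # e.g. a bare "0x" with no digits
    if not MIN_PACKED <= value <= MAX_PACKED:
        return None  # too small or too large to be a packed IPv4 address
    return str(ipaddress.IPv4Address(value))


def split_resolve_arg(value: str) -> List[str]:
    """Split a `host:port:ip` token into the host and the address."""
    host, _, rest = value.partition(":")
    _port, _, addr = rest.partition(":")
    # Only the host and the address are scope-relevant; the port is dropped.
    return [part for part in (host, addr) if part]


print(normalize_packed_ipv4("2852039166"))           # 169.254.169.254
print(normalize_packed_ipv4("0xa9fea9fe"))           # 169.254.169.254
print(normalize_packed_ipv4("80"))                   # None
print(split_resolve_arg("host:80:169.254.169.254"))  # ['host', '169.254.169.254']
```

Both packed forms resolve to the metadata address because 169·2²⁴ + 254·2¹⁶ + 169·2⁸ + 254 = 2852039166 = 0xa9fea9fe.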

Adds tests/unit/middleware/test_command_targets_scope_bypass.py (11 cases)
covering each bypass plus regressions (plain URL, explicit port, IPv4,
CIDR, and a small integer NOT mangled into an IP).

Scoped to middleware/_command_targets.py only; the core matcher
(decepticon_core/types/roe.py) is untouched, so this does not collide
with #374 (FQDN trailing-dot normalization). Note: #383 adds a
test_command_targets.py asserting current extraction output -- merge it
after this PR and re-baseline those assertions against the corrected host
extraction.

Fast-lane: 1717 passed, 26 skipped. basedpyright 0 errors. ruff clean.
…tion false positive

The userinfo-decoy cases asserted `"evil.com" in targets`, which CodeQL's
py/incomplete-url-substring-sanitization flags as a HIGH alert (it cannot tell
`targets` is a set, not a URL). Asserting exact set equality is both stronger
(proves the in-scope decoy is absent) and free of the flagged pattern.
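
A small illustration of why equality is the stronger check, using a hypothetical extraction result that still contains the decoy (the variable names are illustrative, not taken from the test file):

```python
# Hypothetical output of a buggy extractor: the real host plus the decoy.
buggy_targets = {"evil.com", "in-scope.acme.com"}
# Output expected from the corrected extractor.
fixed_targets = {"evil.com"}

# A membership check passes even though the decoy leaked through.
membership_passes_on_buggy = "evil.com" in buggy_targets

# Exact set equality rejects the buggy result and accepts the fixed one.
equality_passes_on_buggy = buggy_targets == {"evil.com"}
equality_passes_on_fixed = fixed_targets == {"evil.com"}

print(membership_passes_on_buggy, equality_passes_on_buggy, equality_passes_on_fixed)
# True False True
```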
Folds in the second RoE scope-bypass fix from #392: .sh/.md/.py/.pl/.pub/.zip are delegated DNS TLDs, so a host like evil.zip was silently dropped from scope enforcement by the file-extension denylist. Removes those entries (file-only extensions retained) and corrects the stale comment. Adds regression tests. Consolidates #392 into this PR.
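
A rough sketch of the failure mode, assuming the denylist is a set of suffixes checked against the last dotted label of each token. The function name and the "file-only" entries below are assumptions for illustration; only the six removed suffixes come from the comment above.

```python
from typing import AbstractSet

# Suffixes named in the comment above: each is also a delegated DNS TLD.
TLD_COLLIDING = frozenset({".sh", ".md", ".py", ".pl", ".pub", ".zip"})
# Assumed examples of file-only suffixes (the real retained list is not shown).
FILE_ONLY = frozenset({".txt", ".json", ".log"})


def keep_as_host(token: str, denylist: AbstractSet[str]) -> bool:
    """Return True if the token is kept as a host candidate (hypothetical)."""
    dot = token.rfind(".")
    suffix = token[dot:].lower() if dot != -1 else ""
    return suffix not in denylist


old_denylist = FILE_ONLY | TLD_COLLIDING
new_denylist = FILE_ONLY

print(keep_as_host("evil.zip", old_denylist))   # False: host silently dropped
print(keep_as_host("evil.zip", new_denylist))   # True: host reaches the scope check
print(keep_as_host("notes.txt", new_denylist))  # False: still treated as a file
```

With the old list, `evil.zip` never reaches the scope evaluator, which is the same fail-open outcome as the extraction gaps closed earlier in this PR.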