Add OWASP LLM02 output-side scorer pack: SSRF / SSTI / XXE / open redirect / LDAP injection by ppcvote · Pull Request #2118 · microsoft/PyRIT

ppcvote · 2026-07-02T06:32:14Z

Summary

Adds five RegexScorer subclasses extending the OWASP LLM02 (Insecure Output Handling) scorer pack from #1868 with the remaining statically-detectable payload families: SSRF, SSTI, XXE, open redirect, and LDAP injection.

Proposed as a follow-up in #2002. Same design as #1868: one focused subclass per payload family, following the CredentialLeakScorer pattern, each independently enable/disable-able per scenario.

I'm opening this without a help-wanted label on #2002 — given the direct pattern parity with the merged #1868, I wanted the code visible for review rather than a standalone comment. Happy to close and wait for triage if the team prefers.

Per-scorer breakdown

Scorer	Default patterns	Pattern names
`SSRFOutputScorer`	4	Cloud Metadata Endpoint · Loopback URL Target · Private Network URL Target · SSRF URL Scheme
`SSTIOutputScorer`	2	Arithmetic Eval Probe · Python Gadget Chain
`XXEOutputScorer`	3	External Entity Declaration · External Parameter Entity · Doctype Internal Subset Entity
`OpenRedirectOutputScorer`	3	Protocol-Relative Redirect Param · Encoded Slash Redirect · Userinfo Host Confusion
`LDAPInjectionOutputScorer`	3	Filter Break Sequence · Always-True Clause · Boolean Operator Injection

15 default patterns across 5 scorers.

What's included

5 scorer classes under pyrit/score/true_false/regex/ (~55-65 lines each, mirroring the CredentialLeakScorer / Add OWASP LLM02 output-side scorer pack (XSS / SQLi / Shell / Path) #1868 structure: _DEFAULT_PATTERNS class var, keyword-only __init__ with an optional patterns override, categories=["security"], OR aggregator).
5 test files under tests/unit/score/regex/ — parametrized positive/negative cases + rationale-name assertion + custom-pattern override + memory-write assertion, mirroring test_credential_leak_scorer.py. The tests/unit/score/regex/ directory runs green (262 passed locally).
Export wiring: the submodule pyrit/score/true_false/regex/__init__.py is kept alphabetized; the insertions into the top-level pyrit/score/__init__.py follow that file's existing (non-alphabetical) ordering.

Note on precision

These are output-side detectors, so the patterns lean toward flagging when a model emits attack-shaped content. Two deliberate scoping choices worth calling out for review:

LDAP: each pattern requires an attr= clause adjacent to the filter break (*)(uid=, )(cn=*)), so ordinary code punctuation with the same *)( shape — e.g. a regex group (\w*)(\s+) — does not match. Negative tests cover that case.
SSRF: the Cloud Metadata Endpoint pattern matches the link-local metadata IP/host by design, so it will also fire if a model merely echoes 169.254.169.254 in prose. That is intentional for an output scorer (emitting the metadata endpoint is itself signal), but reviewers should be aware it is not URL-scheme-gated. Happy to tighten it to require a fetch/URL context if you'd prefer lower recall.

Not included (deliberate)

Doc walkthrough: Add OWASP LLM02 output-side scorer pack (XSS / SQLi / Shell / Path) #1868's doc/code/scoring/owasp_llm02_scorers.* notebook was folded into doc/code/scoring/1_true_false_scorers.{py,ipynb} by the docs refactor in DOC: Scoring Docs Refactor #1892; that section currently lists only XSS/SQLi/Shell/Path. Extending it with these five scorers is ready to fold into this PR on request, or as an immediate follow-up — kept out here to keep the diff review-sized.
Dataset: same deferral as Add OWASP LLM02 output-side scorer pack (XSS / SQLi / Shell / Path) #1868 (HF-hosted seed datasets are a separate contribution class per the Proposal: Add Agent Threat Rules (ATR) dataset loader and taxonomy scorer #1702 discussion).

Pattern provenance

Ported from the MIT-licensed prompt-defense-audit-py (same provenance as #1868, same author). Each pattern is verified in-repo against positive and negative cases in the new test files.

Test evidence

pytest tests/unit/score/regex/ → 262 passed locally.
Branch rebased onto current main; top-level from pyrit.score import SSRFOutputScorer, SSTIOutputScorer, XXEOutputScorer, OpenRedirectOutputScorer, LDAPInjectionOutputScorer verified.
pre-commit ruff format + ruff check pass on the changed files (the ty type-check hook was skipped locally for lack of uv; it will run in CI).

…irect / LDAP Extends the regex true/false scorer family from microsoft#1868 with five additional output-side payload detectors, mirroring the existing RegexScorer pattern. Each is deterministic (no LLM call), categorized "security", with unit tests covering positive payloads, benign negatives, rationale, custom patterns, and memory integration.

- List SSRF/SSTI/XXE/open-redirect/LDAP scorers in the OWASP LLM02 doc section (.py + .ipynb) - Fix alphabetical placement of LDAP/OpenRedirect entries in pyrit.score __all__ - Align Encoded Slash Redirect param list with Protocol-Relative Redirect Param (adds returnto/destination/forward/location); add regression test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…pack

romanlutz self-assigned this Jul 2, 2026

Copilot AI and others added 3 commits July 2, 2026 14:31

Merge remote-tracking branch 'origin/main' into add-injection-scorer-…

18105a8

…pack

Merge branch 'main' into add-injection-scorer-pack

9751a30

romanlutz approved these changes Jul 3, 2026

View reviewed changes

romanlutz added this pull request to the merge queue Jul 3, 2026

Merged via the queue into microsoft:main with commit 1e21a04 Jul 3, 2026
53 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add OWASP LLM02 output-side scorer pack: SSRF / SSTI / XXE / open redirect / LDAP injection#2118

Add OWASP LLM02 output-side scorer pack: SSRF / SSTI / XXE / open redirect / LDAP injection#2118
romanlutz merged 4 commits into
microsoft:mainfrom
ppcvote:add-injection-scorer-pack

ppcvote commented Jul 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

ppcvote commented Jul 2, 2026

Summary

Per-scorer breakdown

What's included

Note on precision

Not included (deliberate)

Pattern provenance

Test evidence

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants