Merged
15 changes: 8 additions & 7 deletions evals/policy_scanner/README.md
@@ -16,12 +16,12 @@ The scanner is intentionally unchanged by this harness. The goal is to measure c
- `fixtures/suppression/*.json` covers suppression acceptance and rejection behavior.
- `fixtures/adversarial/*.json` covers brittle formatting and close-call cases.

The expanded corpus v1 contains 75 fixtures:
The expanded corpus contains 82 fixtures:

- `positive`: 34
- `negative`: 14
- `negative`: 18
- `suppression`: 15
- `adversarial`: 12
- `adversarial`: 15
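
As a quick consistency check on the counts above, the per-kind numbers on each side of this change should sum to the stated totals (75 before, 82 after). A minimal sketch, with both dictionaries transcribed by hand from the list rather than read from the fixtures directory:

```python
# Per-kind fixture counts, transcribed from the list above (not read from disk).
before = {"positive": 34, "negative": 14, "suppression": 15, "adversarial": 12}
after = {"positive": 34, "negative": 18, "suppression": 15, "adversarial": 15}

assert sum(before.values()) == 75
assert sum(after.values()) == 82

# Net change per kind; only kinds whose count changed are listed.
delta = {kind: after[kind] - before[kind] for kind in after if after[kind] != before[kind]}
print(delta)  # {'negative': 4, 'adversarial': 3}
```

The seven added fixtures (four `negative`, three `adversarial`) account for the whole difference.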

The harness is read-only with respect to scanner behavior. It creates temporary directories, writes each fixture patch to `combined.patch`, invokes:

@@ -201,7 +201,7 @@ Do not update the baseline only to hide a regression. Baseline changes should be

## Current baseline sample

Local expanded-corpus baseline run on 2026-05-09:
Local expanded-corpus baseline run with generated-path audit cases:

```json
{
@@ -212,14 +212,14 @@ Local expanded-corpus baseline run on 2026-05-09:
"failed": 0,
"falseNegatives": 0,
"falsePositives": 0,
"fixtureCount": 75,
"fixtureCount": 82,
"hardGatePassed": true,
"knownBaselineFailures": 0,
"passed": 75,
"passed": 82,
"precision": 1.0,
"recall": 1.0,
"severityMismatches": 0,
"truePositives": 55,
"truePositives": 58,
"unexpectedCriticalFindings": 0
}
```
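
The `precision` and `recall` values in the sample are consistent with the usual definitions, TP / (TP + FP) and TP / (TP + FN). A minimal sketch, assuming the harness uses those standard formulas; the function name and the zero-denominator convention are illustrative and not taken from the harness:

```python
from typing import Tuple


def precision_recall(true_positives: int, false_positives: int, false_negatives: int) -> Tuple[float, float]:
    """Return (precision, recall) from raw finding counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Assumption: with nothing predicted (or nothing expected), report 1.0
    # rather than dividing by zero. The real harness may choose differently.
    precision = true_positives / predicted if predicted else 1.0
    recall = true_positives / actual if actual else 1.0
    return precision, recall


# Counts from the updated sample: 58 true positives, no false positives or negatives.
print(precision_recall(58, 0, 0))  # (1.0, 1.0)
```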
@@ -233,3 +233,4 @@ Local expanded-corpus baseline run on 2026-05-09:
## Current behavior notes

- `adversarial-generated-package-json-current-behavior`: generated-like `generated/package.json` paths are still treated as manifest dependency findings by current scanner behavior. This corpus records that behavior without changing scanner scope.
- Generated manifest files under top-level `generated/` remain dependency review signals in the current corpus because they may still affect runtime or supply-chain surfaces. Generated dependency examples under already non-production prefixes such as `docs/`, `examples/`, and `tests/` remain excluded by path scope.
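
The path rules described in these notes can be summarized as a small lookup. This is a hypothetical sketch that mirrors the corpus expectations only: the function name, the prefix tuple, and the manifest table are assumptions for illustration and are not the scanner's implementation. The rule IDs are the ones the corpus fixtures expect.

```python
from pathlib import PurePosixPath
from typing import Optional

# Prefixes the notes above describe as non-production path scope.
NON_PRODUCTION_PREFIXES = ("docs/", "examples/", "tests/")

# Manifest file names mapped to the rule IDs expected by the corpus fixtures.
MANIFEST_RULES = {
    "package.json": "POLICY_NEW_NPM_DEPENDENCY",
    "go.mod": "POLICY_NEW_GO_DEPENDENCY",
    "requirements.txt": "POLICY_NEW_PYTHON_DEPENDENCY",
}


def expected_rule(path: str) -> Optional[str]:
    """Return the rule ID the corpus expects for a changed path, or None."""
    # Anything under a non-production prefix is excluded by path scope.
    if path.startswith(NON_PRODUCTION_PREFIXES):
        return None
    # Otherwise only recognized manifest file names produce a finding,
    # including manifests under top-level generated/.
    return MANIFEST_RULES.get(PurePosixPath(path).name)


print(expected_rule("generated/client/package.json"))  # POLICY_NEW_NPM_DEPENDENCY
print(expected_rule("docs/generated/package.json"))    # None
print(expected_rule("generated/metadata.txt"))         # None
```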
14 changes: 7 additions & 7 deletions evals/policy_scanner/baselines/current.json
@@ -1,9 +1,9 @@
{
"fixtureCount": 75,
"fixtureCount": 82,
"generatedFrom": {
"command": "python3 evals/policy_scanner/run_policy_scanner_eval.py --repo-root . --json-output <candidate.json>",
"fixturesDir": "evals/policy_scanner/fixtures",
"note": "Frozen from the expanded policy scanner eval corpus v1; no timestamps or temp paths included.",
"note": "Frozen from the expanded policy scanner eval corpus with generated-path audit cases; no timestamps or temp paths included.",
"repeat": 2
},
"harnessName": "policy_scanner",
@@ -16,17 +16,17 @@
"failed": 0,
"falseNegatives": 0,
"falsePositives": 0,
"fixtureCount": 75,
"fixtureCount": 82,
"hardGateFailures": [],
"hardGatePassed": true,
"knownBaselineFailures": 0,
"passed": 75,
"passed": 82,
"precision": 1.0,
"recall": 1.0,
"runtimeMsP50": 160.184458,
"runtimeMsP95": 280.426723,
"runtimeMsP50": 248.451375,
"runtimeMsP95": 455.602164,
"severityMismatches": 0,
"truePositives": 55,
"truePositives": 58,
"unexpectedCriticalFindings": 0
},
"schemaVersion": "1.0"
@@ -0,0 +1,22 @@
{
"allowedExtraFindings": [],
"caseId": "adversarial-generated-client-package-json-current-behavior",
"description": "Nested generated package.json paths are still treated as manifest dependency findings by current scanner behavior.",
"expectedFindings": [
{
"file": "generated/client/package.json",
"line": 1,
"mustBlock": false,
"ruleId": "POLICY_NEW_NPM_DEPENDENCY",
"severity": "MAJOR"
}
],
"kind": "adversarial",
"patch": "diff --git a/generated/client/package.json b/generated/client/package.json\n@@ -0,0 +1,1 @@\n+\"sdk-runtime\": \"^2.0.0\"\n",
"tags": [
"dependency",
"generated",
"current-behavior",
"supply-chain"
]
}
@@ -0,0 +1,22 @@
{
"allowedExtraFindings": [],
"caseId": "adversarial-generated-go-mod-current-behavior",
"description": "Generated go.mod paths are still treated as manifest dependency findings by current scanner behavior.",
"expectedFindings": [
{
"file": "generated/go.mod",
"line": 1,
"mustBlock": false,
"ruleId": "POLICY_NEW_GO_DEPENDENCY",
"severity": "MAJOR"
}
],
"kind": "adversarial",
"patch": "diff --git a/generated/go.mod b/generated/go.mod\n@@ -0,0 +1,1 @@\n+github.com/acme/sdk v1.2.3\n",
"tags": [
"dependency",
"generated",
"current-behavior",
"supply-chain"
]
}
@@ -0,0 +1,22 @@
{
"allowedExtraFindings": [],
"caseId": "adversarial-generated-requirements-current-behavior",
"description": "Generated requirements.txt paths are still treated as manifest dependency findings by current scanner behavior.",
"expectedFindings": [
{
"file": "generated/requirements.txt",
"line": 1,
"mustBlock": false,
"ruleId": "POLICY_NEW_PYTHON_DEPENDENCY",
"severity": "MAJOR"
}
],
"kind": "adversarial",
"patch": "diff --git a/generated/requirements.txt b/generated/requirements.txt\n@@ -0,0 +1,1 @@\n+requests==2.31.0\n",
"tags": [
"dependency",
"generated",
"current-behavior",
"supply-chain"
]
}
@@ -0,0 +1,14 @@
{
"allowedExtraFindings": [],
"caseId": "negative-docs-generated-package-json-no-trigger",
"description": "Generated dependency examples under docs should remain excluded by docs/ path scope.",
"expectedFindings": [],
"kind": "negative",
"patch": "diff --git a/docs/generated/package.json b/docs/generated/package.json\n@@ -0,0 +1,1 @@\n+\"left-pad\": \"^1.3.0\"\n",
"tags": [
"dependency",
"generated",
"docs",
"negative"
]
}
@@ -0,0 +1,14 @@
{
"allowedExtraFindings": [],
"caseId": "negative-examples-generated-requirements-no-trigger",
"description": "Generated dependency examples under examples should remain excluded by examples/ path scope.",
"expectedFindings": [],
"kind": "negative",
"patch": "diff --git a/examples/generated/requirements.txt b/examples/generated/requirements.txt\n@@ -0,0 +1,1 @@\n+requests==2.31.0\n",
"tags": [
"dependency",
"generated",
"examples",
"negative"
]
}
@@ -0,0 +1,14 @@
{
"allowedExtraFindings": [],
"caseId": "negative-generated-nonmanifest-dependency-string-no-trigger",
"description": "Dependency-looking strings in generated non-manifest files should not trigger dependency findings.",
"expectedFindings": [],
"kind": "negative",
"patch": "diff --git a/generated/metadata.txt b/generated/metadata.txt\n@@ -0,0 +1,1 @@\n+\"left-pad\": \"^1.3.0\"\n",
"tags": [
"dependency",
"generated",
"nonmanifest",
"negative"
]
}
@@ -0,0 +1,14 @@
{
"allowedExtraFindings": [],
"caseId": "negative-tests-generated-go-mod-no-trigger",
"description": "Generated dependency examples under tests should remain excluded by tests/ path scope.",
"expectedFindings": [],
"kind": "negative",
"patch": "diff --git a/tests/generated/go.mod b/tests/generated/go.mod\n@@ -0,0 +1,1 @@\n+github.com/acme/sdk v1.2.3\n",
"tags": [
"dependency",
"generated",
"tests",
"negative"
]
}
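
Each fixture added here pairs a `patch` with the findings expected from it, and every expected finding names a file that the patch touches. In these fixtures the `caseId` also begins with the `kind`. A minimal sketch of a consistency check for those two observations; `patched_files` and `check_fixture` are hypothetical helpers, not part of the harness:

```python
import re

# Matches the "diff --git a/<path> b/<path>" header used in the fixture patches.
_DIFF_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$", re.MULTILINE)


def patched_files(patch: str) -> set:
    """Collect the b/ side paths named in a patch's diff headers."""
    return {match.group(2) for match in _DIFF_HEADER.finditer(patch)}


def check_fixture(fixture: dict) -> list:
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    # Convention observed in the fixtures above: caseId is prefixed by kind.
    if not fixture["caseId"].startswith(fixture["kind"] + "-"):
        problems.append("caseId does not start with kind")
    touched = patched_files(fixture["patch"])
    for finding in fixture["expectedFindings"]:
        if finding["file"] not in touched:
            problems.append(f"expected finding for unpatched file {finding['file']}")
    return problems


# Trimmed copy of the generated/go.mod fixture from this change.
fixture = {
    "caseId": "adversarial-generated-go-mod-current-behavior",
    "kind": "adversarial",
    "patch": "diff --git a/generated/go.mod b/generated/go.mod\n@@ -0,0 +1,1 @@\n+github.com/acme/sdk v1.2.3\n",
    "expectedFindings": [{"file": "generated/go.mod", "line": 1}],
}
print(check_fixture(fixture))  # []
```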