test(policy-scanner): expand generated path eval corpus #116
Merged
Why:
- Generated manifest paths are ambiguous because they may be disposable output or runtime/supply-chain inputs.
- The scanner should keep that policy boundary explicit in the frozen eval corpus before any future behavior change.

What changed:
- Add generated-path policy scanner fixtures that preserve top-level generated manifest files as dependency review signals.
- Add negative fixtures for generated dependency examples under existing non-production prefixes and for generated non-manifest text.
- Update the policy scanner baseline and README metrics for the 82-fixture corpus.

Testing:
- bash tests/test-policy-scanner-evals.sh
- bash tests/test-policy-scanner-eval-compare.sh
- bash tests/test-signum-evolve-v1.sh
- PATH=/opt/homebrew/bin:$PATH bash scripts/run-deterministic-tests.sh

Risk:
- narrow
- This changes eval coverage and baseline fixture count only; scanner behavior and policy rule catalog are unchanged.

Constraint: Do not change scanner behavior, policy rules, Codex prompt, Claude overlay runtime, CI wiring, or generated experiment output.
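
The policy boundary that the new fixtures freeze can be illustrated with a minimal sketch. This is not the scanner's implementation, which this PR does not show or change. The function name, the manifest filename set, and the prefix list are all hypothetical, chosen only to match the fixture paths named in the PR description, and the expected values reflect one reading of which fixtures are positive and which are negative.

```python
from pathlib import PurePosixPath

# Assumption: manifest filenames inferred from the fixture paths in this PR.
MANIFEST_NAMES = {"package.json", "requirements.txt", "go.mod"}

# Assumption: the "existing non-production prefixes" include at least these.
NON_PRODUCTION_PREFIXES = ("docs", "examples", "tests")


def is_dependency_review_signal(path: str) -> bool:
    """Hypothetical helper: True if a path should raise a dependency review signal."""
    parts = PurePosixPath(path).parts
    if not parts or parts[-1] not in MANIFEST_NAMES:
        # Non-manifest files (for example generated/metadata.txt) are not signals.
        return False
    if parts[0] in NON_PRODUCTION_PREFIXES:
        # Manifests under docs/, examples/, or tests/ are treated as examples.
        return False
    # Everything else, including manifests under generated/, stays a signal.
    return True


# Fixture paths from the PR description, with the expectation assumed here.
EXPECTED = {
    "generated/client/package.json": True,
    "generated/requirements.txt": True,
    "generated/go.mod": True,
    "docs/generated/package.json": False,
    "examples/generated/requirements.txt": False,
    "tests/generated/go.mod": False,
    "generated/metadata.txt": False,
}

for fixture_path, want in EXPECTED.items():
    assert is_dependency_review_signal(fixture_path) is want, fixture_path
```

The sketch deliberately has no special case for generated/: a manifest there is handled like any other production manifest, which is the behavior the positive fixtures are meant to preserve.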
Summary
- Add policy scanner eval fixtures that preserve top-level generated/ manifest files as dependency review signals.
Why
- generated/package.json is ambiguous: it may be disposable output, but it may also affect runtime or supply-chain surfaces.
What changed
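- Add positive fixtures for top-level generated manifests: generated/client/package.json, generated/requirements.txt, generated/go.mod.
- Add negative fixtures for generated dependency examples under existing non-production prefixes: docs/generated/package.json, examples/generated/requirements.txt, tests/generated/go.mod.
- Add a negative fixture for generated non-manifest text: generated/metadata.txt.
- Update evals/policy_scanner/baselines/current.json for the 82-fixture corpus.
- Update evals/policy_scanner/README.md with the generated-path policy note.
Review focus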
- generated/ manifests still produce dependency review signals;
Test plan
- bash tests/test-policy-scanner-evals.sh
- bash tests/test-policy-scanner-eval-compare.sh
- bash tests/test-signum-evolve-v1.sh
- PATH=/opt/homebrew/bin:$PATH bash scripts/run-deterministic-tests.sh
Not changed
- policy_scan.json output shape
Current metrics
Risks
Rollout / migration
Breaking changes
Follow-ups
- If generated/ manifests are mostly disposable noise, propose a separate measured scanner behavior PR.
Merge strategy recommendation