
feat(observability): durable per-evaluation policy-audit summary + would-block rate (#3727)#3820

Merged
williamzujkowski merged 2 commits into
mainfrom
feat/policy-audit-denominator-3727
Jun 9, 2026

Conversation

@williamzujkowski

Collaborator

Closes #3727. Decided by a 7/7 higher_order vote (Option A — capture-now, supplement).

Bug

The pipeline policy gate's durable dual-emit was gated on if (violations.length > 0) (policy-evaluator.ts:244), so a clean evaluation wrote nothing anywhere (no bus event, no durable record, no OutcomeStore; Contrarian-check confirmed). The durable log therefore captured only the numerator (violations), and the would-block rate had no denominator (total evaluations). The soak window (warn-mode default since #3769) was accumulating without a denominator, and that denominator cannot be backfilled after the fact.
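
To make the missing denominator concrete, here is a minimal sketch of the pre-fix behaviour as described above. The `Violation` and `DurableRecord` shapes, the `emitGated` helper, and the rule name are hypothetical stand-ins for illustration, not the code at policy-evaluator.ts:244.

```typescript
// Hypothetical shapes for illustration only; not the repository's real types.
interface Violation {
  rule: string;
}

interface DurableRecord {
  rule: string;
}

// Pre-fix behaviour as described in this PR: the durable write only
// happens when there is at least one violation.
function emitGated(violations: Violation[], sink: DurableRecord[]): void {
  if (violations.length > 0) {
    for (const v of violations) {
      sink.push({ rule: v.rule });
    }
  }
}

const gatedSink: DurableRecord[] = [];
emitGated([], gatedSink); // clean evaluation, writes nothing
emitGated([{ rule: "example-rule" }], gatedSink); // one violation
emitGated([], gatedSink); // clean evaluation, writes nothing

// Three evaluations ran, but only one record exists, so the log alone
// cannot distinguish "1 of 3 would block" from "1 of 1 would block".
const evaluationsRun = 3;
const recordsWritten = gatedSink.length;
```

Under these assumptions, `recordsWritten` says nothing about `evaluationsRun`, which is the gap the summary record closes.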

(The inputTrustTier:'4' the issue also flagged is by design — pipeline policy is provenance/stage-driven with no single user tier; documented at line 305. Left unchanged.)

Fix

  • evaluatePipelinePolicy now appends one per-evaluation summary record (recordKind:'summary', violationCount) on every evaluation, including clean ones. This is in addition to the existing per-violation records (now tagged recordKind:'violation'), which preserves the count-parity invariant from #3710 ("observability: converge policy audit emission (emitPolicyEvent vs emitPolicyEvents) onto one durable sink"); the parity assertion now scopes by recordKind.
  • New computePolicyWouldBlockRate(events): denominator = summary records; numerator = summaries with violationCount > 0. Per-violation records are intentionally not counted (would double-count).
  • recordKind + violationCount round-trip through the audit bridge into the persisted record (the readiness gate reads persisted records). Extracted pipelinePolicyMetadata to keep mapPolicyGate under the complexity cap.
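
The emission and rate rules in the list above can be sketched roughly as follows. Only the names `recordKind`, `violationCount`, and `computePolicyWouldBlockRate` come from this PR; the `PolicyAuditRecord` type, the `appendEvaluation` helper, the rule names, and the exact signature are assumptions made for illustration.

```typescript
// Assumed record shape; the real persisted record carries more fields.
type PolicyAuditRecord =
  | { recordKind: "summary"; violationCount: number }
  | { recordKind: "violation"; rule: string };

// Hypothetical helper: one violation record per violation, plus exactly
// one summary record per evaluation (also written for clean evaluations).
function appendEvaluation(rules: string[], log: PolicyAuditRecord[]): void {
  for (const rule of rules) {
    log.push({ recordKind: "violation", rule });
  }
  log.push({ recordKind: "summary", violationCount: rules.length });
}

// Denominator = summary records; numerator = summaries with
// violationCount > 0. Violation records are skipped so an evaluation
// with several violations is not counted more than once.
function computePolicyWouldBlockRate(events: PolicyAuditRecord[]): number {
  let evaluations = 0;
  let wouldBlock = 0;
  for (const event of events) {
    if (event.recordKind !== "summary") continue;
    evaluations += 1;
    if (event.violationCount > 0) wouldBlock += 1;
  }
  // An empty log yields 0 rather than NaN, matching the "empty → 0" case.
  return evaluations === 0 ? 0 : wouldBlock / evaluations;
}

const auditLog: PolicyAuditRecord[] = [];
appendEvaluation([], auditLog); // clean
appendEvaluation(["rule-a", "rule-b"], auditLog); // two violations
appendEvaluation([], auditLog); // clean
appendEvaluation(["rule-c"], auditLog); // one violation

const wouldBlockRate = computePolicyWouldBlockRate(auditLog); // 2 of 4 evaluations
```

Counting summaries rather than violation records is what keeps the rate per-evaluation: the second evaluation contributes one to the numerator even though it produced two violation records.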

Live-routing use of the rate stays gated on the #3769-enforce soak-readiness gate.

Verification (TDD)

clean → 1 summary + 0 violation; N-violation → N violation + 1 summary(count=N) with #3710 parity scoped by recordKind; bridge round-trip of the new fields; rate over fixtures; empty → 0. 1488 tests pass; typecheck, lint, governance, producer-consumer clean.
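
As a rough illustration of what "parity scoped by recordKind" means in these cases, the check below treats one evaluation's records as a list and compares the violation-record count against the summary's violationCount. The `AuditRecord` shape and the `parityHolds` helper are hypothetical; the actual assertions live in the test suite.

```typescript
// Assumed minimal shape of one evaluation's persisted records.
interface AuditRecord {
  recordKind: "summary" | "violation";
  violationCount?: number;
}

// Hypothetical check: exactly one summary per evaluation, and the number
// of violation records equals that summary's violationCount.
function parityHolds(records: AuditRecord[]): boolean {
  const summaries = records.filter((r) => r.recordKind === "summary");
  const violations = records.filter((r) => r.recordKind === "violation");
  return (
    summaries.length === 1 &&
    summaries.every((s) => s.violationCount === violations.length)
  );
}

// clean → 1 summary + 0 violation
const cleanRun: AuditRecord[] = [{ recordKind: "summary", violationCount: 0 }];

// N-violation → N violation + 1 summary(count=N), here with N = 3
const threeViolationRun: AuditRecord[] = [
  { recordKind: "violation" },
  { recordKind: "violation" },
  { recordKind: "violation" },
  { recordKind: "summary", violationCount: 3 },
];

// An unscoped count sees N + 1 records, which is why the parity
// assertion has to filter by recordKind after this change.
const unscopedCount = threeViolationRun.length;
```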

🤖 Generated with Claude Code

…ate (#3727)

The pipeline policy gate's durable dual-emit was gated on `violations.length > 0`,
so a CLEAN evaluation wrote nothing — the durable log captured only the numerator
(violations), and the would-block RATE had no denominator (total evaluations).
The soak window (warn-mode default since #3769) was accumulating with no
recoverable denominator.

Fix (7/7 higher_order vote — Option A, SUPPLEMENT): emit ONE per-evaluation
summary record (recordKind:'summary', violationCount) on EVERY evaluation incl
clean ones, IN ADDITION to the per-violation records (now recordKind:'violation')
— preserving the #3710 count-parity invariant (the parity assertion now scopes by
recordKind). New computePolicyWouldBlockRate(events): denominator = summary
records, numerator = summaries with violationCount>0. recordKind + violationCount
round-trip through the audit bridge into the persisted record (extracted
pipelinePolicyMetadata to keep mapPolicyGate under the complexity cap).

The inputTrustTier:'4' the issue also flagged is BY DESIGN (pipeline policy has no
single user tier; documented) — left unchanged. Live-routing use of the rate stays
gated on #3769-enforce readiness.

TDD: clean→1 summary+0 violation; N-violation→N violation+1 summary(count=N) with
#3710 parity scoped by recordKind; bridge round-trip; rate over fixtures; empty→0.
1488 tests pass; typecheck/lint/governance/producer-consumer clean.

Closes #3727.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…summary record

The two runDevPipeline durable-policy-audit integration tests asserted one
policy_gate record per evaluation; #3727 adds a per-evaluation summary record, so
each warn-mode violation now persists 2 records (1 violation + 1 summary). Scope
the per-violation parity assertions by metadata.recordKind and assert the summary
record's violationCount. verifyChain still passes (chain integrity unaffected).
Caught by CI (full suite); the policy-evaluator unit copies were already updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@williamzujkowski williamzujkowski merged commit 2bdda5d into main Jun 9, 2026
42 checks passed
@williamzujkowski williamzujkowski deleted the feat/policy-audit-denominator-3727 branch June 9, 2026 20:15
@github-project-automation github-project-automation Bot moved this from Backlog to Done in nexus-agents project Jun 9, 2026
Development

Successfully merging this pull request may close these issues.

observability: durable policy audit (#3710) records violations only — no allow-baseline for the would-block rate
