feat(replay): cross-surface convergence verifier (operational tool) #354
ctol3r wants to merge 1 commit into
Stacked on #351 (replay integrity scripts). Adds a fourth CLI tool that probes every trust-bearing surface VitalCV exposes for a given NPI and verifies they all report THE SAME replay-identity truth. Where #351's scripts operate on offline evidence fixtures, this one talks to a live runtime over HTTP and reports cross-surface drift: the failure mode where a verifier reading the passport page and the receipt JWT for the same NPI would see different lineageKeys, runIds, checkedAts, or active kids.

Probed surfaces:
  /api/passport/npi/:npi (Wave 10 / #343)
  /api/passport/:npi (proxy / #340)
  /api/receipt/:npi (#349)
  /.well-known/jwks.json (#349)
  /.well-known/did.json (#349)
  /.well-known/trust-register (#349)

Five convergence findings:
  1. lineageKey-consistency: every surface emitting one must agree
  2. runId-consistency: every surface emitting one must agree
  3. checkedAt-consistency: passport.lastCheckedAt == receipt.vcv.checkedAt
  4. issuer-kid-consistency: JWKS, did.json, trust-register, receipt header X-Receipt-Kid, and JWT's protected header all carry the same kid
  5. receipt-chronology: receipt.iat seconds == passport.lastCheckedAt seconds (proves the deterministic pinning from #349)

The pure analyzer `analyzeConvergence(observations)` is exported separately from the CLI main() wrapper so it's unit-testable with inert fixture observations: no live HTTP, no JWT signing, no process spawning.

Exit codes:
  0  every surface agrees
  2  bad input (missing/invalid NPI)
  7  divergence detected (verifier-actionable signal)

Failed surface probes (network errors, 5xx) gracefully drop out of the findings rather than falsely flagging divergence: a missing receipt does not produce a kid mismatch, it just reduces the observation count.
Tests (apps/web/__tests__/replay-surface-convergence.test.ts, 9 vitest cases):
  - coherent baseline (all 5 findings ok)
  - lineageKey divergence between passport and receipt
  - runId divergence between two passport surfaces
  - checkedAt divergence between passport and receipt vcv block
  - kid divergence between jwks and did.json (rotation drift)
  - receipt chronology drift (iat off by 60s vs passport)
  - degraded probe handling: jwks 503 doesn't falsely flag
  - missing receipt: chronology check short-circuits ok
  - single-source observations report ok with count=1

Usage:
    pnpm exec tsx scripts/replay/verify-surface-convergence.ts \
      --base-url http://localhost:3030 \
      --npi 1346053246

This is operational pre-flight tooling for the Tier-1 merge plan documented in the canonical merge graph: after each Tier-A merge, operators run this against the Vercel preview URL + Railway preview URL to verify the new surfaces agree before promoting to production.

Validation: 9/9 vitest passing; pnpm turbo run build --filter @vitalcv/web → 13/13 tasks; truth-strings CLEAN.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7982202b3f
    }

    function allEqual<T>(values: readonly T[]): boolean {
      if (values.length <= 1) return true;
Fail when convergence has no observations
Treating <= 1 values as converged causes empty datasets to pass as healthy: if the target is down (or all probes return no usable payloads), each consistency check becomes vacuously true and the script exits 0, which can incorrectly green-light a broken environment. This tool is described as a pre-flight gate, so it should require a minimum observation count (at least 2 for pairwise consistency, and at least one successful probe overall) before reporting success.
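One way to act on this suggestion is to give each consistency check a third outcome for "too few observations" instead of letting short inputs pass vacuously. The following is a minimal sketch, not the PR's code: the `Observation` shape, the `checkConsistency` name, the `"insufficient"` status, and the default `minCount` of 2 are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the PR's real types are not shown in this excerpt.
interface Observation<T> {
  source: string;
  value: T | undefined; // undefined models a probe that yielded no usable value
}

type ConsistencyStatus = "ok" | "divergent" | "insufficient";

interface ConsistencyResult<T> {
  status: ConsistencyStatus;
  count: number; // number of usable (defined) observations
  distinct: T[]; // distinct values seen, in first-seen order
}

// Assumed helper: reports "insufficient" instead of passing vacuously
// when fewer than `minCount` surfaces produced a value.
function checkConsistency<T>(
  observations: readonly Observation<T>[],
  minCount: number = 2,
): ConsistencyResult<T> {
  const values = observations
    .map((o) => o.value)
    .filter((v): v is T => v !== undefined);
  const distinct = Array.from(new Set(values));

  if (values.length < minCount) {
    return { status: "insufficient", count: values.length, distinct };
  }
  return {
    status: distinct.length === 1 ? "ok" : "divergent",
    count: values.length,
    distinct,
  };
}
```

A caller could then treat `"insufficient"` as a non-zero exit for a pre-flight gate. Under this sketch the PR's "single-source observations report ok with count=1" test case would need to change, so the threshold is a design decision for the author rather than a drop-in fix.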
    { source: 'jwks', value: jwks?.keys?.[0]?.kid },
    { source: 'did.json', value: did?.verificationMethod?.[0]?.publicKeyJwk?.kid },
Compare active KID, not first JWKS/DID entry
Using keys[0] and verificationMethod[0] makes the result depend on array ordering rather than the active signing key, so normal key-rotation documents with multiple keys can be falsely flagged as divergence (or miss real divergence if stale keys happen to align). The consistency check should select the key indicated by active metadata (e.g., trust-register.signing.active_kid) or validate membership against all advertised kids.
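The membership variant of this suggestion could look roughly like the sketch below. It assumes the active kid has already been read from some authoritative source (the review mentions `trust-register.signing.active_kid` as one candidate) and then checks that every fetched document advertises it, regardless of array position. The document interfaces cover only the fields quoted in the diff; `checkActiveKid` and its result shape are hypothetical names.

```typescript
// Hypothetical document shapes, limited to the fields quoted in the review.
interface JwksDoc {
  keys?: { kid?: string }[];
}
interface DidDoc {
  verificationMethod?: { publicKeyJwk?: { kid?: string } }[];
}

type KidStatus = "ok" | "divergent" | "insufficient";

interface KidFinding {
  status: KidStatus;
  checked: string[]; // sources that were fetched and inspected
  missingFrom: string[]; // sources that do not advertise the active kid
}

function definedKids(kids: (string | undefined)[]): string[] {
  return kids.filter((k): k is string => typeof k === "string");
}

// Assumed helper: order-independent membership check against the active kid.
function checkActiveKid(
  activeKid: string | undefined,
  jwks?: JwksDoc,
  did?: DidDoc,
): KidFinding {
  const advertised: { source: string; kids: string[] | undefined }[] = [
    {
      source: "jwks",
      kids: jwks?.keys ? definedKids(jwks.keys.map((k) => k.kid)) : undefined,
    },
    {
      source: "did.json",
      kids: did?.verificationMethod
        ? definedKids(did.verificationMethod.map((m) => m.publicKeyJwk?.kid))
        : undefined,
    },
  ];

  const checked: string[] = [];
  const missingFrom: string[] = [];
  for (const { source, kids } of advertised) {
    if (kids === undefined) continue; // probe failed: drop out, do not flag
    checked.push(source);
    if (activeKid !== undefined && kids.indexOf(activeKid) === -1) {
      missingFrom.push(source);
    }
  }

  if (activeKid === undefined || checked.length === 0) {
    return { status: "insufficient", checked, missingFrom };
  }
  return { status: missingFrom.length === 0 ? "ok" : "divergent", checked, missingFrom };
}
```

With a rotation document that lists a stale key first and the active key second, this reports `ok`, whereas comparing `keys[0]` against `verificationMethod[0]` would depend on how each document happens to order its entries.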
Summary
Stacked on #351. Fourth CLI tool — probes every trust-bearing surface VitalCV exposes for a given NPI and verifies they all report THE SAME replay-identity truth.
Where #351's scripts work on offline JSON fixtures, this one talks to a live runtime over HTTP and catches cross-surface drift: the failure mode where a verifier reading the passport page and the receipt JWT for the same NPI would see different lineageKey / runId / checkedAt / active kid.
Probed surfaces
- /api/passport/npi/:npi
- /api/passport/:npi (proxy)
- /api/receipt/:npi (signed JWT)
- /.well-known/jwks.json
- /.well-known/did.json
- /.well-known/trust-register
Five convergence findings
- lineageKey-consistency: every surface emitting one must agree
- runId-consistency: every surface emitting one must agree
- checkedAt-consistency: passport lastCheckedAt == receipt vcv.checkedAt
- issuer-kid-consistency: JWKS, did.json, trust-register, receipt X-Receipt-Kid header, and JWT protected header all carry the same kid
- receipt-chronology: receipt iat seconds == passport lastCheckedAt seconds (proves #349's deterministic pinning)
Exit codes
- 0: every surface agrees
- 2: bad input (missing/invalid NPI)
- 7: divergence detected (verifier-actionable signal)
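The receipt-chronology comparison and the exit-code contract (0, 2, 7) described in this PR can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: it assumes `lastCheckedAt` is an ISO-8601 string, that `iat` is a whole number of seconds as is usual for JWT NumericDate values, and that NPI validation is a simple 10-digit check. The names `chronologyMatches` and `exitCodeFor` are hypothetical.

```typescript
type FindingStatus = "ok" | "divergent";

interface Finding {
  name: string;
  status: FindingStatus;
}

// Exit codes as listed in the PR description.
const EXIT_OK = 0;
const EXIT_BAD_INPUT = 2;
const EXIT_DIVERGENCE = 7;

// Compare at whole-second resolution. Assumes lastCheckedAt is ISO-8601;
// an unparseable timestamp is treated as a mismatch.
function chronologyMatches(receiptIatSeconds: number, passportLastCheckedAt: string): boolean {
  const ms = Date.parse(passportLastCheckedAt);
  if (Number.isNaN(ms)) return false;
  return Math.floor(ms / 1000) === receiptIatSeconds;
}

// Assumed validation: an NPI is a 10-digit identifier. The real script may
// apply stricter checks (for example a check-digit test).
function exitCodeFor(npi: string | undefined, findings: readonly Finding[]): number {
  if (npi === undefined || !/^\d{10}$/.test(npi)) return EXIT_BAD_INPUT;
  return findings.some((f) => f.status === "divergent") ? EXIT_DIVERGENCE : EXIT_OK;
}
```

Truncating the passport timestamp to seconds before comparing keeps sub-second precision in `lastCheckedAt` from registering as drift, while the 60-second offset exercised by the test suite still fails the check.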
Graceful degradation
Failed surface probes (network errors, 5xx, missing routes) drop out of the findings rather than falsely flagging divergence. A missing receipt does not produce a kid mismatch — it just reduces the observation count on the relevant finding.
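A rough sketch of this drop-out behaviour: only probes that returned a 2xx status with a body contribute an observation, and everything else is filtered out before any comparison runs. The `ProbeOutcome` shape and the `collectObservations` helper are hypothetical and chosen for illustration; the PR's actual observation types are not shown here.

```typescript
// Hypothetical probe result; httpStatus === null models a network error.
interface ProbeOutcome {
  source: string;
  httpStatus: number | null;
  body: Record<string, unknown> | null;
}

interface UsableObservation<T> {
  source: string;
  value: T;
}

// Assumed helper: keep only probes that succeeded AND yielded a value.
// Failed probes reduce the observation count; they never add a mismatch.
function collectObservations<T>(
  outcomes: readonly ProbeOutcome[],
  pick: (body: Record<string, unknown>) => T | undefined,
): UsableObservation<T>[] {
  const usable: UsableObservation<T>[] = [];
  for (const outcome of outcomes) {
    const status = outcome.httpStatus;
    if (status === null || status < 200 || status >= 300) continue;
    if (outcome.body === null) continue;
    const value = pick(outcome.body);
    if (value !== undefined) usable.push({ source: outcome.source, value });
  }
  return usable;
}
```

For example, a `jwks` probe answering 503 alongside a healthy `did.json` yields a single observation, so a downstream kid comparison sees one value rather than a `kid` paired against `undefined`.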
Pure analyzer + CLI wrapper
analyzeConvergence(observations) is exported separately from the main() wrapper, so it's unit-testable with inert fixture observations. No live HTTP, no JWT signing, no process spawning in tests.
Files
- scripts/replay/verify-surface-convergence.ts (CLI; exports analyzeConvergence)
- apps/web/__tests__/replay-surface-convergence.test.ts
Operational hook
This is pre-flight tooling for the Tier-1 merge plan from the canonical merge graph: after each Tier-A merge, operators run this against the Vercel preview URL + Railway preview URL to verify the new surfaces agree BEFORE promoting to production. Faster than full Codex re-runs and catches divergence Codex can't see (which is per-PR, not cross-surface).
Truth rules
Validation
- pnpm turbo run build --filter @vitalcv/web → 13/13 tasks
- […] (jose for JWT decode, already in deps)
Out of scope (explicit follow-ups)
- […] lineageKey (rather than NPI): would need a backend endpoint
- […] .github/workflows/ […] PR