
feat(replay): cross-surface convergence verifier (operational tool)#354

Open
ctol3r wants to merge 1 commit into wave/replay-integrity-scripts from wave/replay-surface-convergence


ctol3r commented May 13, 2026

Summary

Stacked on #351. Fourth CLI tool — probes every trust-bearing surface VitalCV exposes for a given NPI and verifies they all report THE SAME replay-identity truth.

Where #351's scripts work on offline JSON fixtures, this one talks to a live runtime over HTTP and catches cross-surface drift — the failure mode where a verifier reading the passport page and the receipt JWT for the same NPI would see different lineageKey / runId / checkedAt / active kid.

Probed surfaces

| Surface | Shipped in |
| --- | --- |
| /api/passport/npi/:npi | Wave 10 (#343) |
| /api/passport/:npi (proxy) | #340 |
| /api/receipt/:npi (signed JWT) | #349 |
| /.well-known/jwks.json | #349 |
| /.well-known/did.json | #349 |
| /.well-known/trust-register | #349 |
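
As a rough illustration of how the six probe URLs could be derived from a base URL and an NPI, here is a minimal sketch. The function name `surfaceUrls` and the keys of the returned map are hypothetical; only the paths come from the table above.

```typescript
// Hypothetical helper: builds one absolute URL per probed surface.
// The path templates mirror the surface table; everything else is illustrative.
function surfaceUrls(baseUrl: string, npi: string): Record<string, string> {
  const id = encodeURIComponent(npi);
  const paths: Record<string, string> = {
    passportByNpi: `/api/passport/npi/${id}`,
    passportProxy: `/api/passport/${id}`,
    receipt: `/api/receipt/${id}`,
    jwks: "/.well-known/jwks.json",
    did: "/.well-known/did.json",
    trustRegister: "/.well-known/trust-register",
  };
  const urls: Record<string, string> = {};
  for (const [name, path] of Object.entries(paths)) {
    // new URL(path, base) resolves an absolute path against the base origin.
    urls[name] = new URL(path, baseUrl).toString();
  }
  return urls;
}
```

Note that an absolute path replaces any path prefix on the base URL, so this sketch assumes the runtime is served from the origin root.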

Five convergence findings

| Finding | Asserts |
| --- | --- |
| lineageKey-consistency | every surface emitting one agrees |
| runId-consistency | every surface emitting one agrees |
| checkedAt-consistency | passport lastCheckedAt == receipt vcv.checkedAt |
| issuer-kid-consistency | JWKS, did.json verificationMethod, trust-register, receipt X-Receipt-Kid header, and JWT protected header all carry the same kid |
| receipt-chronology | receipt iat seconds == passport lastCheckedAt seconds (proves #349's deterministic pinning) |
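
The receipt-chronology comparison can be sketched as follows. This assumes `lastCheckedAt` is an ISO 8601 timestamp string and that `iat` is a JWT NumericDate (whole seconds since the epoch); the helper names are hypothetical and not taken from the script.

```typescript
// Hypothetical helper: converts an ISO 8601 timestamp to whole epoch seconds.
// Returns undefined when the string cannot be parsed.
function toEpochSeconds(isoTimestamp: string): number | undefined {
  const ms = Date.parse(isoTimestamp);
  return Number.isNaN(ms) ? undefined : Math.floor(ms / 1000);
}

// True when the receipt's iat falls in the same second as the passport's
// lastCheckedAt. Sub-second precision on the passport side is truncated.
function receiptChronologyOk(receiptIat: number, passportLastCheckedAt: string): boolean {
  const checkedAtSeconds = toEpochSeconds(passportLastCheckedAt);
  return checkedAtSeconds !== undefined && checkedAtSeconds === receiptIat;
}
```

Truncating to seconds before comparing is what lets a passport timestamp with millisecond precision match an integer `iat`.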

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | every surface agrees |
| 2 | bad input (missing/invalid NPI) |
| 7 | divergence detected (verifier-actionable signal) |
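
One way the exit codes could be derived from the findings is sketched below. The `Finding` shape and the function name `exitCodeFor` are assumptions for illustration; only the numeric codes come from the table above.

```typescript
// Hypothetical finding shape: the real script may carry more fields.
type Finding = { name: string; ok: boolean };

// Maps input validity and findings onto the documented exit codes:
// 2 for bad input, 7 if any finding diverged, 0 otherwise.
function exitCodeFor(inputValid: boolean, findings: readonly Finding[]): 0 | 2 | 7 {
  if (!inputValid) return 2;
  return findings.every((f) => f.ok) ? 0 : 7;
}
```

A CLI wrapper would typically assign the result to `process.exitCode` rather than calling `process.exit` directly, so pending output can flush.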

Graceful degradation

Failed surface probes (network errors, 5xx, missing routes) drop out of the findings rather than falsely flagging divergence. A missing receipt does not produce a kid mismatch — it just reduces the observation count on the relevant finding.
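
The drop-out behaviour described above can be sketched like this. It is a minimal illustration, assuming a failed probe is represented as an observation whose `value` is `undefined`; the type and function names are hypothetical.

```typescript
// Hypothetical shapes: a failed or missing probe yields value === undefined.
type Observation = { source: string; value: string | undefined };
type ConsistencyFinding = { name: string; ok: boolean; count: number; distinct: string[] };

// Ignores observations without a value, then checks that the rest agree.
function consistencyFinding(name: string, observations: readonly Observation[]): ConsistencyFinding {
  const values: string[] = [];
  for (const o of observations) {
    if (o.value !== undefined) values.push(o.value);
  }
  const distinct = Array.from(new Set(values));
  // Zero or one distinct value means nothing disagreed.
  return { name, ok: distinct.length <= 1, count: values.length, distinct };
}
```

With a missing receipt, for example, `consistencyFinding("issuer-kid-consistency", [{ source: "jwks", value: "k1" }, { source: "did.json", value: "k1" }, { source: "receipt", value: undefined }])` stays ok and reports a count of 2 rather than 3.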

Pure analyzer + CLI wrapper

analyzeConvergence(observations) is exported separately from the main() wrapper, so it's unit-testable with inert fixture observations. No live HTTP, no JWT signing, no process spawning in tests.

Files

| File | Status |
| --- | --- |
| scripts/replay/verify-surface-convergence.ts | new (script + exported analyzeConvergence) |
| apps/web/__tests__/replay-surface-convergence.test.ts | new (9 vitest cases) |

Operational hook

This is pre-flight tooling for the Tier-1 merge plan from the canonical merge graph: after each Tier-A merge, operators run this against the Vercel preview URL + Railway preview URL to verify the new surfaces agree BEFORE promoting to production. It is faster than a full Codex re-run and catches divergence that Codex cannot see, because Codex reviews per PR rather than across surfaces.

Truth rules

  • Banned-strings scan: CLEAN
  • No new product claims; tool reports observable cross-surface state

Validation

  • Targeted vitest: 9/9 passing
  • Full web build: pnpm turbo run build --filter @vitalcv/web → 13/13 tasks
  • No new runtime dependencies (uses jose for JWT decode, already in deps)

Out of scope (explicit follow-ups)

  • Reverse-lookup variant by lineageKey (rather than NPI) — would need a backend endpoint
  • Multi-NPI batch mode — trivial extension; deferred until operator demand
  • Production-gate CI workflow that runs this against the preview URL on every merge — would be a small .github/workflows/ PR

Stacked on #351 (replay integrity scripts). Adds a fourth CLI tool
that probes every trust-bearing surface VitalCV exposes for a given
NPI and verifies they all report THE SAME replay-identity truth.

Where #351's scripts operate on offline evidence fixtures, this one
talks to a live runtime over HTTP and reports cross-surface drift —
the failure mode where a verifier reading the passport page and the
receipt JWT for the same NPI would see different lineageKeys, runIds,
checkedAts, or active kids.

Probed surfaces:
  /api/passport/npi/:npi      (Wave 10 / #343)
  /api/passport/:npi          (proxy / #340)
  /api/receipt/:npi           (#349)
  /.well-known/jwks.json      (#349)
  /.well-known/did.json       (#349)
  /.well-known/trust-register (#349)

Five convergence findings:
  1. lineageKey-consistency  — every surface emitting one must agree
  2. runId-consistency       — every surface emitting one must agree
  3. checkedAt-consistency   — passport.lastCheckedAt == receipt.vcv.checkedAt
  4. issuer-kid-consistency  — JWKS, did.json, trust-register, receipt
                                header X-Receipt-Kid, and JWT's protected
                                header all carry the same kid
  5. receipt-chronology      — receipt.iat seconds == passport.lastCheckedAt
                                seconds (proves the deterministic
                                pinning from #349)

The pure analyzer `analyzeConvergence(observations)` is exported
separately from the CLI main() wrapper so it's unit-testable with
inert fixture observations — no live HTTP, no JWT signing, no
process spawning.

Exit codes:
  0  every surface agrees
  2  bad input (missing/invalid NPI)
  7  divergence detected (verifier-actionable signal)

Failed surface probes (network errors, 5xx) gracefully drop out of
the findings rather than falsely flagging divergence — a missing
receipt does not produce a kid mismatch, it just reduces the
observation count.

Tests (apps/web/__tests__/replay-surface-convergence.test.ts, 9
vitest cases):
  - coherent baseline (all 5 findings ok)
  - lineageKey divergence between passport and receipt
  - runId divergence between two passport surfaces
  - checkedAt divergence between passport and receipt vcv block
  - kid divergence between jwks and did.json (rotation drift)
  - receipt chronology drift (iat off by 60s vs passport)
  - degraded probe handling: jwks 503 doesn't falsely flag
  - missing receipt: chronology check short-circuits ok
  - single-source observations report ok with count=1

Usage:
  pnpm exec tsx scripts/replay/verify-surface-convergence.ts \
    --base-url http://localhost:3030 \
    --npi 1346053246
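
A minimal sketch of how the two flags in this usage line could be parsed with Node's built-in `parseArgs`. The function name `parseCliArgs` and the 10-digit check are assumptions for illustration; the real script's validation may be stricter.

```typescript
import { parseArgs } from "node:util";

// Hypothetical parser for --base-url and --npi.
// Returns undefined on missing, malformed, or unknown flags, which a caller
// could map to the "bad input" exit code.
function parseCliArgs(argv: string[]): { baseUrl: string; npi: string } | undefined {
  try {
    const { values } = parseArgs({
      args: argv,
      options: {
        "base-url": { type: "string" },
        npi: { type: "string" },
      },
    });
    const baseUrl = values["base-url"];
    const npi = values.npi;
    if (typeof baseUrl !== "string" || typeof npi !== "string") return undefined;
    // Assumption: an NPI is accepted if it is exactly 10 digits (no check-digit test).
    if (!/^\d{10}$/.test(npi)) return undefined;
    return { baseUrl, npi };
  } catch {
    // parseArgs throws on unknown options in its default strict mode.
    return undefined;
  }
}
```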

This is operational pre-flight tooling for the Tier-1 merge plan
documented in the canonical merge graph: after each Tier-A merge,
operators run this against the Vercel preview URL + Railway preview
URL to verify the new surfaces agree before promoting to production.

Validation: 9/9 vitest passing; pnpm turbo run build --filter
@vitalcv/web → 13/13 tasks; truth-strings CLEAN.

vercel Bot commented May 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vcv-web | Ready | Preview, Comment | May 13, 2026 0:38am |
| vitalcv | Ready | Preview, Comment | May 13, 2026 0:38am |


chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7982202b3f


}

function allEqual<T>(values: readonly T[]): boolean {
  if (values.length <= 1) return true;

[P1] Fail when convergence has no observations

Treating <= 1 values as converged causes empty datasets to pass as healthy: if the target is down (or all probes return no usable payloads), each consistency check becomes vacuously true and the script exits 0, which can incorrectly green-light a broken environment. This tool is described as a pre-flight gate, so it should require a minimum observation count (at least 2 for pairwise consistency, and at least one successful probe overall) before reporting success.
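
One possible shape for the suggested minimum-observation guard is sketched below. The `Verdict` type, the function name, and the default threshold are illustrative assumptions, not code from this PR.

```typescript
// Hypothetical three-way verdict: separates "nothing to compare" from "agrees".
type Verdict = "ok" | "diverged" | "insufficient";

// Requires at least minObservations values before reporting agreement.
function convergenceVerdict<T>(values: readonly T[], minObservations = 2): Verdict {
  if (values.length < minObservations) return "insufficient";
  const first = values[0];
  return values.every((v) => v === first) ? "ok" : "diverged";
}
```

Adopting a threshold of 2 would change the behaviour covered by the existing "single-source observations report ok with count=1" test case, so that case and the exit code used for an "insufficient" verdict would need a deliberate decision.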


Comment on lines +222 to +223
{ source: 'jwks', value: jwks?.keys?.[0]?.kid },
{ source: 'did.json', value: did?.verificationMethod?.[0]?.publicKeyJwk?.kid },

[P2] Compare active KID, not first JWKS/DID entry

Using keys[0] and verificationMethod[0] makes the result depend on array ordering rather than the active signing key, so normal key-rotation documents with multiple keys can be falsely flagged as divergence (or miss real divergence if stale keys happen to align). The consistency check should select the key indicated by active metadata (e.g., trust-register.signing.active_kid) or validate membership against all advertised kids.
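
The membership variant of this suggestion could look roughly like the sketch below. The document shapes are trimmed to the fields quoted in the diff, and the assumption that the active kid is supplied by the caller (for example from the trust-register) is illustrative.

```typescript
// Minimal shapes covering only the fields referenced in the diff.
type Jwks = { keys?: Array<{ kid?: string }> };
type DidDocument = { verificationMethod?: Array<{ publicKeyJwk?: { kid?: string } }> };

// Collects every kid a JWKS advertises, regardless of array order.
function jwksKids(jwks: Jwks | undefined): string[] {
  return (jwks?.keys ?? []).flatMap((k) => (k.kid !== undefined ? [k.kid] : []));
}

// Collects every kid embedded in the DID document's verification methods.
function didKids(did: DidDocument | undefined): string[] {
  return (did?.verificationMethod ?? []).flatMap((m) =>
    m.publicKeyJwk?.kid !== undefined ? [m.publicKeyJwk.kid] : [],
  );
}

// True when both documents advertise the active kid somewhere in their lists.
function activeKidAdvertised(activeKid: string, jwks: Jwks | undefined, did: DidDocument | undefined): boolean {
  return jwksKids(jwks).includes(activeKid) && didKids(did).includes(activeKid);
}
```

During a rotation where the JWKS lists the retiring key first, this check still passes as long as both documents include the active kid.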

