feat(replay): operational integrity scripts (verify, find-gaps, reconcile) #351
ctol3r wants to merge 1 commit into
…cile)

Stacked on #344 (Wave 14 survivability suite). Adds three read-only CLI tools under scripts/replay/ that exercise the canonical replay-identity contract from PR #343 + #344 against arbitrary evidence inputs (JSON fixtures or, in future, Prisma snapshots). Operationally, these are the offline / CI equivalents of the survivability properties already pinned by the in-process jest suite in apps/api/backend/src/services/replay/__tests__/.

New files:

scripts/replay/_lib.ts
  parseArgs, normalizeEvidence, loadEvidenceFromFile, chronologyKey, hoursBetween, emitJsonLine / emitHumanLine: dependency-light helpers shared across all three scripts.

scripts/replay/verify-replay-integrity.ts
  Recomputes lineageKey + runId for an evidence set and verifies:
  1. Determinism: 25 recomputations collapse to one runId
  2. Cosmetic-input invariance: whitespace/order changes are absorbed
  3. Sensitivity: tampered inputs always diverge
  4. (Optional) expected-match: runId matches an externally provided expected value (audit-corruption signal)
  Exit codes: 0 ok / 2 bad input / 3 integrity FAILED

scripts/replay/find-replay-gaps.ts
  Scans a chronological list of snapshots (same lineageKey) and reports continuity gaps longer than --max-gap-hours (default 720h).
  Exit codes: 0 continuous / 2 bad input / 4 gap(s) detected

scripts/replay/reconcile-lineage.ts
  Reconciles two candidate evidence sets:
  same-lineage-same-snapshot (exit 0): perfectly reconciled
  same-lineage-different-snapshot (exit 5): lineage continuous, evidence delta surfaced (artifactsAdded/Removed, lastCheckedAtChanged, channelChanged)
  different-lineage (exit 6): DIFFERENT SUBJECTS, verifier action required

apps/web/__tests__/replay-scripts.test.ts
  28 vitest cases: every exported pure function exercised, plus _lib helpers (parseArgs, normalizeEvidence, chronologyKey, hoursBetween).

The three pure functions (verifyReplayIntegrity, findReplayGaps, reconcileLineage) are exported separately from the CLI main() wrappers, so they are directly unit-testable without process spawning or filesystem fixtures.

Survivability properties enforced by these scripts complement #344's in-process suite: where #344 pins the contract inside a jest run, these scripts let operators verify the contract against real production evidence (e.g. an exported snapshot manifest) post-restart, post-deploy, or as part of CI integrity gates.

Validation: 28/28 vitest passing; pnpm turbo run build --filter @vitalcv/web → 13/13 tasks; truth-strings CLEAN.
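The gap scan described for find-replay-gaps can be pictured with a minimal sketch. This is not the PR's implementation: the `Snapshot` shape, the `capturedAt` field, and the name `findGapsSketch` are assumptions made for illustration, and the sketch assumes every timestamp parses as a valid date.

```typescript
// Hypothetical snapshot shape; the PR does not show the real type.
interface Snapshot {
  id: string;
  capturedAt: string; // ISO-8601 timestamp (assumed field name)
}

interface Gap {
  fromId: string;
  toId: string;
  hours: number;
}

const MS_PER_HOUR = 60 * 60 * 1000;

// Sort by time, then report each adjacent pair that is further apart than
// maxGapHours. The 720h default mirrors the default stated in the commit message.
function findGapsSketch(snapshots: Snapshot[], maxGapHours = 720): Gap[] {
  const ordered = snapshots
    .map((s) => ({ ...s, t: Date.parse(s.capturedAt) }))
    .sort((a, b) => a.t - b.t);

  const found: Gap[] = [];
  for (let i = 1; i < ordered.length; i++) {
    const hours = (ordered[i].t - ordered[i - 1].t) / MS_PER_HOUR;
    if (hours > maxGapHours) {
      found.push({ fromId: ordered[i - 1].id, toId: ordered[i].id, hours });
    }
  }
  return found;
}

// Input is deliberately out of order; "b" and "c" are 51 days (1224h) apart.
const detected = findGapsSketch([
  { id: "a", capturedAt: "2024-01-01T00:00:00Z" },
  { id: "c", capturedAt: "2024-03-01T00:00:00Z" },
  { id: "b", capturedAt: "2024-01-10T00:00:00Z" },
]);
console.log(detected);
```

A CLI wrapper around a function like this would presumably map an empty result to exit 0 and a non-empty one to exit 4, matching the exit codes listed above.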
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1bf9cce270
if (typeof require !== 'undefined' && require.main === module) {
  main();
Use ESM-safe entrypoint detection for CLI main()
This file is executed as an ES module (#!/usr/bin/env -S node --import=tsx/esm), but the require.main === module guard is CommonJS-only. In ESM, require is undefined, so this condition is false and main() never runs, causing the script to exit without performing any checks; the same guard pattern is also present in scripts/replay/find-replay-gaps.ts and scripts/replay/reconcile-lineage.ts.
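One common ESM-safe alternative is to compare `import.meta.url` with the file URL of `process.argv[1]`. The helper below is a sketch of that idea, not code from this PR; the name `isMainModule` is hypothetical, and the comparison can miss when the script is launched through a symlink.

```typescript
import { pathToFileURL } from "node:url";

// Returns true when the module identified by `metaUrl` is the file Node was
// asked to run. `argv1` is normally process.argv[1].
function isMainModule(metaUrl: string, argv1: string | undefined): boolean {
  if (!argv1) return false;
  return metaUrl === pathToFileURL(argv1).href;
}

// Intended use at the bottom of a CLI script (left as a comment so this
// sketch does not depend on the module format it is compiled under):
//
//   if (isMainModule(import.meta.url, process.argv[1])) {
//     main();
//   }
```

Keeping the check in a small pure function also makes it testable without spawning a process, which fits the PR's stated goal of separating pure logic from the CLI wrappers.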
const tamperedCheckedAt = computeRunId({
  ...evidence,
  lastCheckedAt: evidence.lastCheckedAt
    ? new Date(new Date(evidence.lastCheckedAt).getTime() + 1000).toISOString()
Handle invalid lastCheckedAt before calling toISOString
normalizeEvidence accepts any non-empty string for lastCheckedAt, but this sensitivity check immediately does new Date(...).toISOString(). For invalid timestamp strings, getTime() becomes NaN and toISOString() throws RangeError, so bad input crashes the script instead of producing the documented bad-input behavior.
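One way to avoid the crash is to validate the parsed timestamp before shifting it. The helper below is a sketch under that assumption; `shiftIsoTimestamp` is a hypothetical name, and how a `null` result should map to the documented bad-input exit code is left to the script.

```typescript
// Returns the timestamp shifted by `deltaMs` as an ISO string, or null when
// the input cannot be parsed as a date.
function shiftIsoTimestamp(value: string, deltaMs: number): string | null {
  const ms = new Date(value).getTime();
  if (Number.isNaN(ms)) return null;
  const shifted = new Date(ms + deltaMs);
  // A valid input can still leave the representable Date range once shifted.
  return Number.isNaN(shifted.getTime()) ? null : shifted.toISOString();
}

console.log(shiftIsoTimestamp("2024-01-01T00:00:00.000Z", 1000)); // "2024-01-01T00:00:01.000Z"
console.log(shiftIsoTimestamp("not-a-date", 1000)); // null
```

Rejecting the value earlier, inside normalizeEvidence, would be an alternative design that keeps the sensitivity check itself free of validation logic.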
Summary
Stacked on #344 (Wave 14 survivability suite). Adds three read-only CLI tools under scripts/replay/ that operators can run against arbitrary evidence inputs (JSON fixtures or, in future, exported snapshot manifests) to verify the canonical replay-identity contract from PR #343 + #344.

Where #344 pins survivability inside an in-process jest run, these scripts let operators verify the contract against real production evidence after a restart, after a deploy, or as part of CI integrity gates.
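To make "verifying the contract" concrete, here is a rough sketch of the three core properties (determinism, cosmetic invariance, sensitivity) checked against a stand-in identity function. Everything here is assumed for illustration: the `Evidence` shape, the `subject` field, and `sketchRunId` are hypothetical and do not reproduce the real computeRunId from #343/#344.

```typescript
import { createHash } from "node:crypto";

// Hypothetical evidence shape, loosely based on fields this PR mentions.
interface Evidence {
  subject: string;
  artifacts: string[];
  lastCheckedAt: string;
}

// Stand-in for the real computeRunId: hash a canonical form in which
// surrounding whitespace and artifact order do not matter.
function sketchRunId(e: Evidence): string {
  const canonical = JSON.stringify({
    subject: e.subject.trim(),
    artifacts: e.artifacts.map((a) => a.trim()).sort(),
    lastCheckedAt: e.lastCheckedAt.trim(),
  });
  return createHash("sha256").update(canonical).digest("hex");
}

function checkContract(e: Evidence) {
  const base = sketchRunId(e);
  // Determinism: 25 recomputations must collapse to a single id.
  const ids = new Set(Array.from({ length: 25 }, () => sketchRunId(e)));
  // Cosmetic invariance: padding and reordering must not change the id.
  const cosmetic = sketchRunId({
    ...e,
    subject: `  ${e.subject} `,
    artifacts: [...e.artifacts].reverse(),
  });
  // Sensitivity: a substantive change must change the id.
  const tampered = sketchRunId({ ...e, artifacts: [...e.artifacts, "extra"] });
  return {
    deterministic: ids.size === 1,
    cosmeticInvariant: cosmetic === base,
    sensitive: tampered !== base,
  };
}

const report = checkContract({
  subject: "subject-1",
  artifacts: ["b.pdf", "a.pdf"],
  lastCheckedAt: "2024-01-01T00:00:00.000Z",
});
console.log(report);
```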
Files
- scripts/replay/_lib.ts: parseArgs, normalizeEvidence, loadEvidenceFromFile, chronologyKey, hoursBetween, JSON-line emitters
- scripts/replay/verify-replay-integrity.ts: recomputes lineageKey + runId and asserts determinism, cosmetic-invariance, sensitivity, optional expected-match
- scripts/replay/find-replay-gaps.ts: reports continuity gaps longer than --max-gap-hours
- scripts/replay/reconcile-lineage.ts: classifies two evidence sets as same-lineage-same-snapshot / same-lineage-different-snapshot / different-lineage
- apps/web/__tests__/replay-scripts.test.ts: vitest cases for the exported pure functions plus _lib helpers

Exit-code taxonomy
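- verify-replay-integrity: 0 ok / 2 bad input / 3 integrity FAILED
- find-replay-gaps: 0 continuous / 2 bad input / 4 gap(s) detected
- reconcile-lineage: 0 same-lineage-same-snapshot / 5 same-lineage-different-snapshot / 6 different-lineage

Distinct exit codes mean a CI pipeline can react differently: gap detection is informational (exit 4), while different-lineage on a receipt reconciliation is a hard alarm (exit 6).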
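The reconcile-lineage outcomes and their exit codes can be sketched as a small classification. This is illustrative only: the `ReplayIdentity` shape and the `classify` function are hypothetical, and the sketch assumes that two evidence sets describe the same snapshot exactly when their runId values match.

```typescript
// Hypothetical identity shape: only the two keys named in this PR.
interface ReplayIdentity {
  lineageKey: string;
  runId: string;
}

type Outcome =
  | "same-lineage-same-snapshot"
  | "same-lineage-different-snapshot"
  | "different-lineage";

// Exit codes as listed in this PR for reconcile-lineage.
const EXIT_CODE: Record<Outcome, number> = {
  "same-lineage-same-snapshot": 0,
  "same-lineage-different-snapshot": 5,
  "different-lineage": 6,
};

// Lineage is compared first: if the subjects differ, snapshot equality is moot.
function classify(a: ReplayIdentity, b: ReplayIdentity): Outcome {
  if (a.lineageKey !== b.lineageKey) return "different-lineage";
  return a.runId === b.runId
    ? "same-lineage-same-snapshot"
    : "same-lineage-different-snapshot";
}

const outcome = classify(
  { lineageKey: "L1", runId: "r1" },
  { lineageKey: "L1", runId: "r2" },
);
console.log(outcome, EXIT_CODE[outcome]);
```

A pipeline consuming these codes could then treat 5 as a prompt to inspect the surfaced evidence delta and 6 as a blocking failure.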
Survivability matrix coverage
The scripts collectively cover all six runtime-turbulence scenarios from docs/architecture/replay-survivability-matrix.md (shipped in #344):
- verify-replay-integrity: determinism + expected-match
- verify-replay-integrity: sensitivity + expected-match
- verify-replay-integrity on degraded fixture
- verify-replay-integrity: determinism over 25 iterations
- reconcile-lineage: surfaces evidence delta
- verify-replay-integrity: wall-clock-independent ids
- find-replay-gaps
- reconcile-lineage

Truth rules
Validation
- pnpm turbo run build --filter @vitalcv/web → 13/13 tasks
- … replayIdentity module
- … apps/web/__tests__/ so they run via the existing vitest config; scripts themselves remain at repo root per the brief

Usage
Out of scope (explicit follow-ups)
scripts/runtime/verify-runtime-truth.ts(the runtime-coherence gate mentioned in a separate brief) — different concern, different worktree