Merged
26 changes: 25 additions & 1 deletion CHANGELOG.md
@@ -2,7 +2,31 @@

All notable changes to `testing-os` are documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
## [1.3.2] — 2026-06-02

**`@dogfood-lab/dogfood-swarm` self-audit health pass.** The swarm runner audited itself with its own 10-phase protocol. Stage A (bug/security) landed 28 fixes; Stage C (hardening / operator-UX) followed with exit-code-contract and documentation closure; two deferred follow-ups then landed — fp-p-005 made the finding fingerprint a pure, injective function of the finding's own stable content (an edit-stable context-snippet hash), and fp-p-006 consolidated the agent-output schema into `@dogfood-lab/schemas`. The only package-shape change is one new internal workspace dependency (`@dogfood-lab/dogfood-swarm` → `@dogfood-lab/schemas`, see below); no breaking changes (fp-p-005's behavior change is backward-compatible — see below). Findings recorded under the run in [`swarms/swarm-1780390764-7dab/`](swarms/swarm-1780390764-7dab/).

### Security & correctness (Stage A)

- **Verify engine — honest verdicts** ([`packages/dogfood-swarm/lib/verify/runner.js`](packages/dogfood-swarm/lib/verify/runner.js), [`packages/dogfood-swarm/lib/verify/adapters/node.js`](packages/dogfood-swarm/lib/verify/adapters/node.js)). The wave gate no longer reports a clean `pass` for non-evidence. `no_tests` (ve-004) distinguishes "the repo has no `test` script and `npm test --if-present` ran nothing" from a real pass; `tool_missing` (ve-p-001) distinguishes "a required build tool is absent from `PATH`" from a code failure; `skip` (ve-005) stops an empty required-step set from being a vacuous `pass`. A typo'd `--threshold` now fails loud (`CLI_INVALID_THRESHOLD`, ve-002) instead of silently disabling the CI gate via a `NaN` comparison. Step output is bounded and per-step timeouts are tagged `timed_out` rather than misread as fast failures.
- **Finding-id & agent-output integrity.** The two criticals were ve-001 (verify classifier) and fp-001 (agent-output schema validation); the highs included fp-002 (within-wave fingerprint collisions), fp-003 (non-ASCII git paths), fp-004 (TOCTOU byte-gate), sm-001/002 (domain handling), and cli-001 (rewind). Agent outputs are now schema-validated at collect time and fail with a structured `AgentOutputValidationError`; fingerprints stay collision-resistant within a wave and across non-ASCII paths.
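
The `--threshold` failure mode above follows from how JavaScript compares against `NaN`. The block below is a minimal sketch of the idea, not the code in `cli.js`: `parseThreshold` and `gateTrips` are hypothetical names, the "count exceeds threshold" rule is an assumption, and only the `CLI_INVALID_THRESHOLD` code is taken from the entry above.

```javascript
// Hypothetical sketch: strict parsing for a numeric --threshold flag.
// Only the error code CLI_INVALID_THRESHOLD comes from the changelog.
function parseThreshold(raw) {
  const value = typeof raw === 'string' && raw.trim() !== '' ? Number(raw) : NaN;
  if (!Number.isFinite(value)) {
    const err = new Error(`invalid --threshold value: ${JSON.stringify(raw)}`);
    err.code = 'CLI_INVALID_THRESHOLD';
    throw err;
  }
  return value;
}

// Hypothetical gate (assumption): trips when the count exceeds the threshold.
function gateTrips(findingCount, threshold) {
  return findingCount > threshold;
}

// Every comparison against NaN is false, so an unvalidated typo such as "1o"
// would leave the gate permanently open.
const silentlyDisabled = gateTrips(99, Number('1o'));
```

Validating once at the CLI boundary means no later comparison ever sees `NaN`.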

### Hardening & operator UX (Stage C)

- **Exit-code contract closed on the CI-gate verbs** ([`packages/dogfood-swarm/cli.js`](packages/dogfood-swarm/cli.js)). `swarm verify` now exits non-zero on a `fail` verdict, and `swarm persist --ingest` exits non-zero when the dogfood ingest fails — aligning both with the 3-way (`0`/`1`/`2`) contract the `verify-*` and `findings` verbs already honored, so a non-interactive CI step can no longer go green on a hard failure.
- **Operator documentation** ([`packages/dogfood-swarm/README.md`](packages/dogfood-swarm/README.md)). The package README now documents the exit-code contract for every gate-capable verb, the five verify verdicts (`pass`/`fail`/`skip`/`no_tests`/`tool_missing`), the three scriptable environment variables (`SWARM_DB`, `DOGFOOD_FINDINGS_FORMAT`, `DOGFOOD_LOG_HUMAN`) with the NDJSON-on-stderr diagnostic channel, and a symptom→recovery-verb troubleshooting table that deep-links the handbook recovery and error-codes pages.
- **README→CLI contract test hardened** ([`packages/dogfood-swarm/meta-amendA-readme-contract.test.js`](packages/dogfood-swarm/meta-amendA-readme-contract.test.js)). The td-006 guard now also pins the operator-facing env-var vocabulary: it reads the real `process.env.*` literals from source and asserts each documented var appears in the README, closing the drift class (undocumented env vars) that the command-only check left open.
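
The env-var drift check described for the td-006 guard can be sketched roughly as follows. This is an illustration only: `envVarsInSource` and `undocumentedEnvVars` are hypothetical helpers, and the regular expression is an assumption about how `process.env.*` literals appear in source.

```javascript
// Hypothetical sketch: collect env-var names referenced as process.env.NAME.
function envVarsInSource(sourceText) {
  const names = new Set();
  for (const match of sourceText.matchAll(/process\.env\.([A-Z][A-Z0-9_]*)/g)) {
    names.add(match[1]);
  }
  return [...names].sort();
}

// Return every env var the source reads that the README never mentions.
function undocumentedEnvVars(sourceText, readmeText) {
  return envVarsInSource(sourceText).filter((name) => !readmeText.includes(name));
}

// Illustrative inputs (not the real files).
const sampleSource = `
  const db = process.env.SWARM_DB;
  const human = process.env.DOGFOOD_LOG_HUMAN === '1';
`;
const sampleReadme = 'Set `SWARM_DB` to point at the control-plane database.';
const drift = undocumentedEnvVars(sampleSource, sampleReadme);
```

A contract test built this way would assert that `drift` is empty, so adding a new `process.env.*` read without documenting it fails the suite.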

### Schema consolidation (fp-p-006)

- **One source of truth for the agent-output schema** ([`packages/schemas/src/json/agent-output.schema.json`](packages/schemas/src/json/agent-output.schema.json), [`packages/dogfood-swarm/lib/validate-agent-output.js`](packages/dogfood-swarm/lib/validate-agent-output.js), [`packages/dogfood-swarm/lib/templates.js`](packages/dogfood-swarm/lib/templates.js)). The fp-001 packaging fix had shipped a package-local copy at `packages/dogfood-swarm/schema/`, guarded by a byte-equality drift test, because the repo-root `scripts/agent-output.schema.json` was absent from the published tarball. fp-p-006 (deferred from the same self-audit) removes the controlled duplication: the schema now lives in `@dogfood-lab/schemas`, and both the collect-time validator and the dispatch prompt-builder resolve it through `createRequire` with the specifier `@dogfood-lab/schemas/json/agent-output.schema.json`, the same `./json/*` subpath pattern the eight contract schemas already use. `@dogfood-lab/dogfood-swarm` gains `@dogfood-lab/schemas` as a dependency; the package-local copy, the repo-root copy, and the `meta-amendA-schema-packaging.test.js` drift guard are deleted. The schema's `$id` moves to the canonical `packages/schemas/src/json/` path; because `$id` is a contract field, this is what drives the lockstep bump. The schema ships as a raw JSON subpath (not registered in `validatePayload`): it stays a swarm output envelope compiled with a local Ajv, allowlisted in the single-canonical-validator gate ([`scripts/check-validator-cache-singleton.test.mjs`](scripts/check-validator-cache-singleton.test.mjs)).
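
To illustrate the subpath pattern, the sketch below builds a throwaway package and loads a JSON file from it through `createRequire`. Everything here is a stand-in: `@demo/schemas`, its `exports` map, and the schema body are assumptions made for the demo, not the real `@dogfood-lab/schemas` package layout.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { createRequire } = require('node:module');

// Build a temporary package that exposes ./json/* as a subpath export.
// The package name and exports map are hypothetical.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'schemas-demo-'));
const pkgDir = path.join(root, 'node_modules', '@demo', 'schemas');
fs.mkdirSync(path.join(pkgDir, 'src', 'json'), { recursive: true });
fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify({
    name: '@demo/schemas',
    version: '0.0.0',
    exports: { './json/*': './src/json/*' },
  }),
);
fs.writeFileSync(
  path.join(pkgDir, 'src', 'json', 'agent-output.schema.json'),
  JSON.stringify({ $id: 'demo:agent-output', type: 'object' }),
);

// createRequire takes the file (or URL) to resolve from and returns a
// require function; the package specifier is passed to that function.
const requireFromRoot = createRequire(path.join(root, 'index.js'));
const schema = requireFromRoot('@demo/schemas/json/agent-output.schema.json');

// Clean up the temporary package once the schema is loaded.
fs.rmSync(root, { recursive: true, force: true });
```

In an ES module the usual anchor would be `createRequire(import.meta.url)`; the resolution through the consumer's dependency tree is the same.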

### Fingerprint stability (fp-p-005)

- **Edit-stable context-snippet hash → injective base fingerprints** ([`packages/dogfood-swarm/lib/fingerprint.js`](packages/dogfood-swarm/lib/fingerprint.js), [`packages/dogfood-swarm/commands/collect.js`](packages/dogfood-swarm/commands/collect.js)). The base fingerprint was `sha256(category | rule_id | path | symbol | 10-line-bucket)`. Two genuinely-distinct symbol-less findings in the same file and bucket collided on the base fp; fp-002's `disambiguateFingerprints` salted the collision apart, correctly but with bounded residual new/recurring churn when a collision group grew or shrank across waves. fp-p-005 (deferred from the same self-audit) folds in an **edit-stable context-snippet hash** — the surrounding ~7 source lines around the finding, whitespace-collapsed and line-ending-normalized — as the LOCATION component when the source file is readable at collect time. This is the CodeQL `primaryLocationLineHash` design (hash the surrounding *content*, not the line number): it survives reflow, re-indentation, and code inserted elsewhere that shifts the finding's line number, while giving two findings at different points in one file *different* base fingerprints. The base fp is now a pure, injective function of the finding's own stable content, so `disambiguateFingerprints` is demoted from the primary collision mechanism to a **safety net** that fires only on the no-source fallback path and the rare case of two findings with byte-identical surrounding source. `computeFingerprint(finding, { sourceText })` reads no filesystem itself — `collect.js` reads each finding's file once (cached, size-guarded at 2 MB, path-contained to the worktree) and threads the text in. Coverity's enclosing-function key is the same idea at function granularity; the existing `symbol` component already carries the enclosing function name when the auditor reports one.
- **Backward-compatible by construction.** When no source is available (synthetic finding, deleted/unresolvable file, file-level finding with no line, or a path that escapes the worktree), LOCATION degrades to the historical 10-line bucket and the fingerprint is **byte-for-byte** what it was before — so the B-BACK-002 description-stability contract and the existing cross-wave dedup of source-less findings are untouched. The optional second argument means every existing `computeFingerprint(finding)` call site is unaffected.
- **Semantics note (one-time re-fingerprint).** Because the LOCATION encoding changes when source is present, a finding carried in a pre-upgrade `control-plane.db` will get a *new* (context-folded) fingerprint the first time it is re-audited with source available — a one-time `new` + `fixed`/`unverified` churn on that first post-upgrade wave, after which it is stable. The live `control-plane.db` on this rig holds zero findings, so there is no migration impact here; long-lived stores elsewhere will see the one-time churn once and then settle.
- **Tests** ([`packages/dogfood-swarm/meta-amendA-findings-persist.test.js`](packages/dogfood-swarm/meta-amendA-findings-persist.test.js), [`packages/dogfood-swarm/d3b-006-finding-id-collision.test.js`](packages/dogfood-swarm/d3b-006-finding-id-collision.test.js)). New coverage locks: `extractContextSnippet` null/edge cases + reflow/CRLF/indentation invariance; `computeFingerprint` injectivity for distinct same-bucket locations, B-BACK-002 stability via source (not prose), reflow survival, and byte-identical no-source fallback; the fp-002 cross-wave scenario re-run **in both input orders** with source, proving no collision group forms (A keeps its `finding_id` as `recurring`, B inserts as `new`, identically regardless of order) and nothing is salted; a real-worktree `collect()` integration test proving two same-bucket findings persist as two distinct rows whose fingerprints are the context-hash fps (not the shared no-source bucket fp); and the D3B-006 content-addressed `finding_id` derivation composed with a context-folded fingerprint. The fp-002/fp-r-001/fp-p-001 occurrence-salting tests stay green — they now exercise the no-source safety-net path.
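
The context-snippet idea can be sketched as below. This is a simplified illustration under stated assumptions, not the code in `fingerprint.js`: `contextSnippetSketch` and `fingerprintSketch` are hypothetical stand-ins, and the window radius of 3 lines, the `ctx:`/`bucket:` prefixes, and the `|`-joined field order are all choices made for the example.

```javascript
const { createHash } = require('node:crypto');

const sha256 = (text) => createHash('sha256').update(text, 'utf8').digest('hex');

// Hypothetical sketch: take ~7 lines around a 1-based line number,
// normalize line endings, collapse whitespace, and drop blank lines.
function contextSnippetSketch(sourceText, line, radius = 3) {
  if (typeof sourceText !== 'string' || !Number.isInteger(line) || line < 1) return null;
  const lines = sourceText.replace(/\r\n?/g, '\n').split('\n');
  if (line > lines.length) return null;
  const start = Math.max(0, line - 1 - radius);
  const end = Math.min(lines.length, line + radius);
  const normalized = lines
    .slice(start, end)
    .map((l) => l.replace(/\s+/g, ' ').trim())
    .filter((l) => l.length > 0);
  return normalized.length > 0 ? normalized.join('\n') : null;
}

// With source: LOCATION is a hash of the surrounding content.
// Without source: LOCATION degrades to the 10-line bucket.
function fingerprintSketch(finding, { sourceText } = {}) {
  const snippet = contextSnippetSketch(sourceText, finding.line);
  const location = snippet !== null
    ? `ctx:${sha256(snippet)}`
    : `bucket:${Math.floor((finding.line ?? 0) / 10)}`;
  return sha256(
    [finding.category, finding.rule_id, finding.path, finding.symbol ?? '', location].join('|'),
  );
}
```

Under these assumptions, two symbol-less findings at lines 5 and 7 of one file share a fingerprint when no source is supplied, get different fingerprints when it is, and keep their fingerprint if the file is re-indented or gains lines outside the window.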

## [1.3.1] — 2026-06-01

6 changes: 3 additions & 3 deletions CLAUDE.md
@@ -91,7 +91,7 @@ This repo mirrors `world-forge` deliberately (npm workspaces, `tsc --build` comp
- Package names mirror the directory: `packages/findings/` → `@dogfood-lab/findings`. Exception: `dogfood-swarm` (the directory name disambiguates from generic "swarm")

### Versioning
**Lockstep.** All packages bump together. Currently **`1.3.1`** ([release v1.3.1 cut 2026-06-01](https://github.com/dogfood-lab/testing-os/releases/tag/v1.3.1); release v1.3.0 cut 2026-06-01; release v1.2.3 cut 2026-05-20; first stable v1.0.0 cut 2026-04-25). Six of seven `@dogfood-lab/*` packages have shipped to npm since v1.2.0; the seventh (`portfolio`) remains workspace-internal. The README's `<!-- version:start -->` block is auto-stamped by `scripts/sync-version.mjs` (runs as `prebuild`). Use `npm run sync-version:check` as a CI gate when you bump.
**Lockstep.** All packages bump together. Currently **`1.3.2`** ([release v1.3.1 cut 2026-06-01](https://github.com/dogfood-lab/testing-os/releases/tag/v1.3.1); release v1.3.0 cut 2026-06-01; release v1.2.3 cut 2026-05-20; first stable v1.0.0 cut 2026-04-25). Six of seven `@dogfood-lab/*` packages have shipped to npm since v1.2.0; the seventh (`portfolio`) remains workspace-internal. The README's `<!-- version:start -->` block is auto-stamped by `scripts/sync-version.mjs` (runs as `prebuild`). Use `npm run sync-version:check` as a CI gate when you bump.
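
For orientation, stamping a marker-delimited block amounts to something like the sketch below. It is not the contents of `scripts/sync-version.mjs`; `stampVersion` and `isStamped` are hypothetical helpers, and only the `<!-- version:start -->` / `<!-- version:end -->` markers come from the README.

```javascript
// Hypothetical sketch: replace whatever sits between the version markers.
const VERSION_BLOCK = /(<!-- version:start -->)[\s\S]*?(<!-- version:end -->)/;

function stampVersion(readme, versionLine) {
  if (!VERSION_BLOCK.test(readme)) {
    throw new Error('version markers not found');
  }
  // A replacer function avoids "$" in versionLine being read as a pattern.
  return readme.replace(VERSION_BLOCK, (_, open, close) => `${open}\n${versionLine}\n${close}`);
}

// A check-mode gate passes only when stamping would change nothing.
function isStamped(readme, versionLine) {
  return stampVersion(readme, versionLine) === readme;
}
```

A check-style CI gate can then fail the build when `isStamped` returns `false` after a version bump.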

### TypeScript
`tsconfig.base.json` is the only place to set compiler options. Per-package `tsconfig.json` extends it and adds `outDir`/`rootDir`/`include`. `composite: true` everywhere. Never set `baseUrl` (deprecated; bit repo-knowledge in CI).
@@ -118,12 +118,12 @@ Tests that need policy/schema/record fixtures read them from the runtime data di
When a new test needs a new fixture, add it under `fixtures/<category>/<scenario>.yaml` (or `.json`). Fixture filenames should describe what they exercise: `valid/well-formed-mcp-server-record.yaml`, `invalid/missing-source-record-ids.yaml`.

### Schemas
JSON Schema 2020-12. Title and description on every schema and every property. `additionalProperties: false` unless an open-ended bag is genuinely intended. The 8 current schemas in `packages/schemas/src/json/` are the canonical examples.
JSON Schema 2020-12. Title and description on every schema and every property. `additionalProperties: false` unless an open-ended bag is genuinely intended. The 8 contract-spine schemas in `packages/schemas/src/json/` (those registered in `validatePayload`) are the canonical examples. A 9th file, `agent-output.schema.json`, lives in the same directory and ships via the `./json/*` subpath, but it is a swarm output envelope resolved with a local Ajv — not a registered payload schema.

`$id` URLs point at the canonical monorepo path: `https://github.com/dogfood-lab/testing-os/packages/schemas/src/json/<name>.schema.json`. If you ever change a schema in a way that consumers should treat as a contract change, bump the workspace lockstep version — `$id` is a contract field.
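
The conventions above can be checked mechanically. The following is a rough sketch, not a tool that exists in this repo: `lintSchema` is a hypothetical helper, and the example schema (including the `example` name in its `$id`) is invented for illustration.

```javascript
// Hypothetical sketch: flag schemas that break the documented conventions.
function lintSchema(schema, where = '#') {
  const problems = [];
  if (typeof schema.title !== 'string') problems.push(`${where}: missing title`);
  if (typeof schema.description !== 'string') problems.push(`${where}: missing description`);
  // Flagged for review: an open-ended bag may be intentional.
  if (schema.type === 'object' && schema.properties && schema.additionalProperties !== false) {
    problems.push(`${where}: additionalProperties is not false`);
  }
  for (const [name, sub] of Object.entries(schema.properties ?? {})) {
    problems.push(...lintSchema(sub, `${where}/properties/${name}`));
  }
  return problems;
}

// Invented example that follows the conventions.
const exampleSchema = {
  $schema: 'https://json-schema.org/draft/2020-12/schema',
  $id: 'https://github.com/dogfood-lab/testing-os/packages/schemas/src/json/example.schema.json',
  title: 'Example',
  description: 'Illustrative schema only.',
  type: 'object',
  additionalProperties: false,
  properties: {
    name: { type: 'string', title: 'Name', description: 'Display name.' },
  },
};
const exampleProblems = lintSchema(exampleSchema);
```

The `additionalProperties` message is a prompt for review rather than a hard failure, since an open-ended bag is allowed when it is genuinely intended.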

### Ship gate
`SHIP_GATE.md` at the repo root tracks what shipcheck audits. Hard gates A–D (Security, Errors, Operator Docs, Hygiene) currently pass at 100% (21 checked / 16 SKIP-with-justification / 0 unchecked at v1.3.1, re-affirmed 2026-06-01). Soft gate E (Identity) is fully met. Re-run `npx @mcptoolshop/shipcheck audit` before any release; if a previously-checked item fails, fix the underlying gap before bumping the version.
`SHIP_GATE.md` at the repo root tracks what shipcheck audits. Hard gates A–D (Security, Errors, Operator Docs, Hygiene) currently pass at 100% (21 checked / 16 SKIP-with-justification / 0 unchecked at v1.3.2, re-affirmed 2026-06-02). Soft gate E (Identity) is fully met. Re-run `npx @mcptoolshop/shipcheck audit` before any release; if a previously-checked item fails, fix the underlying gap before bumping the version.

### Runtime data dirs at the repo root
`policies/`, `fixtures/`, `records/`, `indexes/`, `reports/`, `swarms/`, `dogfood/`, `docs/`. These are the **shared backing store** that consumers (e.g. `repo-knowledge`, `shipcheck`) read from via `raw.githubusercontent.com/dogfood-lab/testing-os/main/...` URLs. The paths inside those dirs are part of the public API. **Don't reorganize them without thinking about every consumer first.**
6 changes: 3 additions & 3 deletions README.es.md
@@ -20,7 +20,7 @@
*Protocolos, almacenes de evidencia y ciclos de aprendizaje para software asistido por IA.*

<!-- version:start -->
**v1.3.1** — versión actual. Consulte [CHANGELOG.md](CHANGELOG.md) para ver qué se incluyó en esta versión.
**v1.3.2** — versión actual. Consulte [CHANGELOG.md](CHANGELOG.md) para ver qué se incluyó en esta versión.
<!-- version:end -->

📖 **[Lea el manual →](https://dogfood-lab.github.io/testing-os/handbook/)**
@@ -45,7 +45,7 @@ npm install -g @dogfood-lab/dogfood-swarm
swarm --help
```

La guía del operador, la referencia de la interfaz de línea de comandos (CLI), la referencia del esquema y las recetas de integración se encuentran en el **[manual](https://dogfood-lab.github.io/testing-os/handbook/)**. Los detalles específicos de cada versión se encuentran en [CHANGELOG.md](CHANGELOG.md).
La guía del operador, la referencia de la interfaz de línea de comandos, la referencia del esquema y las recetas de integración se encuentran en el **[manual](https://dogfood-lab.github.io/testing-os/handbook/)**. Los detalles de cada versión se encuentran en [CHANGELOG.md](CHANGELOG.md).

## Modelo de amenazas

@@ -104,7 +104,7 @@ Requiere Node ≥ 22. La matriz de CI ejecuta Node 22 y 24 en `ubuntu-latest`; s

## Control de versiones

Todos los paquetes `@dogfood-lab/*` se actualizan juntos, con un único número de versión para todo el repositorio. Se publican seis paquetes en npm bajo `@dogfood-lab` en la versión v1.3.1, de forma sincronizada (`schemas`, `verify`, `report`, `ingest`, `findings`, `dogfood-swarm`); el séptimo, `@dogfood-lab/portfolio`, permanece interno. La línea de versión que aparece en la parte superior de este archivo README se actualiza automáticamente desde `package.json` mediante [`scripts/sync-version.mjs`](scripts/sync-version.mjs) cada vez que se ejecuta `npm run build`.
Todos los paquetes `@dogfood-lab/*` se actualizan juntos, con un único número de versión para todo el repositorio. Se publican seis paquetes en npm bajo `@dogfood-lab` en la versión v1.3.2, de forma sincronizada (`schemas`, `verify`, `report`, `ingest`, `findings`, `dogfood-swarm`); el séptimo, `@dogfood-lab/portfolio`, permanece interno. La línea de versión que aparece en la parte superior de este archivo README se actualiza automáticamente desde `package.json` mediante [`scripts/sync-version.mjs`](scripts/sync-version.mjs) cada vez que se ejecuta `npm run build`.

## Licencia

4 changes: 2 additions & 2 deletions README.fr.md
@@ -20,7 +20,7 @@
*Protocoles, référentiels de preuves et boucles d’apprentissage pour les logiciels assistés par l’IA.*

<!-- version:start -->
**v1.3.1** — version actuelle. Consultez le fichier [CHANGELOG.md](CHANGELOG.md) pour connaître les nouveautés.
**v1.3.2** — version actuelle. Consultez le fichier [CHANGELOG.md](CHANGELOG.md) pour connaître les modifications apportées.
<!-- version:end -->

📖 **[Consultez le manuel →](https://dogfood-lab.github.io/testing-os/handbook/)**
@@ -104,7 +104,7 @@ Nécessite Node ≥ 22. La matrice CI exécute Node 22 + 24 sur `ubuntu-latest`;

## Gestion des versions

Tous les paquets commençant par `@dogfood-lab/*` sont mis à jour simultanément, avec un seul numéro de version pour l’ensemble du monorepo. Six paquets sont publiés sur npm sous `@dogfood-lab` à la version v1.3.1, de manière synchronisée (`schemas`, `verify`, `report`, `ingest`, `findings`, `dogfood-swarm`); le septième, `@dogfood-lab/portfolio`, reste interne. La ligne de version située en haut de ce fichier README est automatiquement mise à jour à partir du fichier `package.json` via le script [`scripts/sync-version.mjs`](scripts/sync-version.mjs) à chaque exécution de `npm run build`.
Tous les paquets commençant par `@dogfood-lab/*` sont mis à jour simultanément, avec un seul numéro de version pour l’ensemble du monorepo. Six paquets sont publiés sur npm sous le nom `@dogfood-lab` à la version v1.3.2, de manière synchronisée (`schemas`, `verify`, `report`, `ingest`, `findings`, `dogfood-swarm`); le septième, `@dogfood-lab/portfolio`, reste interne. La ligne de version située en haut de ce fichier README est automatiquement générée à partir du fichier `package.json` via le script [`scripts/sync-version.mjs`](scripts/sync-version.mjs) à chaque exécution de la commande `npm run build`.

## Licence
