release: v1.3.2 — dogfood-swarm self-audit + agent-output schema consolidation#36

Merged: mcp-tool-shop merged 5 commits into main from release-v1.3.2 on Jun 2, 2026

@mcp-tool-shop
Contributor

testing-os v1.3.2 (lockstep)

The full dogfood-swarm self-audit health pass — the swarm protocol driven against its own @dogfood-lab/dogfood-swarm source — plus the agent-output schema consolidation.

dogfood-swarm health pass (~54 findings, all verified green)

  • Stage A (bug/security): 26 found + 2 re-audit regressions → fixed. Headlines: the published npm install -g was dead (fp-001), the verifier reported still-present bugs as "verified" (ve-001), and the tool crashed on its own first collect (fp-002).
  • Stage B/C (hardening + operator UX): exit-code/CI honesty, verify degradation verdicts (no_tests/tool_missing/timed_out), persist idempotency, DB pragmas, docs (troubleshooting + exit-code + env-var tables).
  • Deep refactors: edit-stable context-hash fingerprint (fp-005, CodeQL-style) + agent-output schema moved into @dogfood-lab/schemas (fp-006, single source of truth via the dep).

Tests 621 → 746 (0 fail). npm run verify + shipcheck audit (100% hard gates) green.

Note: v1.3.2 carries a fingerprint-semantics change (one-time re-fingerprint on first post-upgrade re-audit) and a cross-package schema move — hence the lockstep bump across all 7 @dogfood-lab/* packages.

🤖 Generated with Claude Code

mcp-tool-shop and others added 5 commits June 2, 2026 06:00
Meta-dogfood: ran the swarm protocol on the dogfood-swarm package itself
(init -> domains -> dispatch -> collect -> review -> amend -> verify),
driving the real `swarm` CLI against its own source. The audit surfaced 26
findings on a heavily-hardened v1.3.1 package; every CRITICAL/HIGH was
adversarially re-verified (0 refuted) before any fix.

CRITICAL
- fp-001: scripts/agent-output.schema.json was absent from the published
  tarball (files whitelist excluded scripts/), so `swarm dispatch`/`collect`/
  `revalidate` crashed with ENOENT on every `npm install -g`. Ship a
  package-local copy (schema/) resolved first, guarded by a byte-equality
  drift test. Consolidating into @dogfood-lab/schemas is a noted follow-up.
- ve-001: a still-present, symbol-less finding whose prose-derived anchor
  missed was classified `verified` -- the verifier lied about done-ness. A
  prose-anchor miss now routes to `unverifiable`; only a real code-identifier
  anchor miss may support `verified`.
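
The ve-001 routing rule can be sketched as a small decision function. This is only an illustration of the rule as the message states it: the function name, the `anchorKind` values, and the `still_present` verdict are hypothetical, and only `verified` and `unverifiable` come from the commit text.

```javascript
// Hypothetical sketch of the ve-001 rule; not the package's real verifier.
// anchorKind:  "code-identifier" | "prose"  (assumed labels)
// anchorFound: whether the anchor text still matches the current source
function classifyAnchorResult({ anchorKind, anchorFound }) {
  if (anchorFound) {
    // The anchored code is still there, so the fix cannot be confirmed.
    return "still_present"; // assumed verdict name
  }
  // A miss only counts as evidence when the anchor was a real code identifier.
  if (anchorKind === "code-identifier") {
    return "verified";
  }
  // A prose-derived anchor may miss simply because the prose never matched
  // the code, so the miss says nothing about whether the bug is gone.
  return "unverifiable";
}

console.log(classifyAnchorResult({ anchorKind: "prose", anchorFound: false }));
```

The point of the design is that absence of a match is treated as evidence only when the thing searched for was known to exist in the code in the first place.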

HIGH (selected)
- fp-002 (reproduced live -- collect aborted on its own first 5-agent wave):
  two within-wave findings sharing a coarse fingerprint both landed in `new`
  and aborted the entire collect under UNIQUE(run_id, finding_id). Add
  occurrence-index disambiguation (singletons stay byte-identical, preserving
  cross-wave dedup and the B-BACK-002 contract) + a never-abort INSERT OR
  IGNORE net. Design grounded in CodeQL primaryLocationLineHash, Semgrep
  match_based_id, SQLite upsert, and the WER/Sentry default-expand principle.
- cli-001: `swarm rewind` aborted in-flight waves/agent_runs across ALL runs
  (and repos) sharing the single control-plane.db. Scope the DB abort to the
  run(s) owning the reset working tree.
- fp-003: the git porcelain parser corrupted quoted/unicode/space paths,
  defeating the ownership gate. Switch to `--porcelain -z` (NUL-delimited).
- sm-001/sm-002: enforce exclusive ownership (freeze-time overlap reject +
  first-match-wins resolution) and guard unfreeze against in-flight waves.
- td-001/002/003: README dispatch/collect/revalidate examples failed as
  written; corrected + added a README->CLI contract drift test (td-006).
- ve-002: `--threshold <non-numeric>` silently disabled the regression gate.
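
A minimal sketch of the occurrence-index idea from fp-002, assuming findings are plain objects with a coarse `fingerprint` string. The function name, the `findingId` field, and the `#n` suffix format are assumptions for illustration, not the package's actual scheme.

```javascript
// Hypothetical sketch: give within-wave fingerprint collisions distinct ids
// while leaving singletons byte-identical to their bare fingerprint.
function disambiguateByOccurrence(findings) {
  const seen = new Map(); // coarse fingerprint -> occurrences seen so far
  return findings.map((finding) => {
    const n = seen.get(finding.fingerprint) ?? 0;
    seen.set(finding.fingerprint, n + 1);
    // First occurrence keeps the bare fingerprint; later ones get an index.
    const findingId =
      n === 0 ? finding.fingerprint : `${finding.fingerprint}#${n}`;
    return { ...finding, findingId };
  });
}

const wave = [
  { fingerprint: "abc123", title: "unchecked argv" },
  { fingerprint: "abc123", title: "unchecked argv (second site)" },
  { fingerprint: "def456", title: "missing pragma" },
];
const ids = disambiguateByOccurrence(wave).map((f) => f.findingId);
console.log(ids); // [ 'abc123', 'abc123#1', 'def456' ]
```

Because a fingerprint that occurs once is returned unchanged, cross-wave dedup of singletons is unaffected. Note that this naive version assigns the index from array order alone.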

MEDIUM/LOW: verifier truthfulness (ve-003 symmetric window, ve-004 no_tests
verdict), promotion atomicity (sm-003), approve-event de-dup (cli-002),
--reason flag-swallow (cli-003), bounded-read TOCTOU + argv guards, dead-code
removal, and a status half-state breadcrumb after a failed collect.

690 tests (0 fail, +69 regression tests); `npm run verify` green end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The wave-2 re-audit (the protocol's Stage A repeat) caught two HIGH
regressions that the Stage A fixes themselves introduced. Both are now
fixed with cross-wave / end-to-end regression tests that fail RED against
the buggy code.

- fp-r-001 (regressed fp-002): disambiguateFingerprints assigned the
  occurrence index from within-wave ARRAY ORDER, blind to prior state. When
  a wave-1 singleton gained a coarse-key sibling in wave 2, the bare-fp slot
  was awarded by (non-deterministic) sort order -> identity swap: the new
  sibling was swallowed as `recurring` and the original finding's stable
  finding_id was hijacked. Fix: make disambiguation prior-aware and
  order-independent -- the member matching the prior row keeps the bare fp
  (no hijack), other members get a content-derived salt (no swallow), groups
  sorted deterministically. Residual group<->singleton churn is bounded,
  never data loss; the edit-stable context-hash follow-up is noted in-code.

- sm-r-001 (regressed sm-002): the freeze-time overlap guard rejected
  GLOB-level overlap, but detectDomains resolves ownership at FILE-level
  first-match-wins. The default frontend (src/**/*.tsx) ⊂ backend (src/**)
  overlap is legitimate, so freezeDomains rejected the tool's own
  auto-detected default map, breaking init->freeze->dispatch for any
  full-stack repo. Coupled latent bug: resolveExclusiveOwner iterated
  alphabetical order, disagreeing with detection order. Fix:
  most-specific-glob-wins ownership (order-independent) + narrow the freeze guard to
  genuine equal-specificity criss-cross only. The default map now freezes
  and attributes src/**/*.tsx to frontend.
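
To make the sm-r-001 fix concrete, here is a rough sketch of most-specific-glob-wins resolution. The specificity measure (count of non-wildcard characters), the tiny glob matcher, and all function names are assumptions chosen for illustration; the commit does not say how the package measures specificity.

```javascript
// Hypothetical sketch: translate a small glob subset (`*`, `**`, `**/`)
// into a RegExp. Not a full glob implementation.
function globToRegExp(glob) {
  let out = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch !== "*") {
      out += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape literals
    } else if (glob[i + 1] !== "*") {
      out += "[^/]*"; // single star stays within one path segment
    } else if (glob[i + 2] === "/") {
      out += "(?:.*/)?"; // `**/` matches zero or more directories
      i += 2;
    } else {
      out += ".*"; // trailing `**` matches anything
      i += 1;
    }
  }
  return new RegExp(`^${out}$`);
}

// Assumed specificity: more literal characters means a narrower glob.
const specificity = (glob) => glob.replace(/\*/g, "").length;

function resolveOwner(file, domains) {
  const candidates = [];
  for (const [name, globs] of Object.entries(domains)) {
    for (const glob of globs) {
      if (globToRegExp(glob).test(file)) {
        candidates.push({ name, score: specificity(glob) });
      }
    }
  }
  // Sort by specificity, then by name, so the result never depends on the
  // order in which domains were declared. Equal-specificity ties are the
  // case the freeze guard is described as rejecting up front.
  candidates.sort((a, b) => b.score - a.score || a.name.localeCompare(b.name));
  return candidates.length ? candidates[0].name : null;
}

const domains = { backend: ["src/**"], frontend: ["src/**/*.tsx"] };
console.log(resolveOwner("src/ui/App.tsx", domains)); // frontend
console.log(resolveOwner("src/server/api.js", domains)); // backend
```

Under this rule the nested frontend/backend default is unambiguous for every file, which is why a glob-level overlap check would be too strict.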

696 tests (0 fail); `npm run verify` green end-to-end. Deferred (noted): a
cosmetic help-block re-indent (cli-r-002, LOW); cli-r-001 was a false
positive (cross-repo isolation is already covered in
meta-amendA-cli-commands.test.js).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Meta-dogfood of the dogfood-swarm package via its own swarm protocol.
- 8f6e552: 26 findings (2 CRIT, 9 HIGH, ...) found, verified, fixed.
- 0280514: 2 HIGH regressions the fixes introduced, caught by the wave-2
  re-audit and fixed (fp-002 cross-wave identity, sm-002 freeze false-positive).
690 -> 696 tests; npm run verify green. The within-wave fp-002 collect-abort
that this very swarm hit live (wave 1) is fixed and proven by the wave-2 collect.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Consolidate the agent-output JSON schema into @dogfood-lab/schemas
(packages/schemas/src/json/agent-output.schema.json) and resolve it via
createRequire on the ./json/* subpath — the same pattern the eight
contract schemas use. This removes the controlled duplication left by the
fp-001 packaging fix: the package-local copy under
packages/dogfood-swarm/schema/, its byte-equality drift test
(meta-amendA-schema-packaging.test.js), and the repo-root scripts/ copy
are all deleted; one source of truth now ships through the dependency.
dogfood-swarm gains @dogfood-lab/schemas as a dependency. The schema $id
moves to the packages/schemas path (a contract field), so the lockstep
version bumps 1.3.1 -> 1.3.2 across every surface (8 manifests, lockfile,
README block, SHIP_GATE/SCORECARD/CLAUDE honesty surfaces,
verify-output.svg, CHANGELOG). This is fp-p-006, deferred from the self-audit.
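
For readers unfamiliar with the pattern, resolving a JSON schema through a dependency with `createRequire` might look roughly like the sketch below. The helper name is hypothetical, and the exact specifier is an assumption built from the package name and the `./json/*` subpath mentioned above; it presumes the package's exports map exposes that subpath.

```javascript
const { createRequire } = require("node:module");

// Assumed specifier: package name plus the ./json/* subpath from the message.
const AGENT_OUTPUT_SCHEMA =
  "@dogfood-lab/schemas/json/agent-output.schema.json";

// Hypothetical helper: load the schema via normal module resolution, so the
// only copy on disk is the one shipped inside the dependency.
function loadAgentOutputSchema(fromFile) {
  // createRequire returns a require() that resolves relative to `fromFile`
  // (an absolute path or file URL).
  const requireFrom = createRequire(fromFile);
  return requireFrom(AGENT_OUTPUT_SCHEMA);
}
```

A CommonJS caller would pass `__filename`; an ES module would pass `import.meta.url`. The call throws a module-not-found error if the dependency is not installed, which fails loudly at the same point for a local checkout and a global install.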

Also lands the pending self-audit Stage C work (exit-code contract on the
CI-gate verbs, operator-doc closure, README->CLI contract test, and the
meta-amendB test files) that was uncommitted in the working tree; it ships
as part of the same 1.3.2 release.

npm run verify green on the combined tree.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Regenerate README.{ja,zh,es,fr,hi,it,pt-BR}.md from the v1.3.2 source via the
local TranslateGemma pipeline, so the released tag does not ship stale v1.3.1
translations (release-ordering discipline: translations land in the tagged
commit, before publish/release).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mcp-tool-shop mcp-tool-shop merged commit f7d95a0 into main Jun 2, 2026
4 checks passed
@mcp-tool-shop mcp-tool-shop deleted the release-v1.3.2 branch June 2, 2026 21:10