Wave 2: ollama-backfill 104 seed passports #4

Merged
mcp-tool-shop merged 1 commit into main from wave-2 on Apr 20, 2026
@mcp-tool-shop
Member

Summary

Data-only PR: 104 new packages/*/passport.json files generated by hermes3:8b via the local Ollama HTTP API with JSON-schema-constrained output. Apart from the passports, the only new files are scripts/seed-backfill.mjs and scripts/backfill-report.json. Regenerated derived artifacts: site/src/data/seeds.json, README.md category tables, llms.txt.

Results

  • 104/104 passports, 100% schema-valid.
  • Confidence histogram: 16 ≥0.95, 67 at 0.85-0.94, 21 at 0.70-0.84, 0 below.
  • Category distribution: developer-tools 56, desktop-apps 12, voice-and-sound 6, ml-and-training 5, vscode-extensions 5, crypto-and-provenance 5, governance-and-policy 4, typing-and-input 3, mouse-and-cursor 2, websketch 2, games-and-creative 1, suites-and-infrastructure 1, original-archive 2.
  • Kind distribution: cli 48, library 25, desktop 18, extension 5, mcp-server 4, plugin 3, service 1.
  • Health block auto-computed from git + filesystem: 90 have tests, 104 README, 101 LICENSE, 104 fresh (≤90d).
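
A minimal sketch of how a health block of this kind could be derived, assuming the inputs are a list of file paths in the package directory and the date of the last commit touching it. The function name, the field names (hasTests, hasReadme, hasLicense, fresh), and the file-name patterns are hypothetical; the actual logic in scripts/seed-backfill.mjs may differ.

```javascript
// Hypothetical helper: field names and file patterns are assumptions for
// illustration, not the passport schema or the script's real implementation.
const DAY_MS = 24 * 60 * 60 * 1000;

function computeHealth(fileNames, lastCommitDate, now = new Date(), freshDays = 90) {
  const lower = fileNames.map((name) => name.toLowerCase());
  const ageDays = (now.getTime() - lastCommitDate.getTime()) / DAY_MS;
  return {
    // A test directory or a *.test.* / *.spec.* file counts as "has tests".
    hasTests: lower.some(
      (n) => /(^|\/)(tests?|__tests__)(\/|$)/.test(n) || /\.(test|spec)\.[cm]?[jt]sx?$/.test(n),
    ),
    hasReadme: lower.some((n) => /(^|\/)readme(\.[a-z]+)?$/.test(n)),
    hasLicense: lower.some((n) => /(^|\/)licen[sc]e(\.[a-z]+)?$/.test(n)),
    // "Fresh" means the last commit is at most freshDays old.
    fresh: ageDays <= freshDays,
  };
}

const health = computeHealth(
  ["package.json", "README.md", "LICENSE", "src/index.js", "test/index.test.js"],
  new Date("2026-04-08T00:00:00Z"),
  new Date("2026-04-20T00:00:00Z"),
);
console.log(health);
```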

Review workflow

All 104 are flagged ingest.manualReview = true. pnpm seed:doctor surfaces them as the review queue. Clear the flag on each passport as you verify it.
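
For illustration, the review queue amounts to a filter on that flag. The sketch below works on in-memory objects; only ingest.manualReview comes from this PR, while the function name and the sample data are hypothetical, and it is not the implementation behind pnpm seed:doctor.

```javascript
// Hypothetical sketch: return the ids of passports still flagged for review.
function reviewQueue(passports) {
  return passports
    .filter((p) => p.ingest && p.ingest.manualReview === true)
    .map((p) => p.id)
    .sort();
}

// Sample data is made up; real passports live in packages/*/passport.json.
const passports = [
  { id: "voice-soundboard", ingest: { manualReview: true } },
  { id: "deltamind", ingest: { manualReview: false } },
  { id: "mcpt", ingest: { manualReview: true } },
];

const queue = reviewQueue(passports);
console.log(queue);
```

Once a passport has been verified and its flag cleared, it drops out of the list on the next run.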

🤖 Generated with Claude Code

Backfills every archived package with a structured passport generated by
hermes3:8b (local Ollama, 4.7GB Q4_0) using schema-constrained JSON output.
All entries are marked ingest.manualReview=true so humans can verify the
LLM's category/tag/pattern assignments at their own pace. The schema
contract is what ships; the per-seed content is the opening bid.

How it runs:
- scripts/seed-backfill.mjs (pnpm seed:backfill) iterates packages/*,
  builds a 1-6KB corpus per package (package.json + truncated README +
  up to 3 source files preferring entrypoints), and calls Ollama's
  /api/generate with the passport partial schema as the "format" constraint.
- The LLM fills only the narrow subset it can infer: title, description,
  taxonomy (category + tags from the registry), technical (kind +
  programming languages), discovery (oneLiner + whyItMatters), patterns
  with registry-enforced categories, agentCapsule insight, confidence.
- The script merges the LLM output with deterministic defaults (id, version
  from package.json, license, consolidation date 2026-04-08, lifecycle
  state=dormant, codeRepository URL, author, ingest provenance) and
  validates the result against both the partial (post-LLM) and full
  (post-merge) schemas.
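
The generate-then-merge flow above can be sketched as follows. This is an assumption-laden illustration, not the contents of scripts/seed-backfill.mjs: the schema fragment, prompt wording, function names, the use of package.json's name as the id, the nested field shapes, and the rule that deterministic fields win on conflict are all hypothetical. The request shape (model, prompt, format, stream) and the "response" field follow Ollama's /api/generate API, and the URL uses Ollama's default local port.

```javascript
// Assumed, heavily reduced stand-in for the passport partial schema.
const OLLAMA_URL = "http://localhost:11434/api/generate";

const partialSchema = {
  type: "object",
  properties: {
    title: { type: "string" },
    description: { type: "string" },
    confidence: { type: "number", minimum: 0, maximum: 1 },
  },
  required: ["title", "description", "confidence"],
};

// Passing a JSON schema as "format" asks Ollama to constrain the output to it.
function buildRequest(corpus) {
  return {
    model: "hermes3:8b",
    prompt: `Describe this package as a passport.\n\n${corpus}`, // placeholder prompt
    format: partialSchema,
    stream: false,
  };
}

// With stream=false the generated text arrives in the "response" field.
async function generatePartial(corpus) {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(corpus)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const body = await res.json();
  return JSON.parse(body.response);
}

// Assumed precedence: deterministic fields are spread last so they override
// anything the model produced under the same key.
function mergePassport(partial, pkg) {
  return {
    ...partial,
    id: pkg.name,
    version: pkg.version,
    license: pkg.license,
    lifecycle: { state: "dormant" },
    ingest: { manualReview: true },
  };
}

// Offline usage: merge a made-up LLM result with made-up package.json values.
const merged = mergePassport(
  { title: "Example", description: "An example.", confidence: 0.9, id: "llm-guess" },
  { name: "example-pkg", version: "1.0.0", license: "MIT" },
);
console.log(merged.id);
```

Validating twice, once after the LLM call and once after the merge, separates model errors from merge errors: a failure at the first step points at the prompt or the model, a failure at the second at the defaults.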

Prompt calibration:
- A three-package calibration pass (voice-soundboard, deltamind, mcpt)
  revealed two issues that were fixed before the full run: voice-and-sound
  routing (added category-routing hints) and language-tag hallucination
  (added "only claim languages verifiable from file extensions or
  package.json").
- After the full run, 5 packages showed prompt-hint contamination (the
  model echoed the "Source file extensions observed:" grounding line into
  its oneLiner). Removed that hint from the visible prompt and retried the
  5; claude-memories required one hand-edit because its README contained
  instruction-like text that kept leaking.
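
A check for this kind of echo can be sketched as a substring scan over generated oneLiners. The marker string is quoted from the description above; the function name, the discovery.oneLiner nesting, and the sample passports are assumptions for illustration, and the PR does not say how the 5 contaminated packages were actually found.

```javascript
// Prompt fragments that should never appear in generated text.
const PROMPT_MARKERS = ["Source file extensions observed:"];

// Hypothetical sketch: return ids whose oneLiner contains a prompt marker.
function findContaminated(passports, markers = PROMPT_MARKERS) {
  return passports
    .filter((p) => {
      const oneLiner = (p.discovery?.oneLiner ?? "").toLowerCase();
      return markers.some((m) => oneLiner.includes(m.toLowerCase()));
    })
    .map((p) => p.id);
}

// Made-up sample data.
const sample = [
  { id: "clean-pkg", discovery: { oneLiner: "CLI that lints commit messages." } },
  { id: "leaky-pkg", discovery: { oneLiner: "Source file extensions observed: .ts, .json" } },
  { id: "no-discovery" },
];

const contaminated = findContaminated(sample);
console.log(contaminated);
```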

Results:
- 104 passports, 100% schema-valid.
- Confidence histogram: 16 at >=0.95, 67 at 0.85-0.94, 21 at 0.70-0.84, 0 below.
- Category distribution: developer-tools 56, desktop-apps 12, voice-and-sound 6,
  ml-and-training 5, vscode-extensions 5, crypto-and-provenance 5,
  governance-and-policy 4, typing-and-input 3, mouse-and-cursor 2, websketch 2,
  games-and-creative 1, suites-and-infrastructure 1, original-archive 2.
- Health block auto-computed from git + filesystem: 90 have tests, 104 README,
  101 LICENSE, 104 fresh (<=90d since consolidation commits).
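
The confidence histogram uses four bands. A minimal sketch of the bucketing, assuming confidence values in [0, 1] and bands that are inclusive at their lower edge (the function name and sample values are hypothetical):

```javascript
// Count confidence values per band; band labels mirror the histogram above.
function confidenceHistogram(values) {
  const bands = { ">=0.95": 0, "0.85-0.94": 0, "0.70-0.84": 0, "<0.70": 0 };
  for (const v of values) {
    if (v >= 0.95) bands[">=0.95"] += 1;
    else if (v >= 0.85) bands["0.85-0.94"] += 1;
    else if (v >= 0.7) bands["0.70-0.84"] += 1;
    else bands["<0.70"] += 1;
  }
  return bands;
}

// Made-up sample values, not the real per-passport confidences.
const histogram = confidenceHistogram([0.97, 0.9, 0.85, 0.72, 0.95]);
console.log(histogram);
```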

Derived artifacts regenerated:
- site/src/data/seeds.json (104 seeds)
- README.md category tables (between GENERATED markers)
- llms.txt at repo root (104 seeds, grouped by category)

Review workflow:
- pnpm seed:doctor lists all 104 under "Flagged for manual review". This is
  expected for Wave 2; set manualReview=false as each passport is verified.
- One known quality risk remains: ~10% of oneLiners in the llms.txt sample
  are obviously weak (tautological, or leaking README fragments). Fixing
  them incrementally is cheaper than re-running all 104.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mcp-tool-shop mcp-tool-shop merged commit 25ba95a into main Apr 20, 2026
2 checks passed