Wave 2: ollama-backfill 104 seed passports #3

Closed
mcp-tool-shop wants to merge 3 commits into main from seed-vault-wave-2-backfill

Conversation

@mcp-tool-shop

Member

Summary

Stacked on top of #2 (Wave 1 infrastructure). This PR is data-only apart from the new scripts/seed-backfill.mjs: it adds 104 packages/*/passport.json files generated by hermes3:8b via the local Ollama HTTP API with JSON-schema-constrained output.

How it ran

  1. scripts/seed-backfill.mjs iterates packages/*, builds a 1-6KB corpus per package (package.json + truncated README + up to 3 source files preferring entrypoints).
  2. Calls Ollama /api/generate with model=hermes3:8b, format=<partialSchema>, temperature=0.1.
  3. LLM fills: title, description, taxonomy.category (from the frozen 13-category enum), taxonomy.tags (from the registry), technical.kind, programmingLanguages, discovery.oneLiner, discovery.whyItMatters, patterns (with registry-enforced pattern categories), agentCapsule.insight, and a self-assessed confidence.
  4. Script merges with deterministic defaults (id, version from package.json, license, consolidation date 2026-04-08, lifecycle.state=dormant, ingest.method=ollama-backfill, ingest.manualReview=true) and validates both partial and full schemas.
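The request in step 2 can be sketched roughly as below. This is an illustration, not the contents of scripts/seed-backfill.mjs: the function names `buildGenerateRequest` and `generatePartial` and the default base URL are assumptions, while `format`, `stream`, `options.temperature` and `options.num_predict` are standard fields of Ollama's /api/generate endpoint.

```javascript
// Hypothetical sketch of step 2; function names are illustrative only.
function buildGenerateRequest(prompt, partialSchema, numPredict = 1500) {
  return {
    model: "hermes3:8b",
    prompt,
    format: partialSchema, // a JSON schema object constrains the output shape
    stream: false,         // ask for one JSON response instead of a chunk stream
    options: { temperature: 0.1, num_predict: numPredict },
  };
}

// Assumes a local Ollama server on its default port.
async function generatePartial(prompt, partialSchema, baseUrl = "http://localhost:11434") {
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(prompt, partialSchema)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const { response } = await res.json(); // generated text arrives in `response`
  return JSON.parse(response);
}
```

Keeping the body builder separate from the network call makes the sampling settings easy to check without a running model.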

Results

  • 104/104 passports, 100% schema-valid.
  • Confidence histogram: 16 at ≥0.95, 67 at 0.85-0.94, 21 at 0.70-0.84, 0 below.
  • Category distribution: developer-tools 56, desktop-apps 12, voice-and-sound 6, ml-and-training 5, vscode-extensions 5, crypto-and-provenance 5, governance-and-policy 4, typing-and-input 3, mouse-and-cursor 2, websketch 2, games-and-creative 1, suites-and-infrastructure 1, original-archive 2.
  • Kind distribution: cli 48, library 25, desktop 18, extension 5, mcp-server 4, plugin 3, service 1.
  • Health block auto-computed from git + filesystem by seed:index: 90 have tests, 104 have README, 101 have LICENSE, 104 fresh (≤90d since consolidation commits).
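For reviewers who want to recompute the confidence histogram, a minimal sketch follows. It assumes each passport carries a numeric `confidence` between 0 and 1; the function name and bucket labels are illustrative and not taken from the repo's scripts.

```javascript
// Illustrative only: bucket boundaries mirror the histogram reported above.
function confidenceHistogram(confidences) {
  const buckets = { ">=0.95": 0, "0.85-0.94": 0, "0.70-0.84": 0, "<0.70": 0 };
  for (const c of confidences) {
    if (c >= 0.95) buckets[">=0.95"] += 1;
    else if (c >= 0.85) buckets["0.85-0.94"] += 1;
    else if (c >= 0.7) buckets["0.70-0.84"] += 1;
    else buckets["<0.70"] += 1;
  }
  return buckets;
}
```

The bucket counts should sum to the number of passports, which is a quick sanity check against the 104 total.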

Prompt calibration

  • 3-package calibration pass (voice-soundboard, deltamind, mcpt) revealed two issues fixed before the full run:
    1. voice-and-sound routing — added category-routing hints to the prompt
    2. language-tag hallucination (e.g. "python" in npm packages) — added "only claim languages verifiable from file extensions or package.json"
  • After full run, 5 packages showed prompt-hint contamination (model echoed the "Source file extensions observed:" grounding line into its oneLiner). Removed that hint, retried — 4/5 fixed automatically, 1 (claude-memories) hand-edited because its README contains instruction-like text.
  • 1 package (polyglot-vscode) hit the initial num_predict=1500 ceiling listing 55 translation languages; bumped to 2500 and retried successfully.
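A check along these lines could catch prompt-hint contamination before review. It is a hedged sketch: `findHintLeaks` and `PROMPT_HINTS` are hypothetical names, and the only hint string used is the one quoted above.

```javascript
// Hypothetical check: flags passports whose oneLiner echoes prompt scaffolding.
const PROMPT_HINTS = ["Source file extensions observed:"];

function findHintLeaks(passports, hints = PROMPT_HINTS) {
  return passports
    .filter((p) => {
      const oneLiner = (p.discovery?.oneLiner ?? "").toLowerCase();
      return hints.some((h) => oneLiner.includes(h.toLowerCase()));
    })
    .map((p) => p.id); // return ids so the affected packages can be retried
}
```

The returned ids could then be fed back into a retry pass, leaving only the stubborn cases for hand-editing.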

Review workflow

  • pnpm seed:doctor currently lists all 104 under "Flagged for manual review" (expected — Wave 2 defaults manualReview=true). Clear that flag on each passport as you verify it.
  • Residual quality gap: ~10% of oneLiners in llms.txt are tautological or show README-fragment leakage. Cheaper to fix those individually than re-run 104.

Test plan

  • node scripts/seed-backfill.mjs produces 104/104 schema-valid passports
  • pnpm seed:validate passes on all 104
  • pnpm seed:index regenerates seeds.json, README tables, llms.txt
  • Astro site /prototypes/seeds/ renders 104 cards with all 13 category facets, 4 health facets, search filters across names + tags + patterns
  • Per-seed page (verified on deltamind) renders title, description, version, license, kind, languages, lineCount, lastCommitAt, health signals, patterns, whyItMatters, tags, source link
  • CI: both jobs pass on this PR

🤖 Generated with Claude Code

mcp-tool-shop and others added 3 commits April 20, 2026 16:59
Establish the passport + generator + validator + indexer pipeline so the
repo can grow from 104 to 1000+ archived prototypes without README drift
or category chaos.

Per-seed passport.json composes CodeMeta 3.0 core, RO-Crate 1.1 profile,
MCPD-style lifecycle facets, SBOM reference, Software Heritage SWHID slot,
and ingest provenance. Canonical taxonomy (13 frozen categories + tag
registry) constrains taxonomy.category and taxonomy.tags values, preventing
'typescript' vs 'ts' vs 'TypeScript' drift.
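
The taxonomy constraint can be pictured with a small sketch. The layout of taxonomy.json is not shown in this PR, so the `{ categories, tags }` shape and the `checkTaxonomy` name are assumptions for illustration only.

```javascript
// Hypothetical validator: assumes taxonomy = { categories: [...], tags: [...] }.
function checkTaxonomy(passport, taxonomy) {
  const errors = [];
  const { category, tags = [] } = passport.taxonomy ?? {};
  if (!taxonomy.categories.includes(category)) {
    errors.push(`unknown category: ${category}`);
  }
  for (const tag of tags) {
    // Exact-match lookup is what rejects 'ts' when only 'typescript' is registered.
    if (!taxonomy.tags.includes(tag)) errors.push(`unregistered tag: ${tag}`);
  }
  return errors;
}
```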

New scripts: seed:new (scaffold), seed:validate (AJV + taxonomy +
uniqueness + lineage), seed:index (regenerate site/src/data/seeds.json
and README tables between markers, fill lineCount and lastCommitAt from
git), seed:doctor (health report).

Astro /seeds/ faceted browser and dynamic per-seed routes consume the
generated seeds.json. Paths-gated seed-validate.yml CI workflow runs on
passport / schema / taxonomy / script changes only.

Wave 2 follows: LLM backfill for the 104 existing packages via the Ollama
Intern MCP (hermes3:8b, JSON-schema-constrained), landing as a separate
data-only PR with low-confidence entries flagged for manual review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Research pass on 2025-2026 state-of-the-art catalog infrastructure identified
five high-leverage additions that are either genuinely novel or crystallizing
as emerging standards. Landing them before hermes3:8b backfills 104 packages
so the schema locks once, not twice.

ADD:
- patterns[] — structured pattern extraction with controlled-vocabulary
  category (from taxonomy.json:patternCategories, 24 canonical categories).
  Replaces the free-prose discovery.patternWorthStealing. Makes "which seeds
  touched supply-chain tricks?" queryable across the vault.
- failureModes[] — structured lessons-learned (tried / didntWorkBecause /
  pivoted?). A prototype's most valuable payload is often what broke. No
  prototype catalog does this today.
- agentCapsule — {insight, excerpt} — 10-second LLM-optimized summary plus
  a <=400-char code excerpt of the core trick. Agents pick up the idea
  without parsing source.
- priorArt[] — papers, blog posts, prior tools that inspired each seed.
- health block — auto-computed signals split out of technical: lineCount,
  lastCommitAt, commitRecencyDays, hasTests, hasReadme, hasLicense, buildable.
  Fills at index time from git + filesystem; no manual upkeep.
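
As a rough illustration of the recency signal, the sketch below derives commitRecencyDays from lastCommitAt. It assumes lastCommitAt is an ISO-8601 timestamp and that days are rounded down; `isFresh` is a hypothetical helper reflecting the <=90d filter, not a function from seed:index.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between the last commit and `now` (rounded down).
function commitRecencyDays(lastCommitAt, now = new Date()) {
  return Math.floor((now.getTime() - new Date(lastCommitAt).getTime()) / MS_PER_DAY);
}

// Hypothetical helper for the "Fresh <=90d" facet.
function isFresh(lastCommitAt, now = new Date(), maxDays = 90) {
  return commitRecencyDays(lastCommitAt, now) <= maxDays;
}
```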

ADD TO taxonomy.json:
- patternCategories registry (24 entries — signal-processing, caching,
  concurrency, supply-chain, etc.) enforced by seed:validate.

ADD AT REPO ROOT:
- /llms.txt generated by seed:index — follows the Answer.AI emerging spec
  for LLM-discoverable sites (one line per seed, grouped by category).
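
The grouping can be sketched as follows. This is not the seed:index implementation: the flattened seed shape (`id`, `category`, `url`, `oneLiner`), the placeholder title, and the exact line layout are assumptions loosely modelled on the llms.txt convention of markdown headings with link lists.

```javascript
// Hypothetical renderer: one line per seed, grouped under category headings.
function renderLlmsTxt(seeds, title = "Seed Vault") {
  const byCategory = new Map();
  for (const seed of seeds) {
    const list = byCategory.get(seed.category) ?? [];
    list.push(seed);
    byCategory.set(seed.category, list);
  }
  const lines = [`# ${title}`, ""];
  // Sort categories so regenerated output stays stable between runs.
  for (const category of [...byCategory.keys()].sort()) {
    lines.push(`## ${category}`, "");
    for (const s of byCategory.get(category)) {
      lines.push(`- [${s.id}](${s.url}): ${s.oneLiner}`);
    }
    lines.push("");
  }
  return lines.join("\n");
}
```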

SITE:
- Per-seed page renders patterns / failure modes / agent capsule / prior art /
  health signals with distinct visual treatment (failure modes in red).
- Faceted browser adds "Has tests / README / LICENSE / Fresh <=90d" filters
  and extends search to match pattern names + summaries.

Verified end-to-end: schema validates, pattern-category violation caught,
index emits llms.txt and enriched seeds.json, site renders all five novel
sections cleanly, no console/server errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Backfills every archived package with a structured passport generated by
hermes3:8b (local Ollama, 4.7GB Q4_0) using schema-constrained JSON output.
All entries are marked ingest.manualReview=true so humans verify the LLM's
category/tag/pattern assignments at their own pace — the schema contract
is what ships, the per-seed content is the opening bid.

How it runs:
- scripts/seed-backfill.mjs (pnpm seed:backfill) iterates packages/*,
  builds a 1-6KB corpus per package (package.json + truncated README +
  up to 3 source files preferring entrypoints), and calls Ollama's
  /api/generate with the passport partial schema as the "format" constraint.
- LLM fills the narrow subset it can infer: title, description, taxonomy
  (category + tags from the registry), technical (kind + programming
  languages), discovery (oneLiner + whyItMatters), patterns with
  registry-enforced categories, agentCapsule insight, confidence.
- Script merges the LLM output with deterministic defaults (id, version
  from package.json, license, consolidation date 2026-04-08, lifecycle
  state=dormant, codeRepository URL, author, ingest provenance) and
  validates both the partial (post-LLM) and full (post-merge) schemas.
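
The merge step might look roughly like the sketch below. It is illustrative only: `mergePassport` is a hypothetical name, taking `id` from the package name is an assumption, and the consolidation date, codeRepository and author defaults are left out because their field layout is not shown here.

```javascript
// Hypothetical merge: deterministic fields are written after the spread,
// so nothing the LLM emitted can override them.
function mergePassport(llmPartial, pkg) {
  return {
    ...llmPartial,
    id: pkg.name, // assumption: id is derived from the package name
    version: pkg.version,
    license: pkg.license,
    lifecycle: { state: "dormant" },
    ingest: { method: "ollama-backfill", manualReview: true },
  };
}
```

Validating once before this merge and once after matches the partial and full schema checks described above.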

Prompt calibration:
- Three-package calibration pass (voice-soundboard, deltamind, mcpt)
  revealed two issues fixed before the full run: voice-and-sound routing
  (added category-routing hints) and language-tag hallucination (added
  "only claim languages verifiable from file extensions or package.json").
- After full run, 5 packages showed prompt-hint contamination (model
  echoed the "Source file extensions observed:" grounding line into its
  oneLiner). Removed that hint from the visible prompt and retried the 5;
  claude-memories required one hand-edit because its README contained
  instruction-like text that kept leaking.

Results:
- 104 passports, 100% schema-valid.
- Confidence histogram: 16 at >=0.95, 67 at 0.85-0.94, 21 at 0.70-0.84, 0 below.
- Category distribution: developer-tools 56, desktop-apps 12, voice-and-sound 6,
  ml-and-training 5, vscode-extensions 5, crypto-and-provenance 5,
  governance-and-policy 4, typing-and-input 3, mouse-and-cursor 2, websketch 2,
  games-and-creative 1, suites-and-infrastructure 1, original-archive 2.
- Health block auto-computed from git + filesystem: 90 have tests, 104 README,
  101 LICENSE, 104 fresh (<=90d since consolidation commits).

Derived artifacts regenerated:
- site/src/data/seeds.json (104 seeds)
- README.md category tables (between GENERATED markers)
- llms.txt at repo root (104 seeds, grouped by category)

Review workflow:
- pnpm seed:doctor lists all 104 under "Flagged for manual review".
  Expected for Wave 2; set manualReview=false as each passport is verified.
- Remaining quality gap: ~10% of oneLiners in the llms.txt sample are
  visibly weak (tautological or leaking README fragments). Cheaper to fix
  incrementally than to re-run all 104.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mcp-tool-shop mcp-tool-shop changed the base branch from seed-vault-wave-1 to main April 20, 2026 22:12
@mcp-tool-shop

Member Author

Superseded by the clean cherry-pick after PR #2 was squash-merged. See replacement PR.
