feat: Literature-aware librarian & archivist agents by Baeora · Pull Request #389 · greyhaven-ai/autocontext

Baeora · 2026-03-14T05:04:42Z

Note: I started adapting this for my own use case, then realized it requires an API key to actually use and test, so I am going to rebuild it in a way that lets me use my Max plan instead. You might still find my additions useful, although you'll need to test and finish them if you do.

TLDR: It adds "Librarian" agents, each a subject-matter expert (SME) on one piece of literature, and an "Archivist" that settles disputes between works that span years, are eventually refuted, etc.

Here is Claude's summary:

Summary

Adds two new agent roles to the generation loop, librarian and archivist, that ground strategy evolution in ingested literature. Users can ingest markdown books into a persistent library. Each active book gets a dedicated librarian agent that reviews strategies against the book's principles, flags violations, and is available to other agents via a consult_library tool. When a librarian escalates a violation, the archivist spot-pulls original passages from the chunked source material and renders a verdict: dismissed, soft_flag, or hard_gate. Hard gates force a retry before tournament matches proceed.
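The verdict-to-action step described above can be pictured with a minimal sketch. The names `Verdict` and `gate_action` are hypothetical, and the rule that an empty verdict list maps to "skip" is an assumption; the PR's actual `evaluate_archivist_gate()` signature is not shown on this page.

```python
from enum import Enum


class Verdict(str, Enum):
    """The three archivist verdicts named in the summary."""
    DISMISSED = "dismissed"
    SOFT_FLAG = "soft_flag"
    HARD_GATE = "hard_gate"


def gate_action(verdicts: list[Verdict]) -> str:
    """Map archivist verdicts to a pipeline action (hypothetical helper)."""
    if not verdicts:
        # Assumption: no escalation means the archivist never ran.
        return "skip"
    if Verdict.HARD_GATE in verdicts:
        # A hard gate forces a retry before tournament matches proceed.
        return "retry"
    # dismissed and soft_flag verdicts do not block the generation.
    return "proceed"
```

Under these assumptions a single hard_gate among several verdicts is enough to trigger a retry, while soft flags only travel along as advisories.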

This is a complete component-level implementation with 85 new tests. The runtime wiring to invoke librarians during a live generation loop (orchestrator _init_library() and the --books flag on the run command) is not yet connected: the individual components are built, tested, and integrated at the DAG/routing/pipeline level, but the final orchestrator plumbing is left for a follow-up.

What's included

Ingestion pipeline (knowledge/ingestion.py)

register_book() — copies markdown, chunks on H1/H2 headings (preserving tables, code fences, math blocks, blockquotes, lists), writes meta.json
chunk_markdown() — heading-based splitter with atomic block preservation and a small-file bypass (~6k token threshold)
ingest_book() — LLM call that reads the full book and produces a condensed reference.md
validate_ingestion(), remove_book(), list_books()
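As a rough illustration of heading-based chunking with atomic blocks, the sketch below splits on H1/H2 headings but never inside a fenced code block. It is not the PR's `chunk_markdown()`: the function name, the 4-characters-per-token estimate, and the parameter `small_file_tokens` are assumptions, and tables, math blocks, blockquotes, and lists are left out for brevity.

```python
import re

HEADING = re.compile(r"^#{1,2} ")    # H1 or H2 only; "### " does not match
FENCE = re.compile(r"^(```|~~~)")    # opening or closing code fence


def chunk_markdown_sketch(text: str, small_file_tokens: int = 6000) -> list[str]:
    """Split markdown on H1/H2 headings, keeping fenced code blocks whole."""
    # Small-file bypass, using a crude estimate of ~4 characters per token.
    if len(text) / 4 <= small_file_tokens:
        return [text]

    chunks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in text.splitlines(keepends=True):
        if FENCE.match(line):
            in_fence = not in_fence
        # Start a new chunk only at a heading that is outside any fence.
        if HEADING.match(line) and not in_fence and current:
            chunks.append("".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

A line such as `# comment` inside a code fence stays in its chunk, which is the property the "atomic block preservation" wording refers to.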

Agent runners

LibrarianRunner: proactive review (run()) and reactive consultation (consult()) with a structured output parser
ArchivistRunner: conditional arbiter with spot_pull_sections() for original passage retrieval and a structured output parser
LibraryToolHandler: handles consult_library tool calls with per-role rate limiting, specific/broadcast routing, and consultation logging
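Per-role rate limiting of consult_library calls could work roughly as follows. `ConsultLimiter` and `try_consume` are hypothetical names, not part of the PR's LibraryToolHandler; the budget would presumably come from the library_max_consults_per_role setting.

```python
from collections import Counter


class ConsultLimiter:
    """Per-role consultation budget (illustrative, not the PR's class)."""

    def __init__(self, max_per_role: int) -> None:
        self.max_per_role = max_per_role
        self._used = Counter()  # role name -> consultations so far

    def try_consume(self, role: str) -> bool:
        """Return True and count the call if the role still has budget."""
        if self._used[role] >= self.max_per_role:
            return False
        self._used[role] += 1
        return True
```

Each role draws from its own counter, so one agent exhausting its budget does not block consultations from the others.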

Pipeline integration

build_mts_dag(active_books=) — dynamically adds librarian_* and archivist nodes with correct dependency edges (translator → librarians → archivist → coach)
Prefix-based routing in RoleRouter — librarian_clean-arch resolves to the librarian model/provider config
evaluate_archivist_gate() — pipeline stage that converts archivist decisions into proceed/retry/skip actions
consult_library added to Agent SDK ROLE_TOOL_CONFIG for competitor, analyst, coach, architect
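The dynamic DAG nodes and the prefix fallback in routing can be sketched together. Both helpers below are hypothetical and only mirror the behavior described in the list (translator → librarians → archivist → coach, and librarian_* resolving to the librarian config); the real build_mts_dag() and RoleRouter are not reproduced here. Tuples are used for dependencies because the commit notes mention tuples for RoleSpec.depends_on.

```python
def library_edges(active_books: list[str]) -> dict[str, tuple[str, ...]]:
    """Return node -> depends_on for the library portion of the DAG."""
    if not active_books:
        return {}  # no books: no librarian or archivist nodes are added
    librarians = tuple(f"librarian_{slug}" for slug in active_books)
    deps: dict[str, tuple[str, ...]] = {
        name: ("translator",) for name in librarians
    }
    deps["archivist"] = librarians   # archivist waits for every librarian
    deps["coach"] = ("archivist",)   # coach runs after the archivist
    return deps


def resolve_config_key(role: str) -> str:
    """Prefix fallback: any librarian_<slug> role uses the librarian config."""
    return "librarian" if role.startswith("librarian_") else role
```

For example, `resolve_config_key("librarian_clean-arch")` yields `"librarian"`, so one model/provider setting can cover every per-book librarian.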

Prompt system

LibraryPromptBundle and build_library_context_block() — structured prompt components listing available books
inject_library_context() — appends library context to any agent prompt
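A minimal sketch of appending a library context block to a prompt, assuming each book is described by a `name` slug and a `title`. The function names, the dict shape, and the heading text are illustrative assumptions rather than the PR's build_library_context_block() or inject_library_context().

```python
def build_context_block(books: list[dict[str, str]]) -> str:
    """Render a short listing of the available books (hypothetical format)."""
    if not books:
        return ""
    lines = ["Available library books:"]
    for book in books:
        lines.append(f"- {book['name']}: {book['title']}")
    return "\n".join(lines)


def inject_context(prompt: str, books: list[dict[str, str]]) -> str:
    """Append the library block to a prompt, or return it unchanged."""
    block = build_context_block(books)
    return f"{prompt}\n\n{block}" if block else prompt
```

Returning the prompt untouched when no books are active keeps the augmentation a no-op for runs that do not use the library.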

Storage & persistence

ArtifactStore methods: write_librarian_notes(), read/append_cumulative_notes(), write_archivist_decision(), write_active_books(), append_consultation_log()
SkillPackage.active_library_books field for knowledge export

CLI commands

autoctx add-book --title "..." [--name slug] [--author ...] [--tag ...]
autoctx list-books
autoctx remove-book

Data contracts & configuration

LibrarianFlag, LibrarianOutput, ArchivistDecision, ArchivistOutput dataclasses
AgentOutputs extended with librarian_outputs, archivist_output, library_advisories
12 new AppSettings fields, including library_root, library_books, librarian_enabled, model_librarian, model_archivist, librarian_provider, archivist_provider, library_max_consults_per_role, ingestion_model

Documentation

Full design spec: docs/superpowers/specs/2026-03-13-librarian-archivist-design.md
21-task implementation plan: docs/superpowers/plans/2026-03-13-librarian-archivist.md
CLAUDE.md and README.md updated with new roles, commands, and config variables

What's NOT included (follow-up work)

--books flag on the run CLI command
AgentOrchestrator._init_library() — the method that instantiates LibrarianRunner/ArchivistRunner and plugs them into the live pipeline
build_role_handler() cases for librarian_* and archivist roles

These are the final wiring steps that connect the tested components to the running generation loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve DAG timing ambiguity, add parser format specs, define PromptBundle extension strategy, specify pipeline integration path, add Agent SDK tool registration, fix stage numbering, add ingestion error handling, defer multi-pass to follow-up. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix chunking tests (pad inputs past small-file threshold), use tuples for RoleSpec.depends_on, add claude_skills_path to ArtifactStore tests, fix stage file path, add librarian_enabled guard. Add 4 missing tasks: LLM ingestion call, orchestrator wiring, prompt injection, knowledge export extension. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…uration settings Chunk 1 of librarian/archivist implementation: LibrarianFlag, LibrarianOutput, ArchivistDecision, ArchivistOutput dataclasses; library fields on AgentOutputs; AppSettings with library_root, library_books, librarian/archivist model/provider settings, and ingestion_model. Also fix Windows compatibility for resource module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Chunk 2: slugify, chunk_markdown (heading-based splitting with atomic block preservation for tables/code/math/blockquotes/lists), register_book, validate_ingestion, remove_book, list_books. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… and test style fixes Chunk 3: LibrarianRunner (parser + run/consult), ArchivistRunner (parser + run/noop + spot_pull_sections + has_violations). Chunk 4: LibraryToolHandler with rate limiting, routing, and consultation logging. Test style: all library tests updated to match codebase conventions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Task 8: prefix-based routing for librarian_*/archivist in RoleRouter. Task 9: build_mts_dag() now accepts active_books for dynamic librarian nodes. Task 10: LibraryPromptBundle and build_library_context_block in templates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Task 11: library persistence methods on ArtifactStore (notes, decisions, logs). Task 12: evaluate_archivist_gate stage for pipeline hard_gate/soft_flag/skip. Task 13: consult_library added to Agent SDK ROLE_TOOL_CONFIG for main roles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ort extension Task 17: end-to-end integration test for full library flow. Task 18: ingest_book() LLM call to produce reference.md. Task 19: orchestrator wiring tests for library configuration. Task 20: inject_library_context for agent prompt augmentation. Task 21: active_library_books field on SkillPackage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Baeora and others added 16 commits March 13, 2026 12:26

docs: add librarian & archivist agent design spec

7a38049

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: address remaining spec review gaps

670125b

Add prefix-based role routing for dynamic librarian names, specify AgentOutputs field additions, clarify per_role_tools is a new mechanism, note controller override interaction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add librarian/archivist implementation plan

5fc2bbd

17 tasks across 9 chunks covering contracts, ingestion, agent runners, consult_library tool, DAG/routing/prompts, storage/gate/pipeline, CLI, docs, and integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: fix final plan review issues

e159180

Wire prefix fallback into existing route() method internals instead of separate methods. Wire ingest_book() into add-book CLI command with graceful fallback when no provider configured. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(library): add add-book, list-books, remove-book CLI commands

f883dd9

Task 14: CLI commands for library management via autoctx. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add librarian/archivist to CLAUDE.md

d94f86b

Agent roles, generation loop stages, knowledge system, CLI commands, and configuration variables documented. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: add librarian/archivist and library commands to README

67e6a4c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'greyhaven-ai:main' into main

644f9d6

Baeora wants to merge 16 commits into greyhaven-ai:main from Baeora:main
