Automatic LLM-powered summaries of AI-assisted development sessions.
Part of the CodeSteward platform: Govern, Verify, Evolve.
Quick Start · How It Works · Configuration
A background service that reads raw audit events from ClickHouse (written by codesteward-audit-proxy), summarizes development sessions using an LLM, and writes structured summaries back to ClickHouse. Supports Ollama (local), OpenAI, and Anthropic as LLM providers.
Summaries are served by the `session_summaries` MCP tool in codesteward-mcp.
```
audit-proxy                            codesteward-mcp
writes events ──▶ ClickHouse ── reads summaries ──▶ MCP tool for IDEs
                      │
                      │ poll
                      ▼
           ┌──────────────────┐
           │    Summarizer    │
           │                  │
           │ 1. Discover  ──▶ Find idle, unsummarized sessions
           │ 2. Build     ──▶ Compress events into token-efficient context
           │ 3. Summarize ──▶ LLM produces summary, decisions, tags
           │ 4. Write     ──▶ Structured results back to ClickHouse
           └──────────────────┘
```
Short sessions are summarized in a single LLM pass. Long sessions use a three-phase pipeline (extract structured facts from each chunk, merge deterministically, then synthesize the final summary) so no details are lost through double compression.
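The three phases can be sketched as follows; the `llm` callable and the JSON fact shape are illustrative assumptions, not the service's real API:

```python
# Sketch of the extract-merge-synthesize pipeline for long sessions.
# llm() and the dict-of-lists fact shape are hypothetical placeholders.
def summarize_long_session(chunks, llm):
    # Phase 1: extract structured facts from each chunk independently.
    extractions = [llm("Extract facts as JSON:\n" + chunk) for chunk in chunks]

    # Phase 2: merge deterministically -- a plain dict merge, no LLM call,
    # so nothing is lost to a second round of compression.
    merged = {}
    for facts in extractions:
        for category, items in facts.items():
            merged.setdefault(category, []).extend(items)

    # Phase 3: synthesize the final summary from the merged facts.
    return llm("Summarize these facts:\n" + repr(merged))
```

Only the final phase sees the whole session, and it sees it as compact structured facts rather than twice-summarized prose.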
The summarizer is idempotent: re-running produces the same results. Bumping `SUMMARIZER_VERSION` triggers re-summarization of all sessions. Resumed sessions (with new events after a prior summary) are automatically detected and re-summarized. Each re-summarization creates a new revision, preserving prior summaries.
| Feature | Description |
|---|---|
| Multi-provider LLM | Ollama (local, default), OpenAI, and Anthropic via official SDKs |
| Extract-merge-synthesize | 11 structured fact categories extracted per chunk, merged losslessly, then synthesized |
| Revision history | Re-summarization preserves all prior summaries and chunk extractions |
| Resumed session detection | Automatically re-summarizes sessions that received new events |
| Language-aware budgets | 28 languages with per-language chars-per-token ratios |
| Smart chunking | Prefers natural time gaps (>5 min) over arbitrary character splits |
| Security by default | Secrets stripped, `tool_input` bodies never sent to the LLM |
| Flexible scheduling | Continuous polling (`poll`) or single-run for cron (`once`) |
- Python 3.12+
- uv
- ClickHouse accessible via HTTP
- An LLM provider: Ollama running locally, or an OpenAI/Anthropic API key
```shell
uv sync
uv run python -m summarizer.main
```

The summarizer will auto-pull the configured model on first start.
With OpenAI:

```shell
uv sync --extra openai
LLM_PROVIDER=openai SUMMARIZER_MODEL=gpt-4o-mini OPENAI_API_KEY=sk-... uv run python -m summarizer.main
```

With Anthropic:

```shell
uv sync --extra anthropic
LLM_PROVIDER=anthropic SUMMARIZER_MODEL=claude-haiku-4-5-20251001 ANTHROPIC_API_KEY=sk-ant-... uv run python -m summarizer.main
```

With Docker Compose:

```shell
docker compose up -d
```

This starts both the summarizer and an Ollama sidecar. The model is pulled automatically.
```shell
RUN_MODE=once uv run python -m summarizer.main
```

Processes one batch of sessions and exits; ideal for scheduled jobs.
All configuration is via environment variables:
| Variable | Default | Description |
|---|---|---|
| `CLICKHOUSE_URL` | `http://localhost:8123` | ClickHouse HTTP interface |
| `CLICKHOUSE_USER` | `default` | ClickHouse user |
| `CLICKHOUSE_PASSWORD` | `""` | ClickHouse password |
| `CLICKHOUSE_DATABASE` | `audit` | Database name |
| `LLM_PROVIDER` | `ollama` | LLM provider: `ollama`, `openai`, or `anthropic` |
| `SUMMARIZER_MODEL` | `phi3:mini` | Model name (provider-specific) |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama API endpoint |
| `OPENAI_API_KEY` | `""` | OpenAI API key (required when provider=`openai`) |
| `OPENAI_BASE_URL` | unset | Custom OpenAI-compatible endpoint |
| `ANTHROPIC_API_KEY` | `""` | Anthropic API key (required when provider=`anthropic`) |
| `RUN_MODE` | `poll` | `poll` (continuous) or `once` (single run, then exit) |
| `POLL_INTERVAL_SECONDS` | `300` | Seconds between polling cycles (poll mode only) |
| `SESSION_COOLDOWN_MINUTES` | `30` | Minutes of inactivity before summarizing |
| `LOOKBACK_HOURS` | `168` | How far back to look (default: 7 days) |
| `BATCH_SIZE` | `10` | Max sessions per cycle |
| `CONTEXT_MAX_TOKENS` | `4096` | Model's context window in tokens |
| `SESSION_LANGUAGE` | `en` | Session language for token budget calculation |
| `CONTEXT_MAX_CHARS` | (calculated) | Manual override; skips token calculation if set |
| `SUMMARIZER_VERSION` | `v1` | Bump to force re-summarization |
| `PROMPT_SOURCE` | `code` | `code` = hardcoded prompts, `database` = load from the `prompt_registry` table |
| `EVALUATION_ENABLED` | `false` | When `true`, store full input contexts for downstream evaluation |
| `LOG_LEVEL` | `info` | Logging level |
The character budget for LLM prompts is calculated automatically from `CONTEXT_MAX_TOKENS` and `SESSION_LANGUAGE`. For example, a 32K-token Mistral model with German sessions (`CONTEXT_MAX_TOKENS=32768 SESSION_LANGUAGE=de`) gets a ~94K character budget. Set `CONTEXT_MAX_CHARS` to override the calculation.
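A sketch of that calculation; the ratio values here are illustrative guesses, chosen only so the German example lands near the ~94K figure above:

```python
# Hypothetical chars-per-token ratios (the real table covers 28 languages).
CHARS_PER_TOKEN = {"en": 3.4, "de": 2.87}

def char_budget(context_max_tokens, language="en", override=None):
    """Character budget for LLM prompts. `override` plays the role of
    CONTEXT_MAX_CHARS: when set, the token-based calculation is skipped."""
    if override is not None:
        return override
    ratio = CHARS_PER_TOKEN.get(language, CHARS_PER_TOKEN["en"])
    return int(context_max_tokens * ratio)

char_budget(32768, "de")  # roughly 94K characters
```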
Migration files are Goose-compatible and can also be applied manually.
Each release publishes a migrations container to GHCR. It uses Goose to apply migrations with automatic version tracking:
```shell
docker run --rm --network host \
  -e GOOSE_DBSTRING="http://default:@localhost:8123/audit" \
  ghcr.io/codesteward/codesteward-session-summarizer-migrations:latest
```

Note: use `http://host:8123` for the HTTP protocol or `tcp://host:9000` for the native TCP protocol.
The migration files work as plain SQL; the Goose annotations are comments that have no effect when run directly:
```shell
clickhouse-client --multiquery < migrations/001_session_summaries.sql
clickhouse-client --multiquery < migrations/002_session_chunk_extractions.sql
clickhouse-client --multiquery < migrations/003_prompt_registry.sql
clickhouse-client --multiquery < migrations/004_prompt_provenance.sql
clickhouse-client --multiquery < migrations/005_evaluation_contexts.sql
```

- `001_session_summaries.sql`: creates the `session_summaries` table with revision-based history
- `002_session_chunk_extractions.sql`: creates the `session_chunk_extractions` table for per-chunk fact extractions
- `003_prompt_registry.sql`: creates the `prompt_registry` table for database-driven prompt management
- `004_prompt_provenance.sql`: adds `prompt_id`, `prompt_hash`, and `input_context_hash` columns to the output tables
- `005_evaluation_contexts.sql`: creates TTL-managed tables for storing evaluation input contexts
```shell
# Install all dependencies (including dev + all provider SDKs)
uv sync --all-extras

# Run tests
uv run pytest

# Lint
uv run ruff check src/ tests/
uv run ruff format --check src/ tests/
```

Apache-2.0. See LICENSE for details.
