feat(importers): read full Claude Code transcripts from ~/.claude/projects by seilk · Pull Request #40 · NousResearch/hermes-agent-self-evolution

seilk · 2026-04-26T14:34:31Z

Summary

ClaudeCodeImporter previously only read ~/.claude/history.jsonl, which logs user prompts only — no assistant responses. That made Claude Code the one sessiondb source producing unpaired examples while Copilot and Hermes both yielded (task_input, assistant_response) pairs.

Modern Claude Code (≥ 2.x) writes full session transcripts to ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl, with interleaved user, assistant, attachment, permission-mode, etc. records. Downstream code (RelevanceFilter, build_dataset_from_external) already plumbs assistant_response through (see external_importers.py lines 504–505), so the data-shape gap was the only blocker.
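The directory layout described above can be walked with a few lines of pathlib. This is a minimal sketch, not the PR's code: `iter_session_files` is a hypothetical helper name, and it assumes one subdirectory per project with one `.jsonl` file per session.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_session_files(projects_dir: Path) -> Iterator[Tuple[str, Path]]:
    """Yield (project_name, session_path) for every *.jsonl transcript.

    Hypothetical helper: assumes the layout
    <projects_dir>/<encoded-cwd>/<session-id>.jsonl described above.
    """
    if not projects_dir.is_dir():
        return  # missing projects/ directory: nothing to yield
    for project_dir in sorted(projects_dir.iterdir()):
        if not project_dir.is_dir():
            continue  # ignore stray files at the top level
        for session_path in sorted(project_dir.glob("*.jsonl")):
            yield project_dir.name, session_path
```

Sorting keeps the walk deterministic, which makes a `limit` parameter easier to test; whether the real importer orders sessions this way is not stated in the PR.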

This PR closes that gap.

What changed

evolution/core/external_importers.py

New ClaudeCodeImporter.PROJECTS_DIR = ~/.claude/projects.
extract_messages(limit, source=...) gains a source arg:
- "auto" (default) — prefer projects/, fall back to history.jsonl
- "projects" — transcripts only
- "history" — flat log only (legacy semantics)
New _parse_claude_code_session(path, project) helper, mirroring _parse_copilot_events. It walks records in order, tracks the current user prompt, accumulates text blocks across consecutive assistant turns (so tool-call interleavings still produce one clean response), skips tool_result user records and short prompts, and rejects pairs containing detected secrets in either side.
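The pairing logic described for _parse_claude_code_session can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: the record shape (`type` plus `message.content` as a string or a list of typed blocks), the `MIN_PROMPT_CHARS` threshold, the `SECRET_RE` pattern, and the function name `parse_session_lines` are all hypothetical stand-ins.

```python
import json
import re
from typing import Dict, Iterable, List

MIN_PROMPT_CHARS = 10  # hypothetical threshold for "short prompts"
# Hypothetical stand-in for the importer's real secret detector.
SECRET_RE = re.compile(r"(sk-[A-Za-z0-9]{16,}|AKIA[0-9A-Z]{16})")


def _text_blocks(content) -> List[str]:
    """Return the text parts of a message's content (str or list of blocks)."""
    if isinstance(content, str):
        return [content]
    if not isinstance(content, list):
        return []
    return [
        block.get("text", "")
        for block in content
        if isinstance(block, dict) and block.get("type") == "text"
    ]


def _is_tool_result(content) -> bool:
    """True when a user record only carries tool output, not a new prompt."""
    return isinstance(content, list) and any(
        isinstance(block, dict) and block.get("type") == "tool_result"
        for block in content
    )


def parse_session_lines(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Turn JSONL transcript lines into (task_input, assistant_response) pairs."""
    pairs: List[Dict[str, str]] = []
    prompt = None
    chunks: List[str] = []

    def flush() -> None:
        # Emit the pending pair if it has both sides and no detected secret.
        nonlocal prompt, chunks
        response = "\n".join(c for c in chunks if c.strip())
        has_secret = bool(
            prompt and (SECRET_RE.search(prompt) or SECRET_RE.search(response))
        )
        if prompt and response and not has_secret:
            pairs.append({"task_input": prompt, "assistant_response": response})
        prompt, chunks = None, []

    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed JSONL line: skip it
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        kind = record.get("type")
        if kind == "user":
            if _is_tool_result(content):
                continue  # keep accumulating the current assistant response
            flush()
            text = "\n".join(_text_blocks(content)).strip()
            prompt = text if len(text) >= MIN_PROMPT_CHARS else None
        elif kind == "assistant" and prompt is not None:
            chunks.extend(_text_blocks(content))  # tool_use blocks are dropped
        # Other record types (attachment, permission-mode, ...) are ignored.
    flush()
    return pairs
```

Flushing only when the next real user prompt arrives is what lets a text → tool_use → tool_result → text sequence collapse into a single response.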

tests/core/test_external_importers.py

Existing TestClaudeCodeImporter tests updated to pass source="history" so they remain isolated from the new auto behavior.
New TestClaudeCodeProjectsImporter class with 12 tests covering:
- paired turn extraction
- multi-block assistant concatenation (text → tool_use → text)
- skipping tool_result user records
- secret redaction on either side of the pair
- short prompt filtering
- pairs with no assistant text (drops, doesn't pollute)
- malformed JSONL lines
- multi-project, multi-session walking
- limit parameter
- missing projects/ directory
- auto mode prefers projects, and falls back to history when projects is empty

Verification

$ pytest tests/ -q
152 passed, 11 warnings in 53.05s

Smoke-tested on a real ~/.claude/projects/ dataset (32k+ session messages):

ClaudeCodeImporter.extract_messages(limit=5, source='projects')
# => 5 pairs with both task_input and assistant_response populated

Compatibility

Backwards-compatible: extract_messages() with no args returns paired data when projects/ exists, identical legacy data otherwise. Existing downstream consumers (build_claude_code_examples, RelevanceFilter.score) already use msg.get("assistant_response", ""), so they handle both shapes.
source="history" preserves the exact prior behavior for any caller that wants it.
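The three source modes can be summarized in a small dispatch sketch. The function name `select_messages` and the injected `read_projects` / `read_history` callables are hypothetical, introduced only so the example runs on its own; the actual extract_messages signature in the PR takes just `limit` and `source`.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Reader = Callable[[int], List[Message]]


def select_messages(
    limit: int,
    source: str = "auto",
    *,
    read_projects: Reader,
    read_history: Reader,
) -> List[Message]:
    """Pick a reader according to `source` (hypothetical sketch)."""
    if source not in {"auto", "projects", "history"}:
        raise ValueError(f"unknown source: {source!r}")
    if source == "history":
        return read_history(limit)  # legacy flat log only
    messages = read_projects(limit)
    if messages or source == "projects":
        return messages  # "projects" never falls back
    return read_history(limit)  # "auto" with no transcripts found


def response_of(msg: Message) -> str:
    # Same access pattern the PR says downstream consumers already use,
    # so history-only messages simply yield an empty response.
    return msg.get("assistant_response", "")
```

Because consumers read the response with a default of "", messages from either reader flow through unchanged.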

Why this matters for Claude Code users

The repo's pitch is sessiondb-based skill evolution. With this fix, Claude Code becomes a first-class data source on par with Hermes and Copilot, instead of a degraded user-only fallback. The richer data also unlocks better LLM relevance scoring downstream — the scorer already accepts assistant_response as input but was always getting an empty string from Claude Code.

Related context: the original sessiondb proposal (#3) and the recent sessiondb-quality fixes in #26, neither of which addressed the projects/ transcript source.

…jects

Claude Code stores rich session transcripts (user prompts + assistant responses + tool calls) at ~/.claude/projects/<encoded-cwd>/<id>.jsonl. The previous ClaudeCodeImporter only read ~/.claude/history.jsonl, which is a flat log of *user prompts only*, with no assistant responses. That meant Claude Code was the only sessiondb source that produced unpaired examples, while Copilot and Hermes both yielded (task_input, assistant_response) pairs.

Downstream consumers (RelevanceFilter, build_dataset_from_external) already plumb assistant_response through, so the data shape gap was the only blocker.

Changes:
- Extend ClaudeCodeImporter with PROJECTS_DIR + _extract_from_projects.
- Add _parse_claude_code_session helper, mirroring _parse_copilot_events. Handles user/assistant interleaving, tool_use/tool_result skipping, multi-block assistant turns, malformed JSON, and secret redaction.
- New `source` arg on extract_messages: "auto" (default, prefers projects/, falls back to history.jsonl), "projects", or "history".
- Existing tests updated to pass `source="history"` (now explicit), plus 12 new tests covering pair extraction, tool-result skipping, secret filtering, multi-session walking, limits, and auto fallback.

Verified on real ~/.claude/projects/ data: yields paired examples with both task_input and assistant_response fields.

Closes the data-quality gap noted in NousResearch#3 for Claude Code users.

innoscoutpro mentioned this pull request Apr 27, 2026

fix: integrate critical self-evolution pipeline fixes #42

Open

seilk wants to merge 1 commit into NousResearch:main from seilk:feat/claude-code-projects-importer
