graph_interp follow-up: DSL, PMI, dead code cleanup #432

Closed
ocg-goodfire wants to merge 11 commits into dev from merge/2a-graph-interp

@ocg-goodfire
Collaborator

Post-merge improvements to graph_interp (after #427 was squash-merged):

  • Markdown DSL (spd/graph_interp/markdown.py): block-based builder that replaces f-string soup in prompts
  • PMI point-lookup via a CorrelationStorage.pmi() method (dropped Jaccard and the top-k co-firing query)
  • Inlined _save_edges wrapper
  • Made subrun_id and harvest_subrun_id required (no implicit generation)
  • Fixed CLI crash (missing required param)
  • Deleted dead code: export_html.py, is_later_layer, save_config, get_completed_*_keys, config table
  • Fixed progress bar over-count in unification phase

Review by Claude Code (Opus 4.6).

ocg-goodfire and others added 11 commits March 6, 2026 12:25
Three-phase pipeline using attribution graph structure:
- Output pass (late→early): "What does this component DO?"
- Input pass (early→late): "What TRIGGERS this component?"
- Unification: synthesizes both into a single label

Includes GraphInterpDB (SQLite via open_nfs_sqlite — NFS-safe, no WAL),
GraphInterpRepo, CLI (spd-graph-interp), prompt construction with
attribution edges and activation examples.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
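
The traversal order of the three phases can be sketched as below. This is only an illustration: `run_pipeline`, the callback signatures, and the idea that each pass sees the labels produced so far in that pass are assumptions, not the code in this PR.

```python
from typing import Callable

Labels = dict[str, str]


def run_pipeline(
    layers: list[list[str]],  # component keys grouped by layer, earliest layer first
    describe_output: Callable[[str, Labels], str],  # hypothetical "what does it DO?" call
    describe_input: Callable[[str, Labels], str],  # hypothetical "what TRIGGERS it?" call
    unify: Callable[[str, str], str],  # hypothetical synthesis of both labels
) -> Labels:
    # Output pass: walk late -> early so labels of later layers already exist.
    output_labels: Labels = {}
    for layer in reversed(layers):
        for key in layer:
            output_labels[key] = describe_output(key, output_labels)

    # Input pass: walk early -> late so labels of earlier layers already exist.
    input_labels: Labels = {}
    for layer in layers:
        for key in layer:
            input_labels[key] = describe_input(key, input_labels)

    # Unification: one final label per component from its two partial labels.
    return {key: unify(output_labels[key], input_labels[key]) for key in output_labels}
```

In the real pipeline the callbacks would presumably be LLM calls whose prompts include attribution edges and activation examples; here they are plain functions so the ordering is easy to inspect.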
- export_html.py is unneeded
- DONE_MARKER imported from autointerp.db instead of redeclared

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
subrun_id generation moved to run_slurm.py (the only caller). The
worker script now requires it explicitly — no hidden auto-generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
No more implicit "use most recent" — callers must specify which
harvest data to use. Removes hidden state dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…firing

- Add get_pmi() point-lookup to harvest/analysis.py
- Replace _build_cofiring_lookup (top-k query + manual join) with
  direct get_pmi() per attributed component
- Drop jaccard from RelatedComponent (PMI is more informative)
- Update prompt to show co-firing PMI instead of Jaccard
- RelatedComponent: 6 fields → 5, removed jaccard/pmi split

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The point-lookup lives on the data it operates on. Removes the free
function from analysis.py. graph_context now calls
correlation_storage.pmi(key_a, key_b) directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
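
For readers unfamiliar with the measure, a PMI point-lookup as a method on the storage object might look roughly like this. Only the `pmi(key_a, key_b)` call shape comes from the commit message; the class name `CorrelationStorageSketch`, its fields, and the `None` return for undefined pairs are assumptions made for the sketch.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CorrelationStorageSketch:
    """Hypothetical stand-in for CorrelationStorage; all field names are assumed."""

    n_tokens: int  # total number of positions observed
    fire_counts: dict[str, int] = field(default_factory=dict)  # key -> firing count
    cofire_counts: dict[frozenset, int] = field(default_factory=dict)  # unordered pair -> joint count

    def pmi(self, key_a: str, key_b: str) -> Optional[float]:
        """Return log(P(a, b) / (P(a) * P(b))), or None when it is undefined."""
        joint = self.cofire_counts.get(frozenset((key_a, key_b)), 0)
        count_a = self.fire_counts.get(key_a, 0)
        count_b = self.fire_counts.get(key_b, 0)
        if joint == 0 or count_a == 0 or count_b == 0:
            return None
        # (joint / n) / ((count_a / n) * (count_b / n)) simplifies to the ratio below.
        return math.log(joint * self.n_tokens / (count_a * count_b))
```

A caller then asks for exactly the pairs it needs, one per attributed component, instead of running a top-k query and joining the results by hand.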
- run_slurm_cli.py: add required harvest_subrun_id param (was crashing)
- Delete: save_config + config table, get_completed_output/input_keys,
  is_later_layer — all unused
- Unification: pre-filter skipped keys so progress bar count is accurate
  (was showing x/500 when only 400 calls were made)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
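
The progress-bar fix amounts to filtering before counting. A minimal sketch, assuming a hypothetical `unify_all` helper and a `report(done, total)` callback in place of the real progress bar:

```python
from typing import Callable, Iterable


def unify_all(
    keys: Iterable[str],
    already_done: set[str],  # keys that will be skipped (assumed to be known up front)
    unify_one: Callable[[str], str],  # hypothetical per-component unification call
    report: Callable[[int, int], None],  # stand-in for the progress bar
) -> dict[str, str]:
    # Filter first, so the reported total equals the number of calls actually made.
    pending = [key for key in keys if key not in already_done]
    results: dict[str, str] = {}
    for done, key in enumerate(pending, start=1):
        results[key] = unify_one(key)
        report(done, len(pending))
    return results
```

With 500 keys of which 100 are skipped, the total reported is 400 rather than 500.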
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Atomic unit is now a block, not a line. build() joins with \n\n.
bullet() → bullets() taking a list (consecutive bullets = one block).
blank() removed — block separation is automatic. Optional sections
just work without manual spacing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
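
The block-based design can be illustrated with a small builder. `build()` and `bullets()` are named in the commit message; the class name `Md` and the `h()` and `p()` methods are hypothetical, and the real spd/graph_interp/markdown.py may differ.

```python
class Md:
    """Hypothetical block-based Markdown builder; each call adds at most one block."""

    def __init__(self) -> None:
        self._blocks: list[str] = []

    def h(self, level: int, text: str) -> "Md":
        # A heading is one block.
        self._blocks.append(f"{'#' * level} {text}")
        return self

    def p(self, text: str) -> "Md":
        # A paragraph is one block.
        self._blocks.append(text)
        return self

    def bullets(self, items: list[str]) -> "Md":
        # Consecutive bullets form a single block; an empty list adds nothing.
        if items:
            self._blocks.append("\n".join(f"- {item}" for item in items))
        return self

    def build(self) -> str:
        # Blocks are separated by one blank line, so no manual spacing is needed.
        return "\n\n".join(self._blocks)


prompt = (
    Md()
    .h(2, "Component")
    .p("Describe what this component does.")
    .bullets(["edge A", "edge B"])
    .bullets([])  # optional section with no items leaves no stray blank lines
    .build()
)
```

Because separation happens only in `build()`, an optional section that ends up empty does not leave stray blank lines behind, which is what makes a `blank()` helper unnecessary.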
@ocg-goodfire
Collaborator Author

Folding into PR3 (#431) instead
