Merge 2b+3: investigate + app cleanup + DSL prompts#433

Closed
ocg-goodfire wants to merge 23 commits into dev from merge/2b-investigate

Conversation

@ocg-goodfire
Collaborator

Summary

Combines the remaining merge branches into dev:

  • merge/3-app-cleanup (Merge 3/3: App overhaul + clustering + cleanup #431): App overhaul, clustering, frontend cleanup, DB fixes, dead code removal
  • investigate module fixes: Simplify run_agent to single inv_id arg (reads config from metadata.json), fail-fast wait_for_backend, fail-fast _format_model_info
  • Md DSL for all LLM prompts: Convert compact_skeptical, dual_view, and graph_interp prompt formatters from f-string concatenation to the Md block-based DSL. Extract shared token_pmi_pairs helper. Add labeled_list to Md.
  • Misc: Fix pre-existing basedpyright warnings, harvest_subrun_id required everywhere, dead scripts deleted

Test plan

  • make check passes (0 errors, 0 warnings)
  • make test passes (455 passed, 13 skipped)
  • Three-agent code review on investigate module (reuse, quality, efficiency)
  • Code simplifier review on DSL changes

🤖 Generated with Claude Code

ocg-goodfire and others added 23 commits March 6, 2026 12:26
SLURM-launched Claude Code agent that investigates specific research
questions about SPD models via MCP tools. Isolated from global config.

Includes research log system (append-only JSONL + markdown), agent
prompt with SPD-specific tool guidance, CLI (spd-investigate).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
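
The append-only JSONL research log mentioned in this commit could look roughly like the sketch below. This is a minimal illustration, not the module's actual code: the file name `research_log.jsonl`, the event fields, and the helper names `append_event` and `read_events` are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def read_events(log_path: Path) -> list[dict]:
    """Read every event back in the order it was appended."""
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "research_log.jsonl"  # hypothetical file name
    append_event(log, {"kind": "note", "text": "started"})  # hypothetical fields
    append_event(log, {"kind": "finding", "text": "component fires on digits"})
    events = read_events(log)
```

Opening the file in `"a"` mode means a crashed agent leaves earlier entries untouched, and a markdown view can be regenerated from the JSONL at any time.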
App Backend:
- AppTokenizer: server-side token display
- Refactored graph computation, absolute-target attribution edges
- SQLite prompt DB on NFS with DELETE journal + fcntl.flock locking
- New routers: graph_interp, investigations, MCP, pretrain_info, run_registry, data_sources
- Unified InterventionResult, target-sans masking, masked predictions
- Spotlight mode, configurable optimization loss (CE/KL/positional)
- Removed get_attribution_strength MCP tool (storage method was deleted)

App Frontend:
- Canvas edges, spotlight mode, 50K edge limit
- New: DataSourcesTab, InvestigationsTab, ClustersTab, ModelGraph,
  DatasetExplorerTab, OptimizationSettings
- Design system: CSS variables, token probability coloring
- Lazy loading, bulk endpoints, Loadable<T> pattern

Clustering:
- CUDA support, memory optimizations, Pile model configs

Cleanup:
- Remove scratch files, CLAUDE.md updates, .gitignore additions

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
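
For readers unfamiliar with the "DELETE journal + fcntl.flock" combination, here is a minimal sketch of how such locking could be wired up. It assumes a sidecar lock file next to the database and a Unix host; the helper name `locked_connection`, the `.lock` suffix, and the `prompts` table are hypothetical and not taken from the PR.

```python
import fcntl
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked_connection(db_path: Path):
    """Yield a SQLite connection while holding an exclusive flock on a sidecar file."""
    lock_path = db_path.with_suffix(".lock")  # hypothetical sidecar lock file
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            conn = sqlite3.connect(db_path)
            try:
                # Rollback-journal mode. WAL relies on shared memory and is not
                # supported across hosts on a network filesystem.
                conn.execute("PRAGMA journal_mode=DELETE")
                yield conn
                conn.commit()
            finally:
                conn.close()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "prompts.db"
    with locked_connection(db) as conn:
        conn.execute("CREATE TABLE prompts (id INTEGER PRIMARY KEY, text TEXT)")
        conn.execute("INSERT INTO prompts (text) VALUES (?)", ("hello",))
    with locked_connection(db) as conn:
        rows = conn.execute("SELECT text FROM prompts").fetchall()
```

Serializing every access behind one coarse lock trades throughput for simplicity, which is usually acceptable for a small shared prompt database.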
- Delete scripts/{migrate_harvest_data,test_abs_grad_trick,parse_transformer_circuits_post}.py
- Remove spd/app/TODO.md (moved to ~/app-todo-2026-03-04.md for reference)
- Remove hardcoded partition="h200-reserved" in investigations.py
- Narrow bare except Exception to json.JSONDecodeError in investigations.py
- Add exhaustive match default in graph_interp.py (was NameError on unexpected pass_name)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Matches the signature in #428 (investigate module). Uses
DEFAULT_PARTITION_NAME instead of hardcoded string. TODO to remove
when investigate module drops the required partition param.

make check now passes with 0 errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Remove ~87 lines of commented-out _tool_get_component_attributions in mcp.py
- Remove unused DEVICE constant + get_device import + stale TODO in prompts.py
- Remove unused ActivationContextsGenerationConfig from schemas.py

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Checked SPD_OUT_DIR/app/prompt_attr.db: ci_masked_label_prob,
stoch_masked_label_prob, adv_pgd_label_prob all exist.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The CREATE TABLE statement already includes edges_data_abs (and all
metric columns). The real DB at SPD_OUT_DIR has all columns present.
No legacy DBs without these columns exist.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The prompt DB is no longer disposable — it's shared team state on NFS.
Schema changes need manual ALTER TABLE with backups. CREATE TABLE
statements are the source of truth.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
MeanKL: single version used F.kl_div(reduction="batchmean") which for
[1, seq, vocab] gives sum over all positions. Batched version used
.sum(-1).mean(-1) giving mean over positions. These differ by a factor
of seq_len. Fixed single version to match batched (mean over positions).

search_tokens: was running model forward pass without GPU lock, risking
concurrent CUDA ops with graph computation. Added manager.gpu_lock().

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
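
The factor-of-seq_len discrepancy in MeanKL can be checked with a small pure-Python calculation. The sketch below does not call PyTorch; it reproduces the two reductions by hand for a toy `[1, seq, vocab]` input, relying on the documented behavior that `reduction="batchmean"` divides the total sum by the batch size. The logits and helper names are made up for illustration.

```python
import math


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one vocab row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    z = sum(exps)
    return [e / z for e in exps]


def kl_per_position(target_logits, pred_logits) -> list[float]:
    """KL(target || pred) at each position, summed over the vocab dimension."""
    out = []
    for t_row, p_row in zip(target_logits, pred_logits):
        t, p = softmax(t_row), softmax(p_row)
        out.append(sum(ti * (math.log(ti) - math.log(pi)) for ti, pi in zip(t, p)))
    return out


# Toy input: batch size 1, seq_len 3, vocab 4 (values are arbitrary).
target = [[2.0, 0.5, 0.1, -1.0], [0.0, 1.0, 0.0, 1.0], [3.0, 0.0, 0.0, 0.0]]
pred = [[1.5, 0.7, 0.0, -0.5], [0.2, 0.8, 0.1, 0.9], [2.0, 0.5, 0.1, 0.0]]

per_pos = kl_per_position(target, pred)
batch_size, seq_len = 1, len(per_pos)

# "batchmean" style: total sum divided by batch size, i.e. a sum over positions here.
batchmean_style = sum(per_pos) / batch_size
# ".sum(-1).mean(-1)" style: sum over vocab, then mean over positions.
mean_over_positions = sum(per_pos) / seq_len

ratio = batchmean_style / mean_over_positions  # equals seq_len up to rounding
```

With a batch of one, the two reductions differ by exactly `seq_len`, which is why the single-prompt loss was larger than the batched one before the fix.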
Previously asserted and crashed after the graph was already saved to
the DB, leaving an orphaned graph with no base intervention run. Now
logs a warning and returns early.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
~112 lines of dead mock data, mock functions, and MOCK_MODE branches.
Also remove the cross-router MOCK_MODE import in runs.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…S_PER_POS

database.py:
- Remove ForkedInterventionRunRecord class
- Remove forked_intervention_runs table + index from schema
- Remove fork cleanup from delete_prompt
- Remove save/get/delete_forked_intervention_run methods
- Remove unused: delete_graphs_for_prompt, delete_graphs_for_run,
  delete_intervention_runs_for_graph, get_intervention_run

graphs.py:
- Import MAX_OUTPUT_NODES_PER_POS from compute.py instead of redefining

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
All CLI entrypoints and pipeline functions now require explicit
harvest_subrun_id. Eliminates silent fallback to open_most_recent()
which could pick up stale data from a different config.

- autointerp run_interpret.py: main() and get_command() require it
- autointerp run_slurm.py: submit_autointerp() requires it
- autointerp run_slurm_cli.py: CLI requires --harvest_subrun_id
- autointerp scoring/run_label_scoring.py: main() and get_command() require it
- dataset_attributions config.py: harvest_subrun_id is required on config
- dataset_attributions harvest.py: _build_alive_masks requires it
- dataset_attributions run_slurm.py: removed harvest_subrun_id param (now in config)
- postprocess __init__.py: sets harvest_subrun_id on attr config from harvest result

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
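
As an illustration of making the argument mandatory at the CLI boundary, here is a minimal sketch using `argparse`. The PR does not say which CLI library the scripts use, so `argparse`, the `build_parser` helper, and the example ID `h-0001` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that refuses to run without an explicit harvest subrun."""
    parser = argparse.ArgumentParser(prog="run_interpret")
    parser.add_argument(
        "--harvest_subrun_id",
        required=True,  # no default, so there is no silent "most recent" fallback
        help="ID of the harvest subrun to read from",
    )
    return parser


# Hypothetical subrun ID, used only for this example.
args = build_parser().parse_args(["--harvest_subrun_id", "h-0001"])
```

Omitting the flag makes `argparse` print a usage error and exit, so a stale subrun can no longer be picked up by accident.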
Same block-based DSL from graph_interp, canonical location for
autointerp strategies to import from too.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Convert compact_skeptical, dual_view, and graph_interp prompt formatters
from f-string concatenation to the Md block-based DSL. Extract shared
token_pmi_pairs helper into prompt_helpers. Add labeled_list to Md for
the common bold-header + bullet-items pattern.

Also fix two pre-existing basedpyright warnings: DONE_MARKER import
path in graph_interp/repo.py and unused param in test_storage.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
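
To make the idea of a block-based prompt DSL concrete, here is a hypothetical sketch. Only the names `Md` and `labeled_list` come from the commit message; the method signatures, the `h` and `p` helpers, `build`, and the exact markdown emitted are assumptions and may not match the real class.

```python
class Md:
    """Collect markdown blocks and join them with blank lines (illustrative only)."""

    def __init__(self) -> None:
        self._blocks: list[str] = []

    def h(self, level: int, text: str) -> "Md":
        """Add a heading block (hypothetical helper)."""
        self._blocks.append("#" * level + " " + text)
        return self

    def p(self, text: str) -> "Md":
        """Add a paragraph block (hypothetical helper)."""
        self._blocks.append(text)
        return self

    def labeled_list(self, label: str, items: list[str]) -> "Md":
        """Add a bold header followed by bullet items as one block."""
        bullets = "\n".join(f"- {item}" for item in items)
        self._blocks.append(f"**{label}**\n{bullets}")
        return self

    def build(self) -> str:
        """Render all blocks, separated by one blank line."""
        return "\n\n".join(self._blocks)


prompt = (
    Md()
    .h(2, "Component")
    .labeled_list("Top tokens", ["the", "of"])
    .build()
)
```

Compared with f-string concatenation, each block owns its own spacing, so formatters cannot accidentally drop or double the blank lines between sections.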
# Conflicts:
#	spd/app/backend/routers/graphs.py
#	spd/investigate/scripts/run_agent.py
- run_agent reads all config from metadata.json instead of duplicating
  as CLI args (wandb_path, context_length, max_turns)
- wait_for_backend raises directly instead of returning bool
- _format_model_info accesses keys directly instead of .get() fallbacks

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
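
The fail-fast style described in this commit might look like the sketch below. The function names `wait_for_backend` and the key names `wandb_path`, `context_length`, and `max_turns` come from the commit message; the signatures, the `is_ready` probe, the `load_run_config` helper, and the example values are assumptions.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Callable


def load_run_config(inv_dir: Path) -> dict:
    """Read run settings from metadata.json; a missing key raises KeyError."""
    metadata = json.loads((inv_dir / "metadata.json").read_text())
    return {
        "wandb_path": metadata["wandb_path"],  # direct access, no .get() fallback
        "context_length": metadata["context_length"],
        "max_turns": metadata["max_turns"],
    }


def wait_for_backend(is_ready: Callable[[], bool], timeout_s: float, poll_s: float = 0.01) -> None:
    """Poll until the backend is ready; raise instead of returning a bool."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"backend not ready after {timeout_s} s")
        time.sleep(poll_s)


with tempfile.TemporaryDirectory() as tmp:
    inv_dir = Path(tmp)
    # Hypothetical metadata values, used only for this example.
    (inv_dir / "metadata.json").write_text(
        json.dumps({"wandb_path": "entity/project/run", "context_length": 512, "max_turns": 50})
    )
    config = load_run_config(inv_dir)
```

Raising from `wait_for_backend` means callers cannot forget to check a return value, and direct key access surfaces incomplete metadata at startup rather than deep into a run.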
@ocg-goodfire
Collaborator Author

Superseded by a cleaner PR with just the investigate fixes
