Merge 3/3: App overhaul + clustering + cleanup by ocg-goodfire · Pull Request #431 · goodfire-ai/spd

ocg-goodfire · 2026-03-06T12:32:26Z

Summary

Depends on: PR2 modules (#427 graph_interp, #428 investigate, #429 editing, #430 postprocess) — merge those first.

Passes make check once all dependencies are merged.

App Backend

AppTokenizer: server-side token display
Refactored graph computation, absolute-target attribution edges (∂|y|/∂x · x)
SQLite prompt DB on NFS with DELETE journal + fcntl.flock write locking
New routers: graph_interp, investigations, MCP, pretrain_info, run_registry, data_sources
Unified InterventionResult, target-sans masking, masked predictions (CI/stochastic/adversarial)
Spotlight mode, configurable optimization loss (CE/KL/positional)
Removed get_attribution_strength MCP tool (underlying get_attribution method deleted in PR1)
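
The prompt DB item above pairs SQLite's rollback journal in DELETE mode with an advisory fcntl.flock around writes. SQLite's documentation states that WAL mode does not work over a network filesystem, which is presumably why DELETE is used on NFS. A minimal sketch of that pattern follows. The `locked_write` helper and the sidecar `.lock` file are assumptions for illustration and are not taken from the PR.

```python
import fcntl
import sqlite3
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def locked_write(db_path: Path):
    """Hypothetical helper: hold an exclusive flock for the duration of a write."""
    # Assumption: a sidecar lock file next to the DB serialises writers.
    lock_path = db_path.with_suffix(db_path.suffix + ".lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until no other writer holds it
        try:
            conn = sqlite3.connect(db_path)
            try:
                # Rollback journal in DELETE mode (no shared-memory WAL index).
                conn.execute("PRAGMA journal_mode=DELETE")
                yield conn
                conn.commit()  # skipped if the body raised; close() then discards
            finally:
                conn.close()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


if __name__ == "__main__":
    import tempfile

    db = Path(tempfile.mkdtemp()) / "prompt_attr.db"
    with locked_write(db) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS prompts (id INTEGER PRIMARY KEY, text TEXT)")
        conn.execute("INSERT INTO prompts (text) VALUES (?)", ("hello",))
```

Readers do not take the lock in this sketch; whether the app also locks reads is not stated in the PR.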

App Frontend

Canvas edges, spotlight mode, 50K edge limit
New: DataSourcesTab, InvestigationsTab, ClustersTab, ModelGraph, DatasetExplorerTab, OptimizationSettings
Design system: CSS variables, token probability coloring, adaptive text contrast
Lazy loading, bulk endpoints, Loadable<T> pattern

Clustering

CUDA support, memory optimizations, Pile model configs

Cleanup

Remove scratch files (find_clean_facts.py, etc.)
CLAUDE.md updates throughout
.gitignore additions

Review by Claude Code (Opus 4.6).

App Backend:
- AppTokenizer: server-side token display
- Refactored graph computation, absolute-target attribution edges
- SQLite prompt DB on NFS with DELETE journal + fcntl.flock locking
- New routers: graph_interp, investigations, MCP, pretrain_info, run_registry, data_sources
- Unified InterventionResult, target-sans masking, masked predictions
- Spotlight mode, configurable optimization loss (CE/KL/positional)
- Removed get_attribution_strength MCP tool (storage method was deleted)
App Frontend:
- Canvas edges, spotlight mode, 50K edge limit
- New: DataSourcesTab, InvestigationsTab, ClustersTab, ModelGraph, DatasetExplorerTab, OptimizationSettings
- Design system: CSS variables, token probability coloring
- Lazy loading, bulk endpoints, Loadable<T> pattern
Clustering:
- CUDA support, memory optimizations, Pile model configs
Cleanup:
- Remove scratch files, CLAUDE.md updates, .gitignore additions
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- Delete scripts/{migrate_harvest_data,test_abs_grad_trick,parse_transformer_circuits_post}.py
- Remove spd/app/TODO.md (moved to ~/app-todo-2026-03-04.md for reference)
- Remove hardcoded partition="h200-reserved" in investigations.py
- Narrow bare except Exception to json.JSONDecodeError in investigations.py
- Add exhaustive match default in graph_interp.py (was NameError on unexpected pass_name)
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Matches the signature in #428 (investigate module). Uses DEFAULT_PARTITION_NAME instead of hardcoded string. TODO to remove when investigate module drops the required partition param. make check now passes with 0 errors.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- Remove ~87 lines of commented-out _tool_get_component_attributions in mcp.py
- Remove unused DEVICE constant + get_device import + stale TODO in prompts.py
- Remove unused ActivationContextsGenerationConfig from schemas.py
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Checked SPD_OUT_DIR/app/prompt_attr.db: ci_masked_label_prob, stoch_masked_label_prob, adv_pgd_label_prob all exist.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

The CREATE TABLE statement already includes edges_data_abs (and all metric columns). The real DB at SPD_OUT_DIR has all columns present. No legacy DBs without these columns exist.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

The prompt DB is no longer disposable: it is shared team state on NFS. Schema changes need manual ALTER TABLE with backups. CREATE TABLE statements are the source of truth.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

MeanKL: the single version used F.kl_div(reduction="batchmean"), which for a [1, seq, vocab] input gives the sum over all positions. The batched version used .sum(-1).mean(-1), giving the mean over positions. These differ by a factor of seq_len. Fixed the single version to match the batched one (mean over positions).
search_tokens: was running the model forward pass without the GPU lock, risking concurrent CUDA ops with graph computation. Added manager.gpu_lock().
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
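
The seq_len discrepancy described in the MeanKL note can be checked with a small pure-Python emulation of the two reductions. This is a sketch, not the repository's code: the function names are hypothetical, and the reductions are re-implemented by hand rather than calling torch, assuming "batchmean" divides the total sum by the batch size only.

```python
import math


def kl_per_position(target: list[float], pred: list[float]) -> float:
    """KL(target || pred) summed over the vocab axis for one position."""
    return sum(p * math.log(p / q) for p, q in zip(target, pred) if p > 0)


def kl_batchmean(target, pred) -> float:
    """Emulates reduction="batchmean": sum over every element, divided by batch size."""
    batch_size = len(target)
    total = sum(
        kl_per_position(t_pos, p_pos)
        for t_seq, p_seq in zip(target, pred)
        for t_pos, p_pos in zip(t_seq, p_seq)
    )
    return total / batch_size


def kl_mean_over_positions(target, pred) -> list[float]:
    """Emulates .sum(-1).mean(-1): sum over vocab, then mean over positions."""
    return [
        sum(kl_per_position(t, p) for t, p in zip(t_seq, p_seq)) / len(t_seq)
        for t_seq, p_seq in zip(target, pred)
    ]


# One sequence, i.e. shape [1, seq, vocab] with seq = 3 and vocab = 2.
target = [[[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]]
pred = [[[0.4, 0.6], [0.7, 0.3], [0.3, 0.7]]]

single = kl_batchmean(target, pred)                # sum over positions (batch size is 1)
batched = kl_mean_over_positions(target, pred)[0]  # mean over positions
seq_len = len(target[0])
ratio = single / batched                           # equals seq_len up to float error
```

With a batch of one, dividing by the batch size changes nothing, so the first reduction is a plain sum over positions while the second divides that same sum by seq_len.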

Previously asserted and crashed after the graph was already saved to the DB, leaving an orphaned graph with no base intervention run. Now logs a warning and returns early.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

~112 lines of dead mock data, mock functions, and MOCK_MODE branches. Also remove the cross-router MOCK_MODE import in runs.py.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…S_PER_POS
database.py:
- Remove ForkedInterventionRunRecord class
- Remove forked_intervention_runs table + index from schema
- Remove fork cleanup from delete_prompt
- Remove save/get/delete_forked_intervention_run methods
- Remove unused: delete_graphs_for_prompt, delete_graphs_for_run, delete_intervention_runs_for_graph, get_intervention_run
graphs.py:
- Import MAX_OUTPUT_NODES_PER_POS from compute.py instead of redefining
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

All CLI entrypoints and pipeline functions now require an explicit harvest_subrun_id. This eliminates the silent fallback to open_most_recent(), which could pick up stale data from a different config.
- autointerp run_interpret.py: main() and get_command() require it
- autointerp run_slurm.py: submit_autointerp() requires it
- autointerp run_slurm_cli.py: CLI requires --harvest_subrun_id
- autointerp scoring/run_label_scoring.py: main() and get_command() require it
- dataset_attributions config.py: harvest_subrun_id is required on config
- dataset_attributions harvest.py: _build_alive_masks requires it
- dataset_attributions run_slurm.py: removed harvest_subrun_id param (now in config)
- postprocess __init__.py: sets harvest_subrun_id on attr config from harvest result
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
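
As an illustration of the "required, no fallback" behaviour described above, here is a minimal sketch using argparse. The repository's real CLI layer is not shown in this PR, so the parser, the `prog` name, and the `main` signature are assumptions; only the `--harvest_subrun_id` flag name comes from the commit message.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real entrypoints may use a different CLI library.
    parser = argparse.ArgumentParser(prog="run_interpret")
    parser.add_argument(
        "--harvest_subrun_id",
        required=True,  # no default, so there is nothing to silently fall back to
        help="ID of the harvest subrun whose data should be read",
    )
    return parser


def main(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    # A real entrypoint would open that specific subrun here.
    return args.harvest_subrun_id
```

Calling `main(["--harvest_subrun_id", "some-id"])` returns the ID, while `main([])` exits with a usage error instead of picking the most recent subrun.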

Same block-based DSL from graph_interp, canonical location for autointerp strategies to import from too.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Convert compact_skeptical, dual_view, and graph_interp prompt formatters from f-string concatenation to the Md block-based DSL. Extract shared token_pmi_pairs helper into prompt_helpers. Add labeled_list to Md for the common bold-header + bullet-items pattern.
Also fix two pre-existing basedpyright warnings: DONE_MARKER import path in graph_interp/repo.py and unused param in test_storage.py.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
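
To make the "block-based DSL" idea concrete, the sketch below shows one way such a builder could look. Only the names `Md` and `labeled_list` appear in the commit message. The method signatures, the `h`, `p`, and `build` methods, and the example strings are all assumptions, and the real class in spd/utils/ may differ substantially.

```python
class Md:
    """Hypothetical block-based Markdown builder: each call appends one block."""

    def __init__(self) -> None:
        self._blocks: list[str] = []

    def h(self, level: int, text: str) -> "Md":
        # Heading block, e.g. level 2 gives "## text".
        self._blocks.append(f"{'#' * level} {text}")
        return self

    def p(self, text: str) -> "Md":
        # Plain paragraph block.
        self._blocks.append(text)
        return self

    def labeled_list(self, label: str, items: list[str]) -> "Md":
        # Bold header line followed by one bullet per item.
        lines = [f"**{label}**", *(f"- {item}" for item in items)]
        self._blocks.append("\n".join(lines))
        return self

    def build(self) -> str:
        # Blocks are separated by a single blank line.
        return "\n\n".join(self._blocks)


prompt = (
    Md()
    .h(2, "Component 17")
    .labeled_list("Fires on", ["' the'", "' a'"])
    .build()
)
```

Compared with f-string concatenation, spacing between blocks is decided once in `build()` rather than at every call site.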

…, investigate
- graph_interp/db.py: Extract parameterized _save_label/_get_label/_get_all_labels from 3x3 duplicated CRUD methods
- graph_interp/interpret.py: Unify process_output_layer/process_input_layer via _make_process_layer factory
- autointerp/prompt_helpers.py: Deduplicate build_fires_on_examples/build_says_examples into _build_examples
- graph_interp/prompts.py: Simplify _format_related string building with f-string
- investigate/agent_prompt.py: Replace repetitive config blocks with data-driven loop
- investigate/scripts/run_agent.py: Remove obvious docstrings, simplify fetch_model_info
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…tale docs
Backend:
- graphs.py: Extract _build_loss_config, _build_loss_result, _maybe_pgd_config, _maybe_adv_pgd helpers
- server.py: Move deferred stdlib imports to module-level
- __init__.py: Fix __all__ ordering
- CLAUDE.md: Remove duplicate router entries
- sqlite.py: Fix stale docstring referencing old DB location
Frontend components:
- Deduplicate getTopEdgeAttributions into shared topEdgeAttributions() in promptAttributionsTypes.ts
- Extract generic parseSSEStream<T>() in graphs.ts, eliminating ~50 lines of duplicated SSE parsing
- Extract AVAILABILITY_COLUMNS in RunSelector, reducing ~60 lines of duplicated template
- Eliminate redundant computeMaxAbsComponentAct in ActivationContextsViewer + ClusterComponentCard
- Fix unreachable null check in ClusterComponentCard
- Fix mid-file import in ComponentNodeCard
- Remove dead fork handler stubs in PromptAttributionsTab
- Remove unused isRunEditable export, 5 unused CSS selectors, 12+ unnecessary comments
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…rt both-or-neither Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

ocg-goodfire and others added 16 commits March 6, 2026 12:32

Use ValueError instead of assert False for unexpected pass_name

65ca30f

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Merge remote-tracking branch 'origin/dev' into merge/3-app-cleanup

ce29a31

Remove unreachable default case — pyright proves pass_name is exhaustive

3313f6b

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Revert unnecessary migration — real DB already has the columns

3b6d019


Remove MOCK_MODE and all mock code from graph_interp router

1e2598d


remove dead scripts

c959635

ocg-goodfire mentioned this pull request Mar 6, 2026

graph_interp follow-up: DSL, PMI, dead code cleanup #432

Closed

ocg-goodfire and others added 2 commits March 6, 2026 15:54

Add Md DSL to spd/utils/ for shared use across prompt builders

48b91ac


ocg-goodfire mentioned this pull request Mar 6, 2026

Merge 2b+3: investigate + app cleanup + DSL prompts #433

Closed

4 tasks

ocg-goodfire and others added 3 commits March 6, 2026 17:28

Merge _maybe_pgd_config + _maybe_adv_pgd into single _maybe_pgd, asse…

e5fe094

…rt both-or-neither Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

ocg-goodfire merged commit 4e97229 into dev Mar 6, 2026

ocg-goodfire merged 21 commits into dev from merge/3-app-cleanup
