Draft
519 commits
74cdf00
Restore RMSNorm modules in reverse residual CI
claude-spd1 Feb 2, 2026
e8230c5
Merge branch 'dev' into feature/continuous-pgd
claude-spd1 Feb 2, 2026
2eda121
Rename ContinuousPGD to PersistentPGD and simplify mask shape
claude-spd1 Feb 2, 2026
05b4947
Remove AliveComponentsTracker
claude-spd1 Feb 2, 2026
43ffc26
Add PersistentPGDReconSubsetLoss for subset routing
claude-spd1 Feb 2, 2026
ed48b4e
wip: Add Adam optimizer support for persistent PGD masks
claude-spd1 Feb 2, 2026
53cbea9
wip: Refactor persistent PGD: remove PersistentPGDResult, update mask…
claude-spd1 Feb 2, 2026
833e69f
wip: Fix persistent PGD loss weighting and variable naming
claude-spd1 Feb 3, 2026
db6aa82
Misc changes merged from main
claude-spd1 Feb 4, 2026
8609521
Merge branch 'main' into dev
claude-spd1 Feb 4, 2026
38f37ac
Merge branch 'main' into dev
claude-spd1 Feb 4, 2026
ba149fe
Merge branch 'main' into dev
claude-spd1 Feb 4, 2026
81866c8
Add transition_hidden_dim to GlobalReverseResidualCiFn and new experi…
claude-spd1 Feb 4, 2026
6e243bf
wip: Add scope config to PersistentPGD, integrate into compute_losses
claude-spd1 Feb 4, 2026
b5ffbfb
wip: Extract mask shape computation before loop to reduce duplication
claude-spd1 Feb 4, 2026
aad4413
merge
claude-spd1 Feb 4, 2026
cd04fc5
clean up configs
claude-spd1 Feb 4, 2026
e849997
fix eval compatibility
claude-spd1 Feb 4, 2026
b7a41da
clean up ppgd mask types
claude-spd1 Feb 4, 2026
58c7aeb
remove registry additions
claude-spd1 Feb 4, 2026
16511ae
fix test and include perspgd in eval
claude-spd1 Feb 4, 2026
13b4780
merge
claude-spd1 Feb 4, 2026
4be170e
Merge branch 'main' into dev
claude-spd1 Feb 5, 2026
40c847d
Merge branch 'global-shared-transformer-ci' into dev
claude-spd1 Feb 5, 2026
f9083a3
Add batch_invariant scope to persistent PGD (#358)
ocg-goodfire Feb 6, 2026
d9fddd4
Merge branch 'main' into dev
danbraunai-goodfire Feb 6, 2026
23a8ad7
Make transition_hidden_dim optional in GlobalReverseResidualCiFn
claude-spd1 Feb 6, 2026
929643b
Add optional mem parameter to SlurmConfig
claude-spd1 Feb 6, 2026
4237c6f
Add experiment configs for jan22 runs
claude-spd1 Feb 6, 2026
a05ce11
App frontend updates (#362)
ocg-goodfire Feb 6, 2026
d59ac2d
Explicitly broadcast persistent PGD masks from rank 0
claude-spd1 Feb 7, 2026
f540e85
Merge branch 'main' into dev
danbraunai-goodfire Feb 7, 2026
a3b3bab
Autointerp improvements: strategy pattern, scoring, eval, and reporting
ocg-goodfire Feb 8, 2026
dad6b5b
Add unified spd-postprocess CLI for SLURM dependency-chained postproc…
ocg-goodfire Feb 8, 2026
7f65eaf
Clean up autointerp PR per review feedback
claude-spd1 Feb 9, 2026
8d049dc
Merge branch 'main' into dev
danbraunai-goodfire Feb 9, 2026
0f704fc
Add exhaustive match default cases in dispatch.py
claude-spd1 Feb 9, 2026
f2b4216
Persistent PGD: scope renames, source terminology, sigmoid param, opt…
ocg-goodfire Feb 9, 2026
6713338
Move postprocess pipeline from CLI args to YAML config
Feb 9, 2026
3f917fb
Centralize all SLURM job scheduling in postprocess pipeline
Feb 9, 2026
d23582c
Make autointerp a functional unit that owns its eval jobs
Feb 9, 2026
9403457
Move intruder eval from autointerp into harvest functional unit
Feb 9, 2026
27f0460
Merge branch 'main' into dev
claude-spd1 Feb 9, 2026
ff2499d
Update default output_dir for logs and trained models
danbraunai-goodfire Feb 9, 2026
9ecdb35
Merge branch 'main' into dev
danbraunai-goodfire Feb 9, 2026
c3aac34
Merge remote-tracking branch 'origin/dev' into feature/autointerp-imp…
claude-spd1 Feb 10, 2026
25eb2da
Fix CI: suppress reportUnreachable in dispatch.py, add datasetAttribu…
claude-spd1 Feb 10, 2026
e0b8c10
Generalize app to arbitrary transformer models (#373)
ocg-goodfire Feb 10, 2026
29721bf
Abstract app over models/data/tokenizers and migrate autointerp to Ap…
claude-spd1 Feb 10, 2026
e35b6fa
Merge branch 'main' into dev
danbraunai-goodfire Feb 10, 2026
8f1b5a6
Refactor postprocess config: typed Pydantic configs, single source of…
claude-spd1 Feb 10, 2026
636e6c0
Extract configs into config.py files, flatten harvest/lib/, add AppTo…
claude-spd1 Feb 10, 2026
0a86d1a
Move postprocess into spd/scripts/postprocess/ directory
claude-spd1 Feb 10, 2026
5f14011
Tune defaults (n_batches=1000, n_gpus=8 for attributions), fix stale …
claude-spd1 Feb 10, 2026
9388cb2
Validate wandb_path early in postprocess() before submitting jobs
claude-spd1 Feb 10, 2026
3db2221
Handle Fire parsing config_json as dict instead of str
claude-spd1 Feb 10, 2026
7338da7
Merge remote-tracking branch 'origin/dev' into feature/autointerp-imp…
claude-spd1 Feb 10, 2026
f0e817c
Add migration for unique_per_batch_per_token → per_batch_per_position…
claude-spd1 Feb 10, 2026
3283689
Harvest defaults: batch_size 128, n_batches 2000 (OOM at 256)
claude-spd1 Feb 10, 2026
b4f2c21
Use ModelAdapter modules in AttributionHarvester instead of hardcoded…
claude-spd1 Feb 10, 2026
0471152
Always resolve unembed module in ModelAdapter
claude-spd1 Feb 10, 2026
66245ec
Attribution default batch_size 128 (OOM at 256)
claude-spd1 Feb 10, 2026
6dbd97d
Reify ArchConfig per model class with explicit path patterns
ocg-goodfire Feb 10, 2026
72519a2
Bump harvest merge mem to 128G (OOM at 64G for large models)
claude-spd1 Feb 10, 2026
b6995b9
Fix GPT2 config patterns and update model adapter tests
ocg-goodfire Feb 10, 2026
04791aa
Merge remote-tracking branch 'origin/dev' into feature/autointerp-imp…
claude-spd1 Feb 10, 2026
80dd6c9
Use exclude_none=True in config JSON serialization for Fire compatibi…
claude-spd1 Feb 10, 2026
47277b8
Default reasoning_effort to None in CompactSkepticalConfig
claude-spd1 Feb 10, 2026
502d3e6
Handle HF GPT-2 layer paths in autointerp prompt formatting
claude-spd1 Feb 10, 2026
c0059d2
Add design doc for abstract transformer topology
claude-spd1 Feb 10, 2026
9fe6e54
Add TransformerTopology: unified abstract transformer structure
claude-spd1 Feb 10, 2026
da58d2a
Replace ModelAdapter with TransformerTopology
claude-spd1 Feb 10, 2026
a99eab5
Migrate consumers to TransformerTopology
claude-spd1 Feb 10, 2026
8203c86
Delete ModelAdapter, migrate all consumers to TransformerTopology
claude-spd1 Feb 10, 2026
06f640f
Merge main to dev
danbraunai-goodfire Feb 10, 2026
0236300
Fix stale CLAUDE.md references to deleted ModelAdapter
claude-spd1 Feb 10, 2026
a76f485
Assert exactly one integer segment in block index extraction
claude-spd1 Feb 10, 2026
834861e
Remove role from LayerInfo, make it construction-only
claude-spd1 Feb 10, 2026
0c8fe5d
WIP: PathSchema + CanonicalWeight types
claude-spd1 Feb 10, 2026
8fc9ae3
WIP: PathSchema subclasses, describe moved to frontend
claude-spd1 Feb 10, 2026
1c911a4
WIP: Node.layer is now CanonicalWeight, canonical_to_str, translate a…
claude-spd1 Feb 10, 2026
90b3911
WIP: CanonicalWeight as ABC with canonical_str method
claude-spd1 Feb 10, 2026
e28add6
WIP: Oli's topology tweaks
claude-spd1 Feb 10, 2026
cbdb292
WIP: parse_canonical_str, intervention router translates canonical ->…
claude-spd1 Feb 10, 2026
2cc5e80
WIP: Move parse to CanonicalWeight.parse() static method
claude-spd1 Feb 10, 2026
58b9dd4
WIP: Translate canonical <-> concrete at router boundaries (activatio…
claude-spd1 Feb 10, 2026
c17e4fd
WIP: Translate canonical <-> concrete in dataset_attributions router
claude-spd1 Feb 10, 2026
ef0e1e3
WIP: Harvest and attributions store canonical keys natively
claude-spd1 Feb 10, 2026
5f468bb
Revert "WIP: Harvest and attributions store canonical keys natively"
claude-spd1 Feb 10, 2026
aae7073
WIP: Rewrite graphLayout.ts for canonical addresses, no ModelInfo dep…
claude-spd1 Feb 10, 2026
803f37f
WIP: Remove ModelInfo from frontend — canonical addresses are self-de…
claude-spd1 Feb 10, 2026
94b5dd2
WIP: Fix remaining type errors — add get_unembed_weight, remove Model…
claude-spd1 Feb 10, 2026
ed2bd4f
WIP: Add missing POST /api/prompts/custom endpoint
claude-spd1 Feb 10, 2026
7897544
WIP: App works with s-275c8f21 — resilience to missing harvest data, …
claude-spd1 Feb 11, 2026
fe423bd
Cleanup: DRY topology.py, replace try/except with None-checks in comp…
claude-spd1 Feb 11, 2026
ca63700
Add [perf] logging throughout graph computation codepath
claude-spd1 Feb 11, 2026
85b2366
Polish: display escaping, strip padding at source, reduce prefetch pa…
claude-spd1 Feb 11, 2026
56e7f06
Rename canonical embedding address from "wte" to "embed"
claude-spd1 Feb 11, 2026
12a196f
Autointerp: shared RateLimiter, app cleanup
claude-spd1 Feb 11, 2026
76baa29
Strip padding sentinels in ActivationExample.__post_init__
claude-spd1 Feb 11, 2026
8ed3ba5
Rename padding strip to _strip_legacy_padding for clarity
claude-spd1 Feb 11, 2026
af15a6a
Introduce Repository pattern: HarvestRepo, InterpRepo, AttributionRepo
claude-spd1 Feb 11, 2026
615d39c
Migrate harvest component data from JSONL to SQLite
claude-spd1 Feb 11, 2026
d6669a6
Refactor autointerp LLM API: fix rate limiter, unify client interface
claude-spd1 Feb 11, 2026
ff8db55
Migrate autointerp persistence from JSONL to SQLite
claude-spd1 Feb 11, 2026
b8ddb9d
Fix migration scripts for old data formats
claude-spd1 Feb 11, 2026
d73b0e8
Delete harvest/loaders.py — all consumers use repos
claude-spd1 Feb 11, 2026
bf5ec9e
Store raw logits in DB instead of pre-computed output probs
claude-spd1 Feb 11, 2026
c34672c
Fix type errors and lint from repo migration + logits refactor
claude-spd1 Feb 11, 2026
54639c6
wip: Delete unused model adapter tests
claude-spd1 Feb 11, 2026
d92f795
Make spd-postprocess idempotent: skip steps with existing outputs
claude-spd1 Feb 11, 2026
043274c
Simplify postprocess: always run all steps, data accumulates
claude-spd1 Feb 11, 2026
9f099bf
Add sub-run versioning to harvest and dataset attributions
claude-spd1 Feb 11, 2026
9d26374
Move intruder scores from InterpDB to HarvestDB
claude-spd1 Feb 11, 2026
1743c9e
Clean up adapter→topology naming, remove plan/migration files
claude-spd1 Feb 11, 2026
96702c1
Fix test failures: escape_for_display in tokenizer test, target_model…
claude-spd1 Feb 11, 2026
8effcd0
Fix bugs: translate concrete→canonical keys in intruder scores and co…
claude-spd1 Feb 11, 2026
65452e6
Fix topology.py docstring: list all 4 sublayer types, not just "ffn"
claude-spd1 Feb 11, 2026
2cc485d
Split canonical address regex into separate patterns with XOR assertion
claude-spd1 Feb 11, 2026
a961ab7
Split topology into package, narrow public API to TransformerTopology…
claude-spd1 Feb 11, 2026
8bdfee1
Underscore-prefix all internal topology types, remove section comments
claude-spd1 Feb 11, 2026
4c646dd
Rewrite _parse_block_path with strict regex matching
claude-spd1 Feb 11, 2026
d50161b
Extract path schemas into spd/topology/path_schemas.py
claude-spd1 Feb 11, 2026
447b488
Pass harvest sub-run ID to intruder eval, don't rely on "latest"
claude-spd1 Feb 11, 2026
2d4325d
Thread harvest sub-run ID through entire pipeline, no "latest" lookups
claude-spd1 Feb 11, 2026
995b1ab
Pin all pipeline scripts to explicit harvest sub-run ID
claude-spd1 Feb 11, 2026
5de9a83
Open HarvestDB in read-only mode for HarvestRepo
claude-spd1 Feb 11, 2026
e935ee5
Open HarvestDB with immutable=1 for read-only access
claude-spd1 Feb 11, 2026
16c7fde
Persist postprocess dispatch manifest
claude-spd1 Feb 11, 2026
6ab37bf
Move watch script to spd/postprocess/scripts, fix plot script
claude-spd1 Feb 11, 2026
c7e9e6c
Add log scrapers for job progress in postprocess watcher
claude-spd1 Feb 11, 2026
4221236
Move postprocess watcher out of repo to ~/pp-watch
claude-spd1 Feb 11, 2026
042e7a7
Merge remote-tracking branch 'origin/dev' into feature/autointerp-imp…
claude-spd1 Feb 11, 2026
eef543b
Fix harvest OOM on large models, tune defaults
claude-spd1 Feb 11, 2026
f25b83e
Bump harvest merge memory 128G→200G
claude-spd1 Feb 11, 2026
02dfaf1
Merge harvest worker states on GPU to avoid OOM
claude-spd1 Feb 11, 2026
5987aeb
Fix harvest OOM on large models, tune defaults
claude-spd1 Feb 11, 2026
2121d7f
Merge remote-tracking branch 'origin/dev' into fix/harvest-merge-oom
claude-spd1 Feb 11, 2026
4fa5fc5
Fix harvest OOM on large models, tune defaults
claude-spd1 Feb 11, 2026
9baf37b
Split harvest entrypoints, make intruder eval optional
claude-spd1 Feb 11, 2026
ca3cb59
Remove legacy --merge flag from harvest run.py
claude-spd1 Feb 11, 2026
2c8d37f
Delete harvest scripts/run.py, replaced by run_worker.py and run_merg…
claude-spd1 Feb 11, 2026
9d9d36f
Move intruder_eval toggle to HarvestSlurmConfig
claude-spd1 Feb 11, 2026
3aae94e
Update MERGE_OOM.md with resolution
claude-spd1 Feb 11, 2026
7720d26
Simplify autointerp reasoning config, add reasoning to evals, bump ha…
claude-spd1 Feb 11, 2026
efd17cc
fix app db access
claude-spd1 Feb 11, 2026
6289617
Fix harvest merge OOM: stream components instead of accumulating in m…
claude-spd1 Feb 12, 2026
cfa4616
wip: Remove unused LLMClientConfig dataclass and inline parameters
claude-spd1 Feb 12, 2026
d43c92a
Merge branch 'dev' into fix/harvest-merge-oom
claude-spd1 Feb 12, 2026
baa65cb
Replace Python list reservoir with tensor reservoir in Harvester
claude-spd1 Feb 12, 2026
1f9d5a8
wip: Update harvest merge logging for consistency
claude-spd1 Feb 12, 2026
7671995
Collapse HarvesterState into Harvester, delete reservoir_sampler.py
claude-spd1 Feb 12, 2026
8e649fa
Move free functions into Harvester methods, add 30 unit tests
claude-spd1 Feb 12, 2026
2adff6d
Extract ActivationExamplesReservoir into reservoir.py
claude-spd1 Feb 12, 2026
d12e407
Rename harvester fields for clarity
claude-spd1 Feb 12, 2026
134eb70
Extract extract_firing_windows, rename fields for clarity
claude-spd1 Feb 12, 2026
9464355
Use __init__ in Harvester.load instead of __new__
claude-spd1 Feb 12, 2026
53d13a3
Remove default device args — require explicit device everywhere
claude-spd1 Feb 12, 2026
c046526
Replace unsqueeze/expand with einops rearrange/repeat
claude-spd1 Feb 12, 2026
e250a6d
Add readonly mode to InterpDB, use in app read paths
claude-spd1 Feb 12, 2026
8741472
Add Data Sources tab, per-subrun interp.db, eager repo construction
claude-spd1 Feb 12, 2026
c5fd73e
Move intruder scores to harvest in Data Sources, improve harvest jaxt…
claude-spd1 Feb 12, 2026
36d5cf9
Add dataset attributions to Data Sources, consolidate attribution loa…
claude-spd1 Feb 12, 2026
537f554
Remove stale MERGE_OOM.md
claude-spd1 Feb 12, 2026
05b4d03
Clean up component_data router, delete autointerp loaders
claude-spd1 Feb 12, 2026
cb11f5a
Remove __new__ from reservoir, add .create()/.to(), delete autointerp…
claude-spd1 Feb 12, 2026
559816f
Replace hand-rolled rate limiter with aiolimiter token bucket + globa…
claude-spd1 Feb 12, 2026
f6d7948
Uncap rate limiter, let global backoff find equilibrium
claude-spd1 Feb 12, 2026
7ece220
Clean up postprocess system: configs, throttling, validation, signatures
claude-spd1 Feb 12, 2026
f1adbbe
Use openrouter Effort type, default reasoning_effort to low
claude-spd1 Feb 12, 2026
81855c4
Move intruder eval from autointerp to harvest
claude-spd1 Feb 12, 2026
a353a98
Handle Fire parsing JSON args to dicts in eval CLIs
claude-spd1 Feb 12, 2026
594c975
Default intruder eval to off (too expensive for routine use)
claude-spd1 Feb 12, 2026
6898bcb
Restore rate limit to 200/min
claude-spd1 Feb 12, 2026
f8006a0
Make rate limit configurable via max_requests_per_minute
claude-spd1 Feb 12, 2026
af62a98
Bump scorer max_tokens to 5000 to avoid truncation with reasoning
claude-spd1 Feb 12, 2026
5b34803
Default eval reasoning_effort to none (simple classification, no thin…
claude-spd1 Feb 12, 2026
802b551
Pass reasoning_effort through to fuzzing scorer
claude-spd1 Feb 12, 2026
553c22e
Add center-on-peak token alignment and log mean CI plot to activation…
claude-spd1 Feb 12, 2026
7291097
App UI improvements: log Y toggle, global center-on-peak, simplified nav
claude-spd1 Feb 12, 2026
f16c14c
Add wandb link to nav, TODO.md, and registry entry
claude-spd1 Feb 12, 2026
8c8de47
wip: Update canonical runs registry with correct model names
claude-spd1 Feb 12, 2026
e84fd76
Decouple intruder eval from harvest, make it a top-level postprocess …
claude-spd1 Feb 12, 2026
a26e46b
wip: Reorder postprocess config fields to match dependency graph
claude-spd1 Feb 12, 2026
069eb14
wip: Add dry-run flag to postprocess CLI
claude-spd1 Feb 12, 2026
fdb23a3
Add pretrain model info to app and paginate token stats
claude-spd1 Feb 13, 2026
1e6dbf2
Remove duplicate model config from data sources tab
claude-spd1 Feb 13, 2026
4aef8b8
Fix PersistentPGDState source shape for non-sequence (MSE) inputs (#389)
danbraunai-goodfire Feb 13, 2026
1a31713
Merge branch 'main' into dev
danbraunai-goodfire Feb 15, 2026
c0ba235
Merge branch 'main' into dev
danbraunai-goodfire Feb 15, 2026
b433c3c
Fix old call of call_on_rank0_then_broadcast
danbraunai-goodfire Feb 16, 2026
e8c569a
Fix autointerp hanging/JSON errors, DRY concurrent LLM pipeline (#384)
ocg-goodfire Feb 16, 2026
2b72ed6
Merge branch 'main' into dev
danbraunai-goodfire Feb 16, 2026
cef0ef2
Add back the log viewer to the app
danbraunai-goodfire Feb 16, 2026
ac2f4d8
Search for tokens server-side
danbraunai-goodfire Feb 16, 2026
52bfbfa
Use canvas to avoid linear overhead w.r.t number of edges
danbraunai-goodfire Feb 16, 2026
332c2f1
Lazily load the component data. Huge speedups.
danbraunai-goodfire Feb 16, 2026
fbf593c
Increase GLOBAL_EDGE_LIMIMT to 50k and sort on the backend
danbraunai-goodfire Feb 16, 2026
f94ceec
Add adversarial PGD loss to app CI optimization (#383)
danbraunai-goodfire Feb 16, 2026
753f063
Make edge rendering faster and increase limit to 50k
danbraunai-goodfire Feb 16, 2026
642858c
Fix PersistentPGDReconSubsetLoss routing to all layers instead of sub…
danbraunai-goodfire Feb 17, 2026
76cad7c
Fix broken attribution connection between layers
danbraunai-goodfire Feb 17, 2026
a033398
Merge branch 'main' into dev
danbraunai-goodfire Feb 17, 2026
3859e55
Use AVG instead of SUM for persistent PGD source gradient all-reduce …
danbraunai-goodfire Feb 17, 2026
c7d50e2
Support n_warmup_steps in persistent PGD (#397)
danbraunai-goodfire Feb 18, 2026
b3bdeed
Merge branch 'main' into dev
danbraunai-goodfire Feb 18, 2026
cf66851
Merge branch 'main' into dev
danbraunai-goodfire Feb 18, 2026
0b3fd29
Remove deprecated gradient_accumulation_steps from config files
danbraunai-goodfire Feb 18, 2026
5a8b598
Generalize harvest data model + adapters over decomposition methods (…
ocg-goodfire Feb 18, 2026
6dd7162
Simplify autointerp LLM API + generalize evals (#400)
ocg-goodfire Feb 18, 2026
78d3577
Add dual-view autointerp strategy (#401)
ocg-goodfire Feb 18, 2026
d52071c
Add init_spd_checkpoint field back to Config
ocg-goodfire Feb 18, 2026
162ea12
Fix attributions dependency on harvest in postprocess pipeline
ocg-goodfire Feb 18, 2026
49d0b37
Fix fire.Fire JSON parsing in all SLURM worker scripts
ocg-goodfire Feb 18, 2026
1794635
Fix fire.Fire JSON parsing in all SLURM worker scripts
ocg-goodfire Feb 18, 2026
b0bdda4
Rename subrun_id to manifest_id in postprocess manifest
ocg-goodfire Feb 18, 2026
476951a
Move component model to device in SPDHarvestFn
ocg-goodfire Feb 19, 2026
62a2372
Add lr_schedule to PPGD
danbraunai-goodfire Feb 19, 2026
8a2c7a1
Autointerp and harvest fixes from editing branch (#406)
ocg-goodfire Feb 19, 2026
afb6843
Request 1 GPU for autointerp/eval/intruder SLURM jobs
ocg-goodfire Feb 19, 2026
1be2bfc
Fix YAML configs to use current schema and fix misleading error messa…
danbraunai-goodfire Feb 19, 2026
c7b8e73
Add CI and PGD variants of hidden acts recon loss
danbraunai-goodfire Feb 19, 2026
0e829a3
Fix SQLite issues on NFS: remove WAL, separate read/write connections
ocg-goodfire Feb 20, 2026
d39c91d
Fix attributions SLURM passing full config instead of inner config
ocg-goodfire Feb 20, 2026
8ef8faa
Cleanup sum
danbraunai-goodfire Feb 23, 2026
6d576a0
NOT-REVIEWED: Use existing sources for hidden act loss
danbraunai-goodfire Feb 23, 2026
3e3d101
Merge branch 'main' into dev
danbraunai-goodfire Feb 23, 2026
26fc5ee
Fix typo: PGDHiddenActsReconLoss → PPGDHiddenActsReconLoss in 4L config
danbraunai-goodfire Feb 24, 2026
6cf7b03
PPGDEvalLosses to encompass hidden acts and output recon
danbraunai-goodfire Feb 24, 2026
cbbc47d
Inline _accumulate_into_state()
danbraunai-goodfire Feb 24, 2026
15644a4
Run PPGD evals by default if exists in loss metrics
danbraunai-goodfire Feb 24, 2026
5bf47bd
Go back to config-based eval structure
danbraunai-goodfire Feb 24, 2026
a0dde78
Use PersistentPGDReconSubsetEval and PersistentPGDReconEval
danbraunai-goodfire Feb 24, 2026
77a3689
Init different ppgd sources in different ranks for PerBatchPerPosition
danbraunai-goodfire Feb 25, 2026
d067a3a
Update config
danbraunai-goodfire Feb 25, 2026
ece9d3d
Remove unused registry entry
danbraunai-goodfire Feb 25, 2026
0001c6c
Merge dev into feature/hidden-acts-recon-variants
danbraunai-goodfire Feb 25, 2026
00de5ec
Update tests
danbraunai-goodfire Feb 25, 2026
e4b2d28
Add CI and PGD variants of hidden acts recon loss (#409)
danbraunai-goodfire Feb 25, 2026
214772e
Merge remote-tracking branch 'origin/feature/hidden-acts-recon-varian…
danbraunai-goodfire Feb 25, 2026
a8919d4
Update configs to use new pretrain t-9d2b8f02
danbraunai-goodfire Feb 25, 2026
5784de1
Add dataset_seed option to LMTaskConfig (#416)
Antovigo Feb 25, 2026
ac5dc77
Add StochasticAttentionPatternsReconLoss metric (#402)
Antovigo Feb 25, 2026
3da1b4e
Merge branch 'main' into dev
danbraunai-goodfire Feb 26, 2026
9ebb467
Reduce memory usage in clustering
danbraunai-goodfire Feb 27, 2026
c5fc8da
Make clustering happen on cuda
danbraunai-goodfire Feb 27, 2026
8716e32
Update config
danbraunai-goodfire Feb 27, 2026
58a36b1
Update cluster mapping docs
danbraunai-goodfire Feb 27, 2026
75d5f06
Update cluster_mapping to use cluster run instead of ensemble
danbraunai-goodfire Feb 27, 2026
2ab728c
Merge branch 'main' into dev
danbraunai-goodfire Feb 27, 2026
4 changes: 4 additions & 0 deletions .gitignore
@@ -1,6 +1,10 @@
spd/scripts/sweep_params.yaml
docs/coverage/**
notebooks/**
scratch/

# Script outputs (generated files, often large)
scripts/outputs/

**/out/
neuronpedia_outputs/
45 changes: 40 additions & 5 deletions CLAUDE.md
@@ -104,6 +104,10 @@ This repository implements methods from two key research papers on parameter dec
- `spd/metrics.py` - Metrics for logging to WandB (e.g. CI-L0, KL divergence, etc.)
- `spd/figures.py` - Figures for logging to WandB (e.g. CI histograms, Identity plots, etc.)

**Terminology: Sources vs Masks:**
- **Sources** (`adv_sources`, `PPGDSources`, `self.sources`): The raw values that PGD optimizes adversarially. These are interpolated with CI to produce component masks: `mask = ci + (1 - ci) * source`. Used in both regular PGD (`spd/metrics/pgd_utils.py`) and persistent PGD (`spd/persistent_pgd.py`).
- **Masks** (`component_masks`, `RoutingMasks`, `make_mask_infos`, `n_mask_samples`): The materialized per-component masks used during forward passes. These are produced from sources (in PGD) or from stochastic sampling, and are a general SPD concept across the whole codebase.
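
The interpolation formula above can be illustrated with a minimal sketch. It uses plain Python lists rather than tensors, and the function name `interpolate_mask` is hypothetical (it is not taken from `spd/persistent_pgd.py`). It also assumes sources lie in `[0, 1]`, in which case each mask entry falls between `ci` and `1`:

```python
def interpolate_mask(ci: list[float], source: list[float]) -> list[float]:
    """Elementwise mask = ci + (1 - ci) * source (hypothetical helper)."""
    assert len(ci) == len(source), "ci and source must have the same length"
    return [c + (1.0 - c) * s for c, s in zip(ci, source)]


ci = [0.0, 0.5, 1.0]

# With source = 0 the mask collapses to ci itself.
lowest = interpolate_mask(ci, [0.0, 0.0, 0.0])

# With source = 1 every component is fully unmasked, whatever its ci.
highest = interpolate_mask(ci, [1.0, 1.0, 1.0])
```

Under this assumption, a component with `ci = 1` is always fully active, so the adversary can only change masks for components whose CI is below 1.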

**Experiment Structure:**

Each experiment (`spd/experiments/{tms,resid_mlp,lm}/`) contains:
@@ -161,6 +165,7 @@ Each experiment (`spd/experiments/{tms,resid_mlp,lm}/`) contains:
│ ├── clustering/ # Component clustering (see clustering/CLAUDE.md)
│ ├── dataset_attributions/ # Dataset attributions (see dataset_attributions/CLAUDE.md)
│ ├── harvest/ # Statistics collection (see harvest/CLAUDE.md)
│ ├── postprocess/ # Unified postprocessing pipeline (harvest + attributions + autointerp)
│ ├── pretrain/ # Target model pretraining (see pretrain/CLAUDE.md)
│ ├── experiments/ # Experiment implementations
│ │ ├── tms/ # Toy Model of Superposition
@@ -193,8 +198,9 @@ Each experiment (`spd/experiments/{tms,resid_mlp,lm}/`) contains:
| `spd-run` | `spd/scripts/run.py` | SLURM-based experiment runner |
| `spd-local` | `spd/scripts/run_local.py` | Local experiment runner |
| `spd-harvest` | `spd/harvest/scripts/run_slurm_cli.py` | Submit harvest SLURM job |
-| `spd-autointerp` | `spd/autointerp/scripts/cli.py` | Submit autointerp SLURM job |
+| `spd-autointerp` | `spd/autointerp/scripts/run_slurm_cli.py` | Submit autointerp SLURM job |
| `spd-attributions` | `spd/dataset_attributions/scripts/run_slurm_cli.py` | Submit dataset attribution SLURM job |
| `spd-postprocess` | `spd/postprocess/cli.py` | Unified postprocessing pipeline (harvest + attributions + interpret + evals) |
| `spd-clustering` | `spd/clustering/scripts/run_pipeline.py` | Clustering pipeline |
| `spd-pretrain` | `spd/pretrain/scripts/run_slurm_cli.py` | Pretrain target models |

@@ -225,7 +231,7 @@ Use `spd/` as the search root (not repo root) to avoid noise.
- `spd-harvest` → `spd/harvest/scripts/run_slurm_cli.py` → `spd/utils/slurm.py` → SLURM array → `spd/harvest/scripts/run.py` → `spd/harvest/harvest.py`

**Autointerp Pipeline:**
-- `spd-autointerp` → `spd/autointerp/scripts/cli.py` → `spd/utils/slurm.py` → `spd/autointerp/interpret.py`
+- `spd-autointerp` → `spd/autointerp/scripts/run_slurm_cli.py` → `spd/utils/slurm.py` → `spd/autointerp/interpret.py`

**Dataset Attributions Pipeline:**
- `spd-attributions` → `spd/dataset_attributions/scripts/run_slurm_cli.py` → `spd/utils/slurm.py` → SLURM array → `spd/dataset_attributions/harvest.py`
@@ -279,6 +285,33 @@ spd-autointerp <wandb_path> # Submit SLURM job to interpret component

Requires `OPENROUTER_API_KEY` env var. See `spd/autointerp/CLAUDE.md` for details.

### Unified Postprocessing (`spd-postprocess`)

Run all postprocessing steps for a completed SPD run with a single command:

```bash
spd-postprocess <wandb_path> # Run everything with default config
spd-postprocess <wandb_path> --config custom_config.yaml # Use custom config
```

Defaults are defined in `PostprocessConfig` (`spd/postprocess/config.py`). Pass a custom YAML/JSON config to override. Set any section to `null` to skip it:
- `attributions: null` — skip dataset attributions
- `autointerp: null` — skip autointerp entirely (interpret + evals)
- `autointerp.evals: null` — skip evals but still run interpret
- `intruder: null` — skip intruder eval

SLURM dependency graph:

```
harvest (GPU array → merge)
├── intruder eval (CPU, depends on harvest merge, label-free)
└── autointerp (depends on harvest merge)
├── interpret (CPU, LLM calls)
│ ├── detection (CPU, depends on interpret)
│ └── fuzzing (CPU, depends on interpret)
attributions (GPU array → merge, parallel with harvest)
```
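
As a rough illustration of the `null`-to-skip convention and the dependency graph above, here is a small sketch. It is not the code in `spd/postprocess/`: the function `planned_jobs`, the plain-dict config, and the job labels are all hypothetical, and it assumes harvest always runs because the list above gives no way to disable it.

```python
def planned_jobs(config: dict) -> list[str]:
    """Return the jobs that would be submitted for a parsed config (sketch only)."""
    jobs = ["harvest"]  # assumption: harvest is never skipped

    if config.get("intruder") is not None:
        jobs.append("intruder")  # depends on harvest merge

    autointerp = config.get("autointerp")
    if autointerp is not None:
        jobs.append("interpret")  # depends on harvest merge
        if autointerp.get("evals") is not None:
            # Both evals depend on interpret.
            jobs += ["detection", "fuzzing"]

    if config.get("attributions") is not None:
        jobs.append("attributions")  # runs in parallel with harvest

    return jobs


# Equivalent of a YAML file with `intruder: null` and `autointerp.evals: null`.
example = {"intruder": None, "autointerp": {"evals": None}, "attributions": {}}
jobs = planned_jobs(example)
```

The sketch treats a missing key the same as `null`; the actual `PostprocessConfig` defines defaults, so a missing section there presumably falls back to its default rather than being skipped.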

### Running on SLURM Cluster (`spd-run`)

For the core team, `spd-run` provides full-featured SLURM orchestration:
@@ -389,7 +422,7 @@ Downloaded runs are cached in `SPD_OUT_DIR/runs/<project>-<run_id>/`.

Core principles:
- **Fail fast** - assert assumptions, crash on violations, don't silently recover
-- **No backwards compat** - delete unused code, don't deprecate or add migration shims
+- **No legacy support** - delete unused code, don't add fallbacks for old formats or migration shims
- **Narrow types** - avoid `| None` unless null is semantically meaningful; use discriminated unions over bags of optional fields
- **No try/except for control flow** - check preconditions explicitly, then trust them
- **YAGNI** - don't add abstractions, config options, or flexibility for hypothetical futures
@@ -427,6 +460,7 @@ value = config.key
- Do not write: `if everythingIsOk: continueHappyPath()`. Instead do `assert everythingIsOk`
- You should have a VERY good reason to handle an error gracefully. If your program isn't working like it should then it shouldn't be running, you should be fixing it.
- Do not write `try-catch` blocks unless it definitely makes sense
- **Write for the golden path.** Never let edge cases bloat the code. Before handling them, just raise an exception. If an edge case becomes annoying enough, we'll handle it then — but write first and foremost for the common case.
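
A minimal sketch of the assert-first style these bullets describe. The function `mean_activation` is a hypothetical example and does not appear in the repository:

```python
def mean_activation(values: list[float]) -> float:
    # Golden path only: an empty input is a bug upstream, so crash loudly
    # instead of returning a sentinel such as 0.0 or None.
    assert len(values) > 0, "expected at least one activation"
    return sum(values) / len(values)


result = mean_activation([1.0, 3.0])
```

The alternative these guidelines argue against would wrap the body in `if values:` and return a fallback value, which hides the upstream bug from the caller.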

### Control Flow
- Keep I/O as high up as possible. Make as many functions as possible pure.
@@ -444,8 +478,9 @@ value = config.key
- good: {<id>: <val>}
- bad: {"tokens": …, "loss": …}
- Default args are rarely a good idea. Avoid them unless necessary. You should have a very good reason for having a default value for an argument, especially if it's caller also defaults to the same thing
- This repo uses basedpyright (not mypy)
- This repo uses basedpyright (not mypy)
- Keep defaults high in the call stack.
- Don't use `from __future__ import annotations` — use string quotes for forward references instead.
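
The forward-reference rule can be shown with a short sketch. The `Interval` class is hypothetical and used only for illustration. Quoting the class name lets a method refer to its own class in annotations without the `__future__` import:

```python
class Interval:
    def __init__(self, lo: float, hi: float) -> None:
        assert lo <= hi, "interval bounds must be ordered"
        self.lo = lo
        self.hi = hi

    # "Interval" is quoted because the name is not yet bound while the
    # class body is being executed.
    def merge(self, other: "Interval") -> "Interval":
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))


merged = Interval(0.0, 2.0).merge(Interval(1.0, 5.0))
```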

### Tensor Operations
- Try to use einops by default for clarity.
@@ -476,7 +511,7 @@ value = config.key


### Other Important Software Development Practices
-- Backwards compatibility that adds complexity should be avoided.
+- Don't add legacy fallbacks or migration code - just change it and let old data be manually migrated if needed.
- Delete unused code.
- If an argument is always x, strongly consider removing as an argument and just inlining
- **Update CLAUDE.md files** when changing code structure, adding/removing files, or modifying key interfaces. Update the CLAUDE.md in the same directory (or nearest parent) as the changed files.
4 changes: 3 additions & 1 deletion Makefile
@@ -6,8 +6,10 @@ install: copy-templates
.PHONY: install-dev
install-dev: copy-templates
uv sync
-pre-commit install
+uv run pre-commit install

.PHONY: install-all
install-all: install-dev install-app

# special install for CI (GitHub Actions) that reduces disk usage and install time
# 1. create a fresh venv with `--clear` -- this is mostly only for local testing of the CI install