Changes from all commits (65 commits)
9143920
Add agent swarm for parallel behavior investigation
claude Jan 30, 2026
498d459
Stream Claude Code output to file in real-time
claude Jan 30, 2026
efe5928
Use stream-json output format and add max_turns limit
claude Jan 30, 2026
ef5b0fd
Fix stream-json output requiring --verbose flag
claude Jan 30, 2026
f40f02e
Add GPU lock to prevent concurrent GPU operations
claude-spd1 Jan 30, 2026
567fb19
Add research_log.md for human-readable agent progress
claude-spd1 Jan 30, 2026
4c4a843
Add full timestamps to research log examples
claude-spd1 Jan 30, 2026
dcb28f4
Merge remote-tracking branch 'origin/dev' into claude/slurm-agent-swa…
claude-spd1 Jan 31, 2026
cb6e6f0
wip: Integrate agent swarm with MCP for Claude Code tool access
claude-spd1 Jan 31, 2026
06cf2e8
Fix MCP JSON-RPC response format violating spec
claude-spd1 Jan 31, 2026
39b5acb
wip: Refactor agent swarm MCP configuration to require all swarm sett…
claude-spd1 Jan 31, 2026
ae88d53
Fix agent swarm hanging at ~80% optimization
claude-spd1 Feb 1, 2026
58129b0
Simplify agent swarm env vars: 4 → 2
claude-spd1 Feb 1, 2026
b47733f
wip: Add graph artifacts to investigation research logs
claude-spd1 Feb 2, 2026
6f957a0
Merge branch 'dev' into claude/slurm-agent-swarm-lIpTu
claude-spd1 Feb 13, 2026
13cf49a
Refactor agent_swarm → investigate: single-agent, researcher-directed
claude-spd1 Feb 13, 2026
1b30e81
Fix investigation wandb_path matching
claude-spd1 Feb 13, 2026
22f9971
UI improvements: run picker arch labels, artifact graph layout, inves…
claude-spd1 Feb 13, 2026
2625f34
Fix MCP canonical/concrete key translation
claude-spd1 Feb 13, 2026
474e2f3
Move app DB from repo-local .data/ to SPD_OUT_DIR/app/
claude-spd1 Feb 13, 2026
2e6dff1
Sandbox investigation agent to MCP-only, revert DB path move
claude-spd1 Feb 13, 2026
26ff2a7
Isolate investigation agent from global Claude Code config
claude-spd1 Feb 13, 2026
edc7c58
Add topological interpretation module
ocg-goodfire Feb 19, 2026
a41cfef
Remove output-label dependency from cofiring neighbors
ocg-goodfire Feb 19, 2026
54d5a7b
Rename neighbor → related component terminology
ocg-goodfire Feb 19, 2026
c7991a9
Skip components missing labels in unification pass
ocg-goodfire Feb 20, 2026
cb18c86
Use in-memory accumulator for scan state, DB is write-only
ocg-goodfire Feb 20, 2026
b5d3a02
Request 1 GPU for autointerp/eval/intruder SLURM jobs
ocg-goodfire Feb 19, 2026
09c8bd8
Fix YAML configs to use current schema and fix misleading error messa…
danbraunai-goodfire Feb 19, 2026
b95f6bf
Fix SQLite issues on NFS: remove WAL, separate read/write connections
ocg-goodfire Feb 20, 2026
64556ba
Fix attributions SLURM passing full config instead of inner config
ocg-goodfire Feb 20, 2026
9e7d3ef
add worktrees to ignore
ocg-goodfire Feb 23, 2026
e8cd454
Rewrite dataset attribution storage: dict-of-dicts, canonical names, …
ocg-goodfire Feb 23, 2026
a116ddd
Fix alive_targets iteration: use torch.where for indices, not bool to…
ocg-goodfire Feb 23, 2026
5f98d81
Fix KeyError for embed source: CI dict doesn't include embedding layer
ocg-goodfire Feb 23, 2026
01633c5
Fix scatter_add OOB: use embedding num_embeddings instead of tokenize…
ocg-goodfire Feb 23, 2026
9118c1e
Split run.py into run_worker.py and run_merge.py
ocg-goodfire Feb 23, 2026
d0166d0
Correct attr_abs via backprop through |target|, reorganise method sig…
ocg-goodfire Feb 23, 2026
fd42030
Add merge_mem config (default 200G) to prevent merge OOM
ocg-goodfire Feb 23, 2026
223afd4
Add 3-metric selection to dataset attributions in app
ocg-goodfire Feb 23, 2026
4fc7cf1
Allow bare s-prefixed run IDs everywhere (e.g. "s-17805b61")
ocg-goodfire Feb 23, 2026
6a5d0f6
Fix AttributionRepo.open skipping valid subruns due to old-format dirs
ocg-goodfire Feb 23, 2026
08b17c9
Fix 3s lag on attribution metric toggle: O(V) linear scan per pill
ocg-goodfire Feb 23, 2026
b0df7c0
Ship token strings from backend instead of resolving vocab IDs in fro…
ocg-goodfire Feb 23, 2026
2385a82
Hide negative attribution column for non-signed metrics
ocg-goodfire Feb 23, 2026
747991a
Narrow frontend types: SignedAttributions vs UnsignedAttributions
ocg-goodfire Feb 23, 2026
e36e187
Update dataset_attributions CLAUDE.md for new storage format and 3 me…
ocg-goodfire Feb 23, 2026
206dcf0
Integrate new dataset attributions storage, lazy harvest loading, emb…
ocg-goodfire Feb 23, 2026
42fca11
Separate output/input context in prompts, reduce examples, remove err…
ocg-goodfire Feb 23, 2026
c397a2c
Add activation examples to unification prompt
ocg-goodfire Feb 23, 2026
f12aedf
Clean up prompts: human-readable keys, normalized attributions, filte…
ocg-goodfire Feb 23, 2026
98a65ae
Tweak component display, tighten error threshold to 5%
ocg-goodfire Feb 23, 2026
17a25ba
wip.
ocg-goodfire Feb 24, 2026
4781853
wip: Refactor dataset attribution harvester to track abs attributions
ocg-goodfire Feb 24, 2026
a557576
Rewrite dataset attribution storage with explicit edge types
ocg-goodfire Feb 24, 2026
48d318c
Fix embed path not removed from unembed sources in harvester
ocg-goodfire Feb 24, 2026
b44115a
Rename topological_interp → graph_interp and integrate into SPD app
ocg-goodfire Feb 24, 2026
ef4ec4b
Store raw attribution sums, normalize at query time
ocg-goodfire Feb 24, 2026
7298cd7
Fix n_batches removal, detach tensors on save, handle output source q…
ocg-goodfire Feb 24, 2026
3ccc301
Add graph interp badge to components tab, prune model graph to 500 nodes
ocg-goodfire Feb 25, 2026
63c544e
Expand graph interp badge with detail, edges, token strings, and auto…
ocg-goodfire Feb 25, 2026
889a89e
Move graph interp detail fetch into useComponentData hooks
ocg-goodfire Feb 25, 2026
e183401
tiny tidy
ocg-goodfire Feb 25, 2026
5beaa66
wip: Add embed token count normalization for dataset attributions
ocg-goodfire Feb 26, 2026
52c275e
fold in the investigator work
ocg-goodfire Feb 26, 2026
12 changes: 12 additions & 0 deletions .claude/skills/gpudash.md
@@ -0,0 +1,12 @@
---
name: gpudash
description: Check GPU availability across the SLURM cluster
user_invocable: true
---

# gpudash

Run the `gpudash` command to show GPU availability across the cluster.

## Steps
1. Run `gpudash` and show the output to the user.
1 change: 1 addition & 0 deletions .claude/worktrees/bold-elm-8kpb
Submodule bold-elm-8kpb added at 356f8c
1 change: 1 addition & 0 deletions .claude/worktrees/bright-fox-a4i0
Submodule bright-fox-a4i0 added at 356f8c
1 change: 1 addition & 0 deletions .claude/worktrees/calm-owl-v4pj
Submodule calm-owl-v4pj added at dbe066
1 change: 1 addition & 0 deletions .claude/worktrees/cozy-frolicking-stream
Submodule cozy-frolicking-stream added at 356f8c
1 change: 1 addition & 0 deletions .claude/worktrees/stateless-dancing-blanket
Submodule stateless-dancing-blanket added at 356f8c
1 change: 1 addition & 0 deletions .claude/worktrees/swift-owl-yep9
Submodule swift-owl-yep9 added at 356f8c
1 change: 1 addition & 0 deletions .claude/worktrees/swift-ray-amfs
Submodule swift-ray-amfs added at 356f8c
1 change: 1 addition & 0 deletions .claude/worktrees/vectorized-wiggling-whisper
Submodule vectorized-wiggling-whisper added at cb18c8
1 change: 1 addition & 0 deletions .claude/worktrees/xenodochial-germain
Submodule xenodochial-germain added at 5c9f34
4 changes: 3 additions & 1 deletion .gitignore
@@ -177,4 +177,6 @@ cython_debug/
#.idea/

**/*.db
**/*.db*

.claude/worktrees
7 changes: 1 addition & 6 deletions .mcp.json
@@ -1,8 +1,3 @@
{
"mcpServers": {
"svelte-llm": {
"type": "http",
"url": "https://svelte-llm.stanislav.garden/mcp/mcp"
}
}
"mcpServers": {}
}
85 changes: 69 additions & 16 deletions CLAUDE.md

Large diffs are not rendered by default.

4 changes: 3 additions & 1 deletion pyproject.toml
@@ -56,7 +56,9 @@ spd-clustering = "spd.clustering.scripts.run_pipeline:cli"
spd-harvest = "spd.harvest.scripts.run_slurm_cli:cli"
spd-autointerp = "spd.autointerp.scripts.run_slurm_cli:cli"
spd-attributions = "spd.dataset_attributions.scripts.run_slurm_cli:cli"
spd-investigate = "spd.investigate.scripts.run_slurm_cli:cli"
spd-postprocess = "spd.postprocess.cli:cli"
spd-graph-interp = "spd.graph_interp.scripts.run_slurm_cli:cli"

[build-system]
requires = ["setuptools", "wheel"]
@@ -69,7 +71,7 @@ include = ["spd*"]
[tool.ruff]
line-length = 100
fix = true
extend-exclude = ["spd/app/frontend"]
extend-exclude = ["spd/app/frontend", ".circuits-ref"]

[tool.ruff.lint]
ignore = [
4 changes: 4 additions & 0 deletions spd/app/CLAUDE.md
@@ -15,6 +15,7 @@ This is a **rapidly iterated research tool**. Key implications:
- **Database is disposable**: Delete `.data/app/prompt_attr.db` if schema changes break things
- **Prefer simplicity**: Avoid over-engineering for hypothetical future needs
- **Fail loud and fast**: The users are a small team of highly technical people. Errors are good. We want to know immediately if something is wrong. No soft failing, assert, assert, assert
- **Token display**: Always ship token strings rendered server-side via `AppTokenizer`, never raw token IDs. For embed/output layers, `component_idx` is a token ID — resolve it to a display string in the backend response.

## Running the App

@@ -50,6 +51,9 @@ backend/
├── intervention.py # Selective component activation
├── correlations.py # Component correlations + token stats + interpretations
├── clusters.py # Component clustering
├── dataset_search.py # SimpleStories dataset search
├── agents.py # Various useful endpoints that AI agents should look at when helping
├── mcp.py # MCP (Model Context Protocol) endpoint for Claude Code
├── dataset_search.py # Dataset search (reads dataset from run config)
└── agents.py # Various useful endpoints that AI agents should look at when helping
```
22 changes: 20 additions & 2 deletions spd/app/backend/database.py
@@ -9,6 +9,7 @@
import hashlib
import io
import json
import os
import sqlite3
from pathlib import Path
from typing import Literal
@@ -24,7 +25,24 @@

# Persistent data directories
_APP_DATA_DIR = REPO_ROOT / ".data" / "app"
DEFAULT_DB_PATH = _APP_DATA_DIR / "prompt_attr.db"
_DEFAULT_DB_PATH = _APP_DATA_DIR / "prompt_attr.db"


def get_default_db_path() -> Path:
"""Get the default database path.

Checks env vars in order:
1. SPD_INVESTIGATION_DIR - investigation mode, db at dir/app.db
2. SPD_APP_DB_PATH - explicit override
3. Default: .data/app/prompt_attr.db
"""
investigation_dir = os.environ.get("SPD_INVESTIGATION_DIR")
if investigation_dir:
return Path(investigation_dir) / "app.db"
env_path = os.environ.get("SPD_APP_DB_PATH")
if env_path:
return Path(env_path)
return _DEFAULT_DB_PATH


class Run(BaseModel):
@@ -111,7 +129,7 @@ class PromptAttrDB:
"""

def __init__(self, db_path: Path | None = None, check_same_thread: bool = True):
self.db_path = db_path or DEFAULT_DB_PATH
self.db_path = db_path or get_default_db_path()
self._check_same_thread = check_same_thread
self._conn: sqlite3.Connection | None = None

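The precedence added in `get_default_db_path` (`SPD_INVESTIGATION_DIR` first, then `SPD_APP_DB_PATH`, then the repo-local default) can be sketched in isolation. The `resolve_db_path` helper below is a hypothetical stand-in that takes the environment as an argument so the precedence is easy to exercise; it is not part of the diff.

```python
from pathlib import Path


def resolve_db_path(env: dict[str, str], default: Path) -> Path:
    """Hypothetical mirror of get_default_db_path's env-var precedence."""
    investigation_dir = env.get("SPD_INVESTIGATION_DIR")
    if investigation_dir:
        # Investigation mode wins: the app DB lives inside the investigation dir.
        return Path(investigation_dir) / "app.db"
    env_path = env.get("SPD_APP_DB_PATH")
    if env_path:
        # Explicit per-user override.
        return Path(env_path)
    return default


default = Path(".data/app/prompt_attr.db")
print(resolve_db_path({}, default))                                # .data/app/prompt_attr.db
print(resolve_db_path({"SPD_APP_DB_PATH": "/tmp/x.db"}, default))  # /tmp/x.db
print(resolve_db_path(
    {"SPD_INVESTIGATION_DIR": "/inv", "SPD_APP_DB_PATH": "/tmp/x.db"},
    default,
))                                                                 # /inv/app.db
```

Passing the environment explicitly (rather than reading `os.environ` inline, as the real function does) keeps the precedence testable without mutating process state.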
6 changes: 6 additions & 0 deletions spd/app/backend/routers/__init__.py
@@ -7,8 +7,11 @@
from spd.app.backend.routers.data_sources import router as data_sources_router
from spd.app.backend.routers.dataset_attributions import router as dataset_attributions_router
from spd.app.backend.routers.dataset_search import router as dataset_search_router
from spd.app.backend.routers.graph_interp import router as graph_interp_router
from spd.app.backend.routers.graphs import router as graphs_router
from spd.app.backend.routers.intervention import router as intervention_router
from spd.app.backend.routers.investigations import router as investigations_router
from spd.app.backend.routers.mcp import router as mcp_router
from spd.app.backend.routers.pretrain_info import router as pretrain_info_router
from spd.app.backend.routers.prompts import router as prompts_router
from spd.app.backend.routers.runs import router as runs_router
@@ -20,9 +23,12 @@
"correlations_router",
"data_sources_router",
"dataset_attributions_router",
"graph_interp_router",
"dataset_search_router",
"graphs_router",
"intervention_router",
"investigations_router",
"mcp_router",
"pretrain_info_router",
"prompts_router",
"runs_router",
18 changes: 16 additions & 2 deletions spd/app/backend/routers/data_sources.py
@@ -28,15 +28,21 @@ class AutointerpInfo(BaseModel):

class AttributionsInfo(BaseModel):
subrun_id: str
n_batches_processed: int
n_tokens_processed: int
ci_threshold: float


class GraphInterpInfo(BaseModel):
subrun_id: str
config: dict[str, Any] | None
label_counts: dict[str, int]


class DataSourcesResponse(BaseModel):
harvest: HarvestInfo | None
autointerp: AutointerpInfo | None
attributions: AttributionsInfo | None
graph_interp: GraphInterpInfo | None


router = APIRouter(prefix="/api/data_sources", tags=["data_sources"])
@@ -70,13 +76,21 @@ def get_data_sources(loaded: DepLoadedRun) -> DataSourcesResponse:
storage = loaded.attributions.get_attributions()
attributions_info = AttributionsInfo(
subrun_id=loaded.attributions.subrun_id,
n_batches_processed=storage.n_batches_processed,
n_tokens_processed=storage.n_tokens_processed,
ci_threshold=storage.ci_threshold,
)

graph_interp_info: GraphInterpInfo | None = None
if loaded.graph_interp is not None:
graph_interp_info = GraphInterpInfo(
subrun_id=loaded.graph_interp.subrun_id,
config=loaded.graph_interp.get_config(),
label_counts=loaded.graph_interp.get_label_counts(),
)

return DataSourcesResponse(
harvest=harvest_info,
autointerp=autointerp_info,
attributions=attributions_info,
graph_interp=graph_interp_info,
)