Refactor: src/scripts/artifacts structure + parity-only pipeline #1
What changed
Project layout refactor
Moved code to src/ and scripts to scripts/; build outputs now go to artifacts/ (gitignored).
Renamed files for clarity:
tracr_transformer_pt.py → src/tracr_pt_model.py
my_majority_program.py → scripts/compile_export.py
load_and_visualize_with_torchlens.py → scripts/parity_check.py
Removed diagram/TorchLens files and other unused helpers.
Single-pass build/export
scripts/compile_export.py compiles the RASP program, saves reference activations, and exports Tracr params, all from a single compile, so the activations and params share one basis and cannot mismatch.
Deterministic parity check
scripts/parity_check.py loads the NPZ, infers dimensions, and mirrors the Tracr math (attention followed by MLP, causal masking, √d_head scaling). It auto-discovers the BOS/0/1/PAD embedding-row mapping once and saves it to artifacts/token_to_id.json.
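To make the math in the parity check concrete, here is a minimal NumPy sketch of single-head causal attention with logits scaled by 1/√d_head, plus a max-abs-diff comparison. It is not the code in scripts/parity_check.py: the function names, the single-head simplification, and the 1e-5 tolerance are assumptions for illustration, and residual connections, the MLP, and multi-head handling are left out.

```python
import numpy as np


def causal_attention(q, k, v):
    """Single-head causal attention; logits are scaled by 1/sqrt(d_head)."""
    seq_len, d_head = q.shape
    logits = (q @ k.T) / np.sqrt(d_head)
    # Causal mask: position i may only attend to positions j <= i.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    logits = np.where(mask, logits, -np.inf)
    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def parity_report(candidate, reference, atol=1e-5):
    """Return (outputs_match, max_abs_diff); atol is an assumed tolerance."""
    max_abs_diff = float(np.max(np.abs(candidate - reference)))
    return max_abs_diff <= atol, max_abs_diff
```

In a real check, `reference` would be the activations loaded from the NPZ and `candidate` the PyTorch output converted to a NumPy array.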
Tracr import handling
Scripts look for Tracr in external/Tracr/tracr or Tracr/tracr; alternatively, pip install the repo.
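The lookup described above could be sketched roughly as follows. This is an illustration, not the scripts' actual code: `find_vendored_tracr` and `ensure_tracr_on_path` are hypothetical names, and the sketch assumes the directory containing the `tracr` package (not the package itself) is what goes on `sys.path`.

```python
import sys
from pathlib import Path

# Checkout locations named in this PR, relative to the repo root.
CANDIDATE_DIRS = ("external/Tracr", "Tracr")


def find_vendored_tracr(repo_root):
    """Return the first checkout that contains a `tracr` package, or None."""
    for rel in CANDIDATE_DIRS:
        checkout = Path(repo_root) / rel
        if (checkout / "tracr").is_dir():
            return checkout
    return None


def ensure_tracr_on_path(repo_root):
    """Prepend the vendored checkout to sys.path if one is found."""
    checkout = find_vendored_tracr(repo_root)
    if checkout is not None and str(checkout) not in sys.path:
        sys.path.insert(0, str(checkout))
    return checkout
```

If neither directory exists, the function returns None and `import tracr` falls back to whatever is installed in the environment (for example via pip).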
(Optional) CI
Added a suggested GitHub Actions workflow (.github/workflows/parity.yml) that fails the build if parity breaks.
How to run
(in a virtualenv)
python -m pip install --upgrade pip
pip install numpy "jax[cpu]" dm-haiku
pip install --index-url https://download.pytorch.org/whl/cpu torch
Option A: vendor Tracr under external/Tracr, or
Option B: pip install it:
pip install git+https://github.com/google-deepmind/tracr.git
1) Compile RASP → Tracr, save activations, export params (into artifacts/)
python scripts/compile_export.py
2) Verify PyTorch parity (discovers BOS/0/1/PAD mapping and saves it)
python scripts/parity_check.py
Expected output:
Outputs match: True
Max abs diff: ~1e-6 (or 0)
Notes
artifacts/token_to_id.json is preserved once discovered; compile_export.py won’t overwrite it.
Re-run step (1) if you change the RASP program or compiler flags (e.g., MAX_SEQ_LEN, BOS/PAD, causal).
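The "preserved once discovered" behaviour for artifacts/token_to_id.json amounts to a write-if-absent pattern. A minimal sketch, assuming a hypothetical helper `save_mapping_once` and made-up token ids (the real mapping is whatever parity_check.py discovers):

```python
import json
from pathlib import Path


def save_mapping_once(path, mapping):
    """Write `mapping` as JSON only if `path` is absent; return what is on disk."""
    path = Path(path)
    if path.exists():
        # An earlier run already discovered a mapping: keep it untouched.
        return json.loads(path.read_text())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(mapping, indent=2))
    return mapping


# Usage (ids here are placeholders, not the discovered values):
# save_mapping_once("artifacts/token_to_id.json",
#                   {"BOS": 0, "0": 1, "1": 2, "PAD": 3})
```

To force rediscovery under this pattern, delete the JSON file before running the parity check again.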