openai · Christopher-Lee-McClendon · Apr 29, 2026 · Apr 30, 2026
diff --git a/records/track_10min_16mb/2026-04-29_PerGroupLRZIP_ComplianceFix_1.06003/README.md b/records/track_10min_16mb/2026-04-29_PerGroupLRZIP_ComplianceFix_1.06003/README.md
@@ -0,0 +1,122 @@
+# Compliance Audit: PR #1934 Recipe with GPTQ_RESERVE_SECONDS=5.5 — val_bpb 1.06003 (3-seed mean)
+
+## Summary
+
+Compliant 3-seed reproduction of PR #1934's recipe (`COMPRESSOR=pergroup`, `EMBED_WD=0.06`, tightened clip sigmas) with the timing compliance fix: **`GPTQ_RESERVE_SECONDS=5.5`** (vs #1934's `0.5`).
+
+This ensures GPTQ hessian collection completes **within the 600s training budget** under the train-loop + hessian interpretation (actual: 598.0s). PR #1934's original run with `GPTQ_RESERVE_SECONDS=0.5` has hessians finishing at ~603s, which may exceed the budget depending on interpretation.
+
+## Results
+
+| Seed | Post-TTT val_bpb | Artifact Bytes | Steps | Train Loop | Hessians | Total (train+hessians) |
+|------|-----------------|----------------|-------|------------|----------|------------------------|
+| 42   | **1.05987**     | 15,971,933     | 4962  | 594.6s     | 3.5s     | 598.1s ✓              |
+| 314  | **1.05975**     | 15,970,997     | 4952  | 594.6s     | 3.5s     | 598.1s ✓              |
+| 999  | **1.06047**     | 15,974,305     | 4954  | 594.7s     | 3.5s     | 598.2s ✓              |
+
+**3-seed mean: 1.06003** (std: 0.000385)
+
+## Comparison to PR #1934
+
+| Metric | PR #1934 | This Run | Delta |
+|--------|----------|----------|-------|
+| Mean val_bpb | 1.05993 | 1.06003 | +0.00010 |
+| GPTQ_RESERVE_SECONDS | 0.5 | 5.5 | +5.0 |
+| Training loop stops at | 599.5s | 594.5s | -5.0s |
+| Hessians finish at | ~603.0s | ~598.0s | -5.0s |
+| Within 600s budget? | ❌ | ✅ | — |
+| Steps achieved | 4974–4984 | 4952–4962 | -22 |
+
+The BPB difference of +0.00010 shows no material difference in this 3-seed sample (well within 1σ = 0.000385), confirming that the compliance fix does not meaningfully degrade performance.
+
+## Log Annotation Caveat
+
+The logs contain: `artifact_production_wallclock: 727.520s ... must be < 600.0`. This annotation is a **display bug** — `artifact_production_wallclock` includes post-budget compression time. The correct budget-controlled metric is `training_loop + hessians = 598.2s < 600s`. The "must be < 600.0" label was erroneously applied to the wrong metric.
+
+## Compliance Statement
+
+**Interpretation**: Training loop + GPTQ hessian collection must complete within 600s. GPTQ quantization and compression are part of serialization ("saving to flash drive"), not training. This is consistent with how all existing record-track submissions handle timing.
+
+1. **Training budget**: Training loop + GPTQ hessian collection = **598.2s max** (< 600s) ✓
+2. **Artifact size**: 15,974,305 bytes max (< 16,000,000) ✓
+3. **Eval time (TTT only)**: 547.1s max (< 600s) ✓
+4. **No telemetry**: 10-second pre-training sleep, no network contact from start of training through eval completion ✓
+5. **Self-contained**: No external downloads during eval ✓
+
+Note: Diagnostic evaluations (pre-quant val_bpb, quantized val_bpb) run outside both the training and scored-eval budgets. They are informational only and not required for the submission.
+
+### Timing Breakdown (typical seed)
+
+```
+Training loop (gradient steps): 594.6s  ← budget-controlled
+GPTQ hessian collection:         3.5s  ← within 600s budget (cumulative: 598.1s)
+─── 600s training budget boundary ───
+GPTQ quantization:              10.0s  ← post-training serialize
+Per-group lrzip compression:   118.3s  ← post-training serialize  
+Total serialize:               132.9s
+─── end of artifact production ───
+Diagnostic eval (pre-quant):     7.4s  ← not counted
+Phased TTT eval:               480.3s  ← separate eval budget
+```
+
+## Architecture
+
+- 11 layers, 512 dims, 8 attention heads / 4 KV heads (GQA)
+- U-Net skip connections
+- Parallel residuals (start layer 8)
+- Partial RoPE (16 dims, base 10000)
+- Depth recurrence (loop layers 3–5, NUM_LOOPS=2)
+- 4× MLP expansion
+- SmearGate (window 12) + sparse attention gate
+- CaseOps bijective case transform (SP8192)
+- LQER asymmetric INT2/INT4 rank-4 correction (top-3 tensors, group 64)
+- GPTQ INT6 + INT7 embeddings
+- Per-group lrzip + brotli compression (`COMPRESSOR=pergroup`)
+- Phased TTT (3 phases, score-first, prefix 2000 docs, warm-start A)
+- Muon optimizer (Polar-Express Newton-Schulz) + Adam for scalars
+
+## Key Differences from PR #1934
+
+1. **`GPTQ_RESERVE_SECONDS=5.5`** (vs 0.5): Ensures hessian collection completes within 600s
+2. **Serialize-before-diagnostic**: Model artifact is saved before computing diagnostic val_bpb (prevents timing ambiguity)
+3. **Fixed `artifact_production_wallclock`**: Reports actual train_loop + serialize time (not the broken metric that includes model build)
+
+## Reproduction
+
+```bash
+# On 8×H100 SXM pod with matotezitanka/proteus-pytorch:community
+apt-get install -y lrzip
+MATCHED_FINEWEB_REPO_ID=kevclark/parameter-golf python3 data/cached_challenge_fineweb.py --variant sp8192
+sleep 10  # pre-training settle
+SEED=42 CASEOPS_ENABLED=1 PHASED_TTT_PREFIX_DOCS=2000 PHASED_TTT_NUM_PHASES=3 \
+  MATRIX_CLIP_SIGMAS=12.85 ATTN_CLIP_SIGMAS=12.0 MLP_CLIP_SIGMAS=12.0 \
+  EMBED_BITS=7 EMBED_CLIP_SIGMAS=12.0 MATRIX_LR=0.026 MIN_LR=0.1 \
+  FUSED_CE_ENABLED=1 SPARSE_ATTN_GATE_ENABLED=1 \
+  SMEAR_GATE_ENABLED=1 GATE_WINDOW=12 \
+  LQER_ENABLED=1 LQER_RANK=4 LQER_TOP_K=3 LQER_FACTOR_BITS=4 \
+  LQER_ASYM_ENABLED=1 LQER_ASYM_GROUP=64 \
+  TTT_WARM_START_A=1 GPTQ_RESERVE_SECONDS=5.5 GPTQ_CALIBRATION_BATCHES=16 \
+  EMBED_WD=0.06 COMPRESSOR=pergroup NCCL_NET=Socket \
+  torchrun --standalone --nproc_per_node=8 train_gpt.py
+```
+
+## Credits
+
+- **PR #1934** @liujshi — Recipe: pergroup lrzip + embed_wd=0.06 + tightened clip sigmas
+- **PR #1855** @liujshi — Per-group lrzip compression pipeline
+- **PR #1787** @nprime06 — 11L base architecture + LQER + SmearGate + depth recurrence
+- **PR #1797** @dexhunter — SmearGate + LQER integration
+- **PR #1729** @romeerp — CaseOps SP8192
+- **PR #1394** @clarkkev — GPTQ + SP8192
+- **PR #549** @abaybektursun — Score-first TTT
+
+## Requirements
+
+```
+torch>=2.9
+triton
+sentencepiece
+huggingface_hub
+datasets
+lrzip (system package)
+```
diff --git a/records/track_10min_16mb/2026-04-29_PerGroupLRZIP_ComplianceFix_1.06003/requirements.txt b/records/track_10min_16mb/2026-04-29_PerGroupLRZIP_ComplianceFix_1.06003/requirements.txt
@@ -0,0 +1,7 @@
+torch>=2.9
+triton
+sentencepiece
+huggingface_hub
+datasets
+tqdm
+numpy