From 3a7c4be0531390a019f8911fa8c3f6789a386ea9 Mon Sep 17 00:00:00 2001
From: Alex
Date: Tue, 28 Apr 2026 11:57:17 -0700
Subject: [PATCH 01/11] Update parameter golf leaderboard with BOS fix

Co-authored-by: Codex
---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 1445065db7..85ba40653c 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,13 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
+| BOS-Fixed SmearGate + LQER Asymmetric + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: BOS fix by aquariouseworkman on dexhunter's SmearGate + LQER stack from PR #1797, with CaseOps/SP8192 lineage and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
+| GatedAttn + Alpha-Scaled LoRA + Warm-Start A + WD=1.0 | 1.0708 | renqianluo | On PR #1784: gated attention plus alpha-scaled LoRA TTT, warm-start A, and WD=1.0 on the post-#1667 lineage | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1784) |
+| SmearGate + Attention Output Gate + Legal TTT | 1.0714 | MarioPaerle | On PR #1667: SmearGate, attention output gate, depth recurrence, parallel residuals, QK-Gain 5.25, quantization, and score-first TTT | 2026-04-16 | [info](https://github.com/openai/parameter-golf/pull/1667) |
+| VarLen Attention + Fused MLP + Multi-Phase Global SGD TTT | 1.0719 | dexhunter | On PR #1626: VarLen attention, fused MLP, multi-phase global SGD TTT, trimmed GPTQ, MLR 0.026, int7 embeddings, and adaptive clip | 2026-04-14 | [info](https://github.com/openai/parameter-golf/pull/1626) |
+| VarLenAttn + PhasingTTT | 1.0728 | romeerp | On PR #1610: #1530-style VarLen/fused stack plus phased TTT over already-scored validation chunks | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1610) |
+| VarLen Attention + Fused MLP + Doc-Independent Legal TTT | 1.0734 | samacqua | On PR #1530: variable-length FA3 attention, fused Triton MLP, grouped small-parameter all-reduces, and doc-independent score-first LoRA TTT | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1530) |
+| Asymmetric Two-Lane Parallel Routing + Tap-In V6 + Legal TTT | 1.0739 | Abay Bektursun | On PR #1518: two-lane parallel residual routing, Tap-In V6 prefix n-gram eval nudge, and score-first legal TTT | 2026-04-12 | [info](https://github.com/openai/parameter-golf/pull/1518) |
 | SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493: 3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) |
 | SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) |
 | SP8192 + QK-Gain 5 + Legal Score-First TTT | 1.0828 | dexhunter | On PR #1413: QK-Gain 5.0 + legal score-first TTT on the PR #1394 SP8192 stack | 2026-04-06 | [info](records/track_10min_16mb/2026-04-06_SP8192_QK5_LegalTTT_1.0828/README.md) |
From 0d046475873a1cfce98e445d299c9114337363fa Mon Sep 17 00:00:00 2001
From: Alex
Date: Tue, 28 Apr 2026 12:30:42 -0700
Subject: [PATCH 02/11] Credit PR 1797 in leaderboard update

Co-authored-by: Codex
---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 85ba40653c..0655760850 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,8 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
-| BOS-Fixed SmearGate + LQER Asymmetric + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: BOS fix by aquariouseworkman on dexhunter's SmearGate + LQER stack from PR #1797, with CaseOps/SP8192 lineage and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
+| BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
+| PR1787 Base + BOS-Masked SmearGate + LQER Asymmetric + Phased TTT | 1.0641 | dexhunter | On PR #1797: dexhunter's original SmearGate + LQER asymmetric rank-4 stack on the PR #1787 base, rebanked with the BOS mask applied in both forward paths (commit 17f50d9); credits PR #1787 SparseAttnGate/PolarNS/FusedCE, CaseOps lineage from PR #1729/#1736, and phased score-first TTT lineage | 2026-04-24 | [info](https://github.com/openai/parameter-golf/pull/1797) |
 | GatedAttn + Alpha-Scaled LoRA + Warm-Start A + WD=1.0 | 1.0708 | renqianluo | On PR #1784: gated attention plus alpha-scaled LoRA TTT, warm-start A, and WD=1.0 on the post-#1667 lineage | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1784) |
 | SmearGate + Attention Output Gate + Legal TTT | 1.0714 | MarioPaerle | On PR #1667: SmearGate, attention output gate, depth recurrence, parallel residuals, QK-Gain 5.25, quantization, and score-first TTT | 2026-04-16 | [info](https://github.com/openai/parameter-golf/pull/1667) |
 | VarLen Attention + Fused MLP + Multi-Phase Global SGD TTT | 1.0719 | dexhunter | On PR #1626: VarLen attention, fused MLP, multi-phase global SGD TTT, trimmed GPTQ, MLR 0.026, int7 embeddings, and adaptive clip | 2026-04-14 | [info](https://github.com/openai/parameter-golf/pull/1626) |
From ce84ddcc3fd9ae362f37fe6c3992326c359c5245 Mon Sep 17 00:00:00 2001
From: Alex
Date: Tue, 28 Apr 2026 12:32:02 -0700
Subject: [PATCH 03/11] Credit CaseOps and PR 1787 leaderboard rows

Co-authored-by: Codex
---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 0655760850..50beeb61d6 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,10 @@ Happy training!
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
 | BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
+| PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |
 | PR1787 Base + BOS-Masked SmearGate + LQER Asymmetric + Phased TTT | 1.0641 | dexhunter | On PR #1797: dexhunter's original SmearGate + LQER asymmetric rank-4 stack on the PR #1787 base, rebanked with the BOS mask applied in both forward paths (commit 17f50d9); credits PR #1787 SparseAttnGate/PolarNS/FusedCE, CaseOps lineage from PR #1729/#1736, and phased score-first TTT lineage | 2026-04-24 | [info](https://github.com/openai/parameter-golf/pull/1797) |
+| SP8192 + CaseOps + GatedAttn + QuantGate + Loop45 + Phased TTT | 1.0655 | dexhunter | On PR #1736: adopts romeerp's lossless CaseOps transform from PR #1729 with byte-sidecar BPB accounting, then adds gated attention and quant-gate scaling on the PR #1530 SP8192 phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1736) |
+| CaseOps Tokenizer + Tapered WD + Phased TTT | 1.0678 | romeerp | On PR #1729: lossless CaseOps bijective case transform with validation byte sidecars, plus mild late Muon weight-decay taper on the PR #1626 legal phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1729) |
 | GatedAttn + Alpha-Scaled LoRA + Warm-Start A + WD=1.0 | 1.0708 | renqianluo | On PR #1784: gated attention plus alpha-scaled LoRA TTT, warm-start A, and WD=1.0 on the post-#1667 lineage | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1784) |
 | SmearGate + Attention Output Gate + Legal TTT | 1.0714 | MarioPaerle | On PR #1667: SmearGate, attention output gate, depth recurrence, parallel residuals, QK-Gain 5.25, quantization, and score-first TTT | 2026-04-16 | [info](https://github.com/openai/parameter-golf/pull/1667) |
 | VarLen Attention + Fused MLP + Multi-Phase Global SGD TTT | 1.0719 | dexhunter | On PR #1626: VarLen attention, fused MLP, multi-phase global SGD TTT, trimmed GPTQ, MLR 0.026, int7 embeddings, and adaptive clip | 2026-04-14 | [info](https://github.com/openai/parameter-golf/pull/1626) |
From 13e76acdcd35ae79b032e66c9765ec6849e2f474 Mon Sep 17 00:00:00 2001
From: Alex
Date: Tue, 28 Apr 2026 13:27:54 -0700
Subject: [PATCH 04/11] Apply p-value progression leaderboard cutoff

Co-authored-by: Codex
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 50beeb61d6..6634a32fb0 100644
--- a/README.md
+++ b/README.md
@@ -32,15 +32,15 @@ Happy training!
 |-----|------:|--------|---------|------|------|
 | BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
 | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |
-| PR1787 Base + BOS-Masked SmearGate + LQER Asymmetric + Phased TTT | 1.0641 | dexhunter | On PR #1797: dexhunter's original SmearGate + LQER asymmetric rank-4 stack on the PR #1787 base, rebanked with the BOS mask applied in both forward paths (commit 17f50d9); credits PR #1787 SparseAttnGate/PolarNS/FusedCE, CaseOps lineage from PR #1729/#1736, and phased score-first TTT lineage | 2026-04-24 | [info](https://github.com/openai/parameter-golf/pull/1797) |
+| CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier under p<0.25 | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |
 | SP8192 + CaseOps + GatedAttn + QuantGate + Loop45 + Phased TTT | 1.0655 | dexhunter | On PR #1736: adopts romeerp's lossless CaseOps transform from PR #1729 with byte-sidecar BPB accounting, then adds gated attention and quant-gate scaling on the PR #1530 SP8192 phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1736) |
 | CaseOps Tokenizer + Tapered WD + Phased TTT | 1.0678 | romeerp | On PR #1729: lossless CaseOps bijective case transform with validation byte sidecars, plus mild late Muon weight-decay taper on the PR #1626 legal phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1729) |
-| GatedAttn + Alpha-Scaled LoRA + Warm-Start A + WD=1.0 | 1.0708 | renqianluo | On PR #1784: gated attention plus alpha-scaled LoRA TTT, warm-start A, and WD=1.0 on the post-#1667 lineage | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1784) |
 | SmearGate + Attention Output Gate + Legal TTT | 1.0714 | MarioPaerle | On PR #1667: SmearGate, attention output gate, depth recurrence, parallel residuals, QK-Gain 5.25, quantization, and score-first TTT | 2026-04-16 | [info](https://github.com/openai/parameter-golf/pull/1667) |
 | VarLen Attention + Fused MLP + Multi-Phase Global SGD TTT | 1.0719 | dexhunter | On PR #1626: VarLen attention, fused MLP, multi-phase global SGD TTT, trimmed GPTQ, MLR 0.026, int7 embeddings, and adaptive clip | 2026-04-14 | [info](https://github.com/openai/parameter-golf/pull/1626) |
 | VarLenAttn + PhasingTTT | 1.0728 | romeerp | On PR #1610: #1530-style VarLen/fused stack plus phased TTT over already-scored validation chunks | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1610) |
 | VarLen Attention + Fused MLP + Doc-Independent Legal TTT | 1.0734 | samacqua | On PR #1530: variable-length FA3 attention, fused Triton MLP, grouped small-parameter all-reduces, and doc-independent score-first LoRA TTT | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1530) |
 | Asymmetric Two-Lane Parallel Routing + Tap-In V6 + Legal TTT | 1.0739 | Abay Bektursun | On PR #1518: two-lane parallel residual routing, Tap-In V6 prefix n-gram eval nudge, and score-first legal TTT | 2026-04-12 | [info](https://github.com/openai/parameter-golf/pull/1518) |
+| SP8192 + Muon 0.97 + Legal Score-First TTT | 1.0798 | dexhunter | On PR #1514: SP8192 with Muon 0.97 and legal score-first TTT; under the current p<0.25 rule, its 3-seed sweep beats #1493 (p≈0.020) | 2026-04-09 | [info](https://github.com/openai/parameter-golf/pull/1514) |
 | SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493: 3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) |
 | SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) |
 | SP8192 + QK-Gain 5 + Legal Score-First TTT | 1.0828 | dexhunter | On PR #1413: QK-Gain 5.0 + legal score-first TTT on the PR #1394 SP8192 stack | 2026-04-06 | [info](records/track_10min_16mb/2026-04-06_SP8192_QK5_LegalTTT_1.0828/README.md) |
From dbc31de88a753fbb64e0cb84b93572a4f7a6e73f Mon Sep 17 00:00:00 2001
From: Alex
Date: Tue, 28 Apr 2026 13:32:55 -0700
Subject: [PATCH 05/11] Address leaderboard review comments

Co-authored-by: Codex
---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 6634a32fb0..832dfc36fb 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,7 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
+| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence including independent reproduction passes p<0.25 vs #1851/#1868 | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
 | BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
 | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |
 | CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier under p<0.25 | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |
@@ -40,6 +41,8 @@ Happy training!
 | VarLenAttn + PhasingTTT | 1.0728 | romeerp | On PR #1610: #1530-style VarLen/fused stack plus phased TTT over already-scored validation chunks | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1610) |
 | VarLen Attention + Fused MLP + Doc-Independent Legal TTT | 1.0734 | samacqua | On PR #1530: variable-length FA3 attention, fused Triton MLP, grouped small-parameter all-reduces, and doc-independent score-first LoRA TTT | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1530) |
 | Asymmetric Two-Lane Parallel Routing + Tap-In V6 + Legal TTT | 1.0739 | Abay Bektursun | On PR #1518: two-lane parallel residual routing, Tap-In V6 prefix n-gram eval nudge, and score-first legal TTT | 2026-04-12 | [info](https://github.com/openai/parameter-golf/pull/1518) |
+| Improved Parallel Residuals + Systems Optimization | 1.0752 | codemath3000 | On PR #1584: systems-only speedup of PR #1529's dual-lane parallel-residual stack with fused Muon, batched EMA, and loader prealloc; identical ML, so the statistical-significance requirement is waived | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1584) |
+| Improved Parallel Residuals + CUTLASS EVT | 1.0758 | msisovic | On PR #1529: dual-lane parallel residual routing on the PR #1523/#1518-era stack with inlined CUTLASS EVT path and legal score-first TTT; included by score-at-opening chronology before #1518's later score update | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1529) |
 | SP8192 + Muon 0.97 + Legal Score-First TTT | 1.0798 | dexhunter | On PR #1514: SP8192 with Muon 0.97 and legal score-first TTT; under the current p<0.25 rule, its 3-seed sweep beats #1493 (p≈0.020) | 2026-04-09 | [info](https://github.com/openai/parameter-golf/pull/1514) |
 | SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493: 3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) |
 | SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) |
From 4f061405e310fd1536ae02ace4a66648740ce2e7 Mon Sep 17 00:00:00 2001
From: Alex
Date: Wed, 29 Apr 2026 11:20:45 -0700
Subject: [PATCH 06/11] Clarify BOS fix leaderboard evidence

Co-authored-by: Codex
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 832dfc36fb..d5f1d0ec4b 100644
--- a/README.md
+++ b/README.md
@@ -30,8 +30,8 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
-| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence including independent reproduction passes p<0.25 vs #1851/#1868 | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
-| BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0615 | aquariouseworkman | On PR #1851 with 3-seed support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
+| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence including independent reproduction passes p<0.25 vs PR #1868's latest compliance rerun | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
+| BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0614 | aquariouseworkman | On PR #1851 with 3-seed compliance-rerun support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
 | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |
 | CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier under p<0.25 | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |
 | SP8192 + CaseOps + GatedAttn + QuantGate + Loop45 + Phased TTT | 1.0655 | dexhunter | On PR #1736: adopts romeerp's lossless CaseOps transform from PR #1729 with byte-sidecar BPB accounting, then adds gated attention and quant-gate scaling on the PR #1530 SP8192 phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1736) |
From f10108dec3971db2d9bebc1db6b8a2d00cfbbe5b Mon Sep 17 00:00:00 2001
From: Alex
Date: Wed, 29 Apr 2026 11:25:47 -0700
Subject: [PATCH 07/11] Shorten leaderboard p-value notes

Co-authored-by: Codex
---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index d5f1d0ec4b..066d19873c 100644
--- a/README.md
+++ b/README.md
@@ -30,10 +30,10 @@ Happy training!
 
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
-| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence including independent reproduction passes p<0.25 vs PR #1868's latest compliance rerun | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
+| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence incl. independent reproduction (p=0.188 vs PR #1868 latest rerun) | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
 | BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0614 | aquariouseworkman | On PR #1851 with 3-seed compliance-rerun support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
 | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |
-| CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier under p<0.25 | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |
+| CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier (p=0.063 vs #1736) | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |
 | SP8192 + CaseOps + GatedAttn + QuantGate + Loop45 + Phased TTT | 1.0655 | dexhunter | On PR #1736: adopts romeerp's lossless CaseOps transform from PR #1729 with byte-sidecar BPB accounting, then adds gated attention and quant-gate scaling on the PR #1530 SP8192 phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1736) |
 | CaseOps Tokenizer + Tapered WD + Phased TTT | 1.0678 | romeerp | On PR #1729: lossless CaseOps bijective case transform with validation byte sidecars, plus mild late Muon weight-decay taper on the PR #1626 legal phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1729) |
 | SmearGate + Attention Output Gate + Legal TTT | 1.0714 | MarioPaerle | On PR #1667: SmearGate, attention output gate, depth recurrence, parallel residuals, QK-Gain 5.25, quantization, and score-first TTT | 2026-04-16 | [info](https://github.com/openai/parameter-golf/pull/1667) |
@@ -43,7 +43,7 @@ Happy training!
 | Asymmetric Two-Lane Parallel Routing + Tap-In V6 + Legal TTT | 1.0739 | Abay Bektursun | On PR #1518: two-lane parallel residual routing, Tap-In V6 prefix n-gram eval nudge, and score-first legal TTT | 2026-04-12 | [info](https://github.com/openai/parameter-golf/pull/1518) |
 | Improved Parallel Residuals + Systems Optimization | 1.0752 | codemath3000 | On PR #1584: systems-only speedup of PR #1529's dual-lane parallel-residual stack with fused Muon, batched EMA, and loader prealloc; identical ML, so the statistical-significance requirement is waived | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1584) |
 | Improved Parallel Residuals + CUTLASS EVT | 1.0758 | msisovic | On PR #1529: dual-lane parallel residual routing on the PR #1523/#1518-era stack with inlined CUTLASS EVT path and legal score-first TTT; included by score-at-opening chronology before #1518's later score update | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1529) |
-| SP8192 + Muon 0.97 + Legal Score-First TTT | 1.0798 | dexhunter | On PR #1514: SP8192 with Muon 0.97 and legal score-first TTT; under the current p<0.25 rule, its 3-seed sweep beats #1493 (p≈0.020) | 2026-04-09 | [info](https://github.com/openai/parameter-golf/pull/1514) |
+| SP8192 + Muon 0.97 + Legal Score-First TTT | 1.0798 | dexhunter | On PR #1514: SP8192 with Muon 0.97 and legal score-first TTT; 3-seed sweep beats #1493 (p=0.020) | 2026-04-09 | [info](https://github.com/openai/parameter-golf/pull/1514) |
 | SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493: 3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) |
 | SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) |
 | SP8192 + QK-Gain 5 + Legal Score-First TTT | 1.0828 | dexhunter | On PR #1413: QK-Gain 5.0 + legal score-first TTT on the PR #1394 SP8192 stack | 2026-04-06 | [info](records/track_10min_16mb/2026-04-06_SP8192_QK5_LegalTTT_1.0828/README.md) |
From c0a88198c510df46c6120660048cbb61dc080e01 Mon Sep 17 00:00:00 2001
From: Alex
Date: Wed, 29 Apr 2026 11:29:49 -0700
Subject: [PATCH 08/11] Remove non-frontier leaderboard rows

Co-authored-by: Codex
---
 README.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/README.md b/README.md
index 066d19873c..d9f3d976f7 100644
--- a/README.md
+++ b/README.md
@@ -41,8 +41,6 @@ Happy training!
 | VarLenAttn + PhasingTTT | 1.0728 | romeerp | On PR #1610: #1530-style VarLen/fused stack plus phased TTT over already-scored validation chunks | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1610) |
 | VarLen Attention + Fused MLP + Doc-Independent Legal TTT | 1.0734 | samacqua | On PR #1530: variable-length FA3 attention, fused Triton MLP, grouped small-parameter all-reduces, and doc-independent score-first LoRA TTT | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1530) |
 | Asymmetric Two-Lane Parallel Routing + Tap-In V6 + Legal TTT | 1.0739 | Abay Bektursun | On PR #1518: two-lane parallel residual routing, Tap-In V6 prefix n-gram eval nudge, and score-first legal TTT | 2026-04-12 | [info](https://github.com/openai/parameter-golf/pull/1518) |
-| Improved Parallel Residuals + Systems Optimization | 1.0752 | codemath3000 | On PR #1584: systems-only speedup of PR #1529's dual-lane parallel-residual stack with fused Muon, batched EMA, and loader prealloc; identical ML, so the statistical-significance requirement is waived | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1584) |
-| Improved Parallel Residuals + CUTLASS EVT | 1.0758 | msisovic | On PR #1529: dual-lane parallel residual routing on the PR #1523/#1518-era stack with inlined CUTLASS EVT path and legal score-first TTT; included by score-at-opening chronology before #1518's later score update | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1529) |
 | SP8192 + Muon 0.97 + Legal Score-First TTT | 1.0798 | dexhunter | On PR #1514: SP8192 with Muon 0.97 and legal score-first TTT; 3-seed sweep beats #1493 (p=0.020) | 2026-04-09 | [info](https://github.com/openai/parameter-golf/pull/1514) |
 | SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493: 3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) |
 | SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) |
From 464db9522f80a886e486f301eed8e5bdc9d77ce9 Mon Sep 17 00:00:00 2001
From: Alex
Date: Wed, 29 Apr 2026 11:39:11 -0700
Subject: [PATCH 09/11] Clarify SmearGate BOS fix attribution

Co-authored-by: Codex
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d9f3d976f7..c4e3d6554a 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,7 @@ Happy training!
 | Run | Score | Author | Summary | Date | Info |
 |-----|------:|--------|---------|------|------|
 | BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence incl. independent reproduction (p=0.188 vs PR #1868 latest rerun) | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) |
-| BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0614 | aquariouseworkman | On PR #1851 with 3-seed compliance-rerun support from PR #1868: follow-on run of dexhunter's BOS-masked SmearGate + LQER stack from PR #1797, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
+| BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0614 | aquariouseworkman | On PR #1851 with 3-seed compliance-rerun support from PR #1868: BOS-boundary fix from PR #1851 applied to dexhunter's PR #1797 SmearGate + LQER stack, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) |
 | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) |
 | CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier (p=0.063 vs #1736) | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |
 | SP8192 + CaseOps + GatedAttn + QuantGate + Loop45 + Phased TTT | 1.0655 | dexhunter | On PR #1736: adopts romeerp's lossless CaseOps transform from PR #1729 with byte-sidecar BPB accounting, then adds gated attention and quant-gate scaling on the PR #1530 SP8192 phased-TTT stack | 2026-04-19 | [info](https://github.com/openai/parameter-golf/pull/1736) |
From 69a89976e9a670a979571f156c934eca14d9a4d6 Mon Sep 17 00:00:00 2001
From: Alex
Date: Wed, 29 Apr 2026 11:44:16 -0700
Subject: [PATCH 10/11] Exclude #1518 from chronological frontier

Co-authored-by: Codex
---
 README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/README.md b/README.md
index c4e3d6554a..be74c9ca58 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,6 @@ Happy training!
 | VarLen Attention + Fused MLP + Multi-Phase Global SGD TTT | 1.0719 | dexhunter | On PR #1626: VarLen attention, fused MLP, multi-phase global SGD TTT, trimmed GPTQ, MLR 0.026, int7 embeddings, and adaptive clip | 2026-04-14 | [info](https://github.com/openai/parameter-golf/pull/1626) |
 | VarLenAttn + PhasingTTT | 1.0728 | romeerp | On PR #1610: #1530-style VarLen/fused stack plus phased TTT over already-scored validation chunks | 2026-04-13 | [info](https://github.com/openai/parameter-golf/pull/1610) |
 | VarLen Attention + Fused MLP + Doc-Independent Legal TTT | 1.0734 | samacqua | On PR #1530: variable-length FA3 attention, fused Triton MLP, grouped small-parameter all-reduces, and doc-independent score-first LoRA TTT | 2026-04-11 | [info](https://github.com/openai/parameter-golf/pull/1530) |
-| Asymmetric Two-Lane Parallel Routing + Tap-In V6 + Legal TTT | 1.0739 | Abay Bektursun | On PR #1518: two-lane parallel residual routing, Tap-In V6 prefix n-gram eval nudge, and score-first legal TTT | 2026-04-12 | [info](https://github.com/openai/parameter-golf/pull/1518) |
 | SP8192 + Muon 0.97 + Legal Score-First TTT | 1.0798 | dexhunter | On PR #1514: SP8192 with Muon 0.97 and legal score-first TTT; 3-seed sweep beats #1493 (p=0.020) | 2026-04-09 | [info](https://github.com/openai/parameter-golf/pull/1514) |
 | SP8192 + 3-Layer Recurrence + Parallel Residuals + Legal TTT | 1.0810 | bigbag | On PR #1493:
3-layer recurrence, parallel residuals, QK-Gain 5.25, and legal score-first TTT on the PR #1394 stack | 2026-04-09 | [info](records/track_10min_16mb/2026-04-09_SP8192_3LayerRecur_ParResid_QK525_LegalTTT/README.md) | | SP8192 + Parallel Residuals + Score-First TTT | 1.0822 | aryanbhosale | On PR #1477: parallel residuals on the PR #1413 SP8192 + legal score-first TTT stack | 2026-04-08 | [info](records/track_10min_16mb/2026-04-08_SP8192_ParallelResid_ScoreFirstTTT/README.md) | From 66a076a2e3ffbb4940c66712e784ff3f5d8c88e6 Mon Sep 17 00:00:00 2001 From: Alex Date: Wed, 29 Apr 2026 11:48:19 -0700 Subject: [PATCH 11/11] Use submitted #1855 score Co-authored-by: Codex --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index be74c9ca58..dbf4d011b3 100644 --- a/README.md +++ b/README.md @@ -30,7 +30,7 @@ Happy training! | Run | Score | Author | Summary | Date | Info | |-----|------:|--------|---------|------|------| -| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0608 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; 6-sample evidence incl. 
independent reproduction (p=0.188 vs PR #1868 latest rerun) | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) | +| BOS-Fixed SmearGate + LQER + SparseAttnGate + 9-Hparam Stack | 1.0611 | codemath3000 | On PR #1855: BOS-fixed #1797-derived stack with LQER, PR #1787 SparseAttnGate/PolarNS/FusedCE base, per-group lrzip compression, and 9 greedy hyperparameter overrides; submitted 3-seed mean 1.06108 with broader reproduction support (p=0.188 vs PR #1868 latest rerun) | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1855), [repro](https://github.com/openai/parameter-golf/pull/1855#issuecomment-4336629746) | | BOS-Fixed SmearGate + LQER Asymmetric + PR1787 SparseAttn + Phased TTT | 1.0614 | aquariouseworkman | On PR #1851 with 3-seed compliance-rerun support from PR #1868: BOS-boundary fix from PR #1851 applied to dexhunter's PR #1797 SmearGate + LQER stack, using the PR #1787 SparseAttnGate/PolarNS/FusedCE base plus CaseOps and phased score-first TTT | 2026-04-27 | [info](https://github.com/openai/parameter-golf/pull/1851), [3-seed](https://github.com/openai/parameter-golf/pull/1868) | | PR1736 + PolarNS + MIN_LR + SparseAttnGate + FusedCE + Warm-A TTT | 1.0634 | nprime06 | On PR #1787: PR #1736 CaseOps stack plus Polar Express Newton-Schulz coefficients, MIN_LR=0.1, SparseAttnGate, fused softcapped CE, and PR #1767-style warm-start-A TTT | 2026-04-23 | [info](https://github.com/openai/parameter-golf/pull/1787) | | CaseOps + MLPClip12 + SmearGate/LoRA-TTT | 1.0645 | dexhunter | On PR #1769: CaseOps stack with SmearGate/LoRA-TTT refinements and MLPClip12; 5-seed mean improves the accepted CaseOps frontier (p=0.063 vs #1736) | 2026-04-22 | [info](https://github.com/openai/parameter-golf/pull/1769) |