Record: SP8192 + Polar Express NS + Multi-Phase Global TTT — val_bpb 1.0771 (3-seed mean) by aamodbhatt · Pull Request #1802 · openai/parameter-golf

aamodbhatt · 2026-04-24T02:56:08Z

Summary

val_bpb = 1.0771 (3-seed mean, std 0.0005) | ~15.99 MB | 8xH100 SXM

3-Seed Results

Seed	Steps	EMA BPB	Sliding BPB	MP-TTT BPB	Artifact
42	4,672	1.08634	1.08111	1.07700	15,992,539
314	4,672	1.08611	1.08067	1.07676	15,993,299
999	4,664	1.08695	1.08161	1.07763	15,990,992
Mean	4,669	1.08647	1.08113	1.07713	15,992,277

Merged SOTA (PR #1493): 1.0810 BPB. Delta: -0.0039 BPB.
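The headline numbers can be re-derived from the MP-TTT column of the table. A minimal check, assuming the reported std is a sample standard deviation (the PR does not say which estimator was used); variable names are illustrative:

```python
import statistics

# MP-TTT BPB per seed, copied from the 3-seed results table.
mp_ttt_bpb = {42: 1.07700, 314: 1.07676, 999: 1.07763}
merged_sota_bpb = 1.0810  # PR #1493, as quoted above

values = list(mp_ttt_bpb.values())
mean_bpb = statistics.mean(values)
sample_std = statistics.stdev(values)   # n-1 denominator (assumption)
delta = mean_bpb - merged_sota_bpb

print(f"mean  = {mean_bpb:.5f}")
print(f"std   = {sample_std:.5f}")
print(f"delta = {delta:+.4f}")
```

The mean rounds to 1.0771 and the delta to -0.0039, matching the summary. The sample std comes out near 0.00045, close to the 0.0005 quoted.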

Key Innovations

Multi-Phase Global TTT (Novel) — Instead of per-chunk score-then-train, each phase scores ALL windows globally, then trains on ALL chunks; this repeats for 3 phases. Optimizer: SGD (lr=0.015, momentum=0.9) with cosine LR decay across chunks. TTT improves BPB by 0.0040 (mean Sliding BPB 1.08113 to mean MP-TTT BPB 1.07713), versus 0.0017 for standard per-chunk TTT. Total eval time: ~440s.
Polar Express Newton-Schulz Coefficients — 5 per-iteration minimax-optimal NS tuples replacing fixed (3.4445, -4.775, 2.0315). Higher-quality polar factor with same compute budget. (PR Record: SP4096 + Polar Express + MuonEq-R + Depth Recurrence — 1.0923 BPB (3-seed) #1344)
MIN_LR=0.10 Warmdown Floor — Floors LR at 10% of peak during warmdown, so gradient updates stay meaningful through the end of training. Adds roughly 70 extra training steps.
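A minimal sketch of what "per-iteration" Newton-Schulz coefficients mean, in pure Python for small matrices. It assumes the quintic update form commonly used with Muon-style optimizers, X <- aX + b(XX^T)X + c(XX^T)^2 X; the helper names are hypothetical and this is not the submission's code. The five Polar Express tuples are not listed in this PR, so the demo passes the fixed tuple five times. Using distinct tuples would only change the `coeffs` argument.

```python
def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def newton_schulz(g, coeffs, eps=1e-7):
    """Approximate the polar factor of g, one quintic step per (a, b, c) tuple."""
    # Normalize by the Frobenius norm so all singular values start at or below 1.
    norm = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / (norm + eps) for v in row] for row in g]
    for a, b, c in coeffs:
        gram = matmul(x, transpose(x))      # X X^T
        gram2 = matmul(gram, gram)          # (X X^T)^2
        poly = [[b * p + c * q for p, q in zip(r1, r2)]
                for r1, r2 in zip(gram, gram2)]
        px = matmul(poly, x)                # (b A + c A^2) X
        x = [[a * v + w for v, w in zip(r1, r2)] for r1, r2 in zip(x, px)]
    return x

# Fixed tuple quoted in the PR; a per-iteration schedule would be a list
# of five different tuples (values not given here, so not reproduced).
FIXED = (3.4445, -4.775, 2.0315)
q = newton_schulz([[3.0, 0.0], [0.0, 1.0]], [FIXED] * 5)
print(q)
```

On a diagonal input each singular value is simply pushed through the scalar polynomial a*s + b*s^3 + c*s^5 once per tuple, which is why the choice of coefficients at each iteration determines how close the result gets to an orthogonal matrix.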

Compliance (Issue #1017, Track B)

✅ Train < 600s (596s actual)
✅ Eval < 600s (440s actual)
✅ Artifact < 16MB (15.99MB)
✅ Score before update (each phase scores ALL before ANY training)
✅ No SLOT, no pre-quant TTT, no ETLB, no n-gram cache
✅ Standard softmax, strictly causal, single pass per phase
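The score-before-update ordering can be illustrated with a toy sketch. It is not the submission's code: the "model" is a hypothetical bias-only softmax over a 3-token vocabulary, and `score_bits`, `chunk_grad`, and `multi_phase_ttt` are invented names. Only the SGD settings (lr=0.015, momentum=0.9), the 3 phases, and the score-all-then-train-all order come from the description. Carrying momentum across phases and restarting the cosine schedule each phase are assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_bits(logits, chunks):
    """Mean bits per token over ALL chunks, with parameters frozen."""
    probs = softmax(logits)
    tokens = [t for chunk in chunks for t in chunk]
    return -sum(math.log2(probs[t]) for t in tokens) / len(tokens)

def chunk_grad(logits, chunk):
    """Gradient of the chunk's mean cross-entropy with respect to the logits."""
    grad = softmax(logits)
    for t in chunk:
        grad[t] -= 1.0 / len(chunk)
    return grad

def multi_phase_ttt(logits, chunks, phases=3, lr=0.015, momentum=0.9):
    logits = list(logits)
    velocity = [0.0] * len(logits)
    phase_scores = []
    for _ in range(phases):
        # 1) Score every chunk before any update in this phase.
        phase_scores.append(score_bits(logits, chunks))
        # 2) Train on every chunk; LR follows a cosine decay across chunks.
        for i, chunk in enumerate(chunks):
            lr_i = 0.5 * lr * (1 + math.cos(math.pi * i / len(chunks)))
            grad = chunk_grad(logits, chunk)
            for k in range(len(logits)):
                velocity[k] = momentum * velocity[k] + grad[k]
                logits[k] -= lr_i * velocity[k]
    return logits, phase_scores

chunks = [[0, 0, 1], [0, 2, 0], [0, 0, 0]]
final_logits, scores = multi_phase_ttt([0.0, 0.0, 0.0], chunks)
print(scores)
```

Within a phase, scoring never sees parameters that were updated in that same phase. The per-phase score drops from one phase to the next in this toy because each phase starts from the parameters left by the previous one.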

Credits

@bigbag — Base SOTA stack (PR Record: SP8192 + 3-Layer Recurrence + Parallel Residuals + QK-Gain 5.25 + Legal TTT — val_bpb 1.0810 (3-seed mean) #1493)
@leloykun — Polar Express NS coefficients (PR Record: SP4096 + Polar Express + MuonEq-R + Depth Recurrence — 1.0923 BPB (3-seed) #1344)
@clarkkev — SP8192 + GPTQ SDClip (PR Record: SP8192 + GPTQ Embeddings + Depth Recurrence + MuonEq-R + SDClip — val_bpb 1.08563 (5 seed mean) #1394)
@dexhunter — Depth recurrence (PR Record: MuonEq-R + 3-Layer Recurrence + WD=0.095 + MLR=0.022 + All-Int6 — val_bpb 1.0900 (3-seed mean) #1331, Record: SP8192 + Parallel Residuals + 3-Layer Recurrence + Token-Only N-gram Tilt — val_bpb 1.08091 (5-seed mean, causal-corrected) #1437)
@abaybektursun — Score-first TTT framework (PR Record: LeakyReLU² + Legal Score-First TTT + Parallel Muon — val_bpb 1.1194 (3-seed mean) #549)
@Robby955, @msisovic — Parallel residuals (PR Record: SP8192 + Parallel Residuals + Hessian-Aware SDClip — val_bpb 1.08354 (3-seed mean) #1412, Record: ParallelResiduals + MiniDepthRecurrence, 1.1063 BPB / 1.8679 nats, -0.0072 vs PR #1179, -0.0143 vs merged SOTA #1204)
PR Record: PR #1736 + Polar Express NS + MIN_LR + Sparse Attn Gate + Fused CE + PR #1767 TTT — val_bpb 1.06335 #1787 — MIN_LR warmdown concept
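The MIN_LR warmdown floor credited above can be sketched as an LR multiplier. This is an assumption-laden illustration: the PR gives only the floor (10% of peak), so the linear warmdown shape, the step-based schedule, the clamp via `max`, and the `warmdown_frac` parameter are all hypothetical.

```python
def lr_scale(step, total_steps, warmdown_frac=0.5, min_lr=0.10):
    """LR multiplier: 1.0 before warmdown, then linear decay clamped at min_lr."""
    warmdown_start = int(total_steps * (1 - warmdown_frac))
    if step < warmdown_start:
        return 1.0
    # Fraction of the warmdown still remaining, from 1.0 down to 0.0.
    remaining = (total_steps - step) / (total_steps - warmdown_start)
    # Without the floor this would reach 0; MIN_LR keeps it at 10% of peak.
    return max(min_lr, remaining)

for step in (0, 50, 75, 95, 100):
    print(step, lr_scale(step, total_steps=100))
```

With `min_lr=0.0` the final steps would run at a vanishing learning rate. The floor keeps them at a tenth of peak instead.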


@valerio-oai

…ai#1787 Polar Express NS new base; PR openai#1795 PPM 1.01252; Issue openai#1604 deadline passed; Session 20
- Merged SOTA 1.0810 confirmed Day 15 (README not updated despite Scylla record commit)
- Scylla 0.9485 committed to track_10min_16mb/ on Apr 23 (PR openai#1184) but byte accounting disputed by PR openai#1271 (corrected ~1.1289 bpb); treat merged SOTA as 1.0810
- PR openai#771 CLOSED/REJECTED confirmed; PR openai#727 CLOSED (illegal); PR openai#758 open but dead; PR openai#731 still awaiting seeds 1337+2024
- Issue openai#1604 (CaseOps ruling): NO @valerio-oai response in 11 days; self-deadline Apr 24 passed; proceed with clean legal stack immediately
- NEW: PR openai#1787 (nprime06, 1.06335) — new community-consensus clean base with Polar Express Newton-Schulz (arXiv:2505.16932, ICLR 2026) + MIN_LR=0.10 warmdown floor
- NEW: PR openai#1795 (OE-GOD, 1.01252) — byte-level PPM order-4 adaptive mixture; gate legality concern fixed; await organizer ruling before implementing
- NEW: PR openai#1797 (dexhunter, 1.06157) — PR openai#1787 + SmearGate + LQER Asym; new dexhunter best
- NEW: PR openai#1802 (aamodbhatt, 1.0771) — Polar Express NS + Multi-Phase Global TTT
- TECHNIQUE: Polar Express NS (arXiv:2505.16932) and Gram NS (Dao-AILab) added to table
- TECHNIQUE: MIN_LR=0.10 warmdown floor added to best-stack approach
- Updated competition strategy: stop waiting for CaseOps, implement clean stack with Polar Express NS + MIN_LR immediately (6 days to deadline)
https://claude.ai/code/session_01JZ3FiS937NwLHt3Fv9WHPD

Record: SP8192 + Polar Express NS + Multi-Phase Global TTT — val_bpb 1.0771 (3-seed mean) (commit be58d65)

aamodbhatt mentioned this pull request Apr 24, 2026

Record: SP8192 + Pre-Quant TTT (QK 5.25, 8ep, freeze-1) — val_bpb 1.0787 (3-seed mean) #1482

Closed

8 tasks


aamodbhatt wants to merge 1 commit into openai:main from aamodbhatt:submission/polar-express-mp-ttt
