
Record: GDN-Hybrid + Sliding Window Attention + compressed-code warmdown1000 - val_bpb 1.01671 (3-seed mean)#1576

Closed
joshkmartinez wants to merge 7 commits into openai:main from joshkmartinez:gdn-hybrid-warmdown

Conversation

@joshkmartinez

Summary

val_bpb = 1.01671233 (3-seed mean, std 0.00134386)
Artifact size: 15.71–15.90 MB across seeds

Improves the GDN-Hybrid fixed-predictor line with a warmdown1000 schedule and compressed-code packaging, without eval-time adaptation.

| Seed | Steps | EMA BPB | val_bpb | XSA BPB | Artifact bytes |
|------|-------|---------|---------|---------|----------------|
| 42   | 2227  | 1.007164 | 1.016200 | 1.021202 | 15,733,879 |
| 1337 | 2242  | 1.007164 | 1.015700 | 1.020105 | 15,903,365 |
| 2024 | 2227  | 1.009032 | 1.018237 | 1.024111 | 15,713,422 |
| Mean |       | 1.007787 | 1.01671233 | 1.021806 | 15,783,555.33 |
| Std  |       |          | 0.00134386 |          |            |

Architecture / Technique Stack

  1. SP1024 tokenizer
  2. GDN-Hybrid backbone: [GDN×5] → SWA → [GDN×5] → SWA_shared
  3. Fixed-predictor evaluation path (no TTT / no SLOT / no eval-time adaptation)
  4. MuonEq-R + AdamW training mix
  5. EMA = 0.997
  6. warmdown = 1000
  7. GPTQ int6 + zstd-22 packaging
  8. Compressed-code packaging for train_gpt.py / architectures.py / configs.py to recover artifact headroom
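The compressed-code packaging in item 8 can be sketched as an LZMA self-extractor: the plain source is compressed, base85-encoded, and wrapped in a tiny stub that decompresses and `exec`s it at load time. This is a minimal hypothetical sketch; the actual packaging script for `train_gpt.py` / `architectures.py` / `configs.py` is not shown in this PR and may differ.

```python
import base64
import lzma


def pack(source: bytes) -> str:
    """Wrap source code in a tiny LZMA self-extracting stub.

    Hypothetical sketch of the packaging idea; artifact bytes are
    recovered whenever the compressed blob plus stub is smaller than
    the plain source.
    """
    blob = base64.b85encode(lzma.compress(source, preset=9)).decode()
    return (
        "import base64, lzma\n"
        f"exec(lzma.decompress(base64.b85decode({blob!r})))\n"
    )


# Executing the stub runs the decompressed original code.
stub = pack(b"X = 41 + 1\n")
ns = {}
exec(stub, ns)
print(ns["X"])  # 42
```

The trade-off, noted in the review below, is that the wrapped source is opaque to standard diff review until someone decompresses it.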

Compliance

  • Fixed-predictor / Track A style submission
  • No TTT
  • No SLOT
  • No RLS
  • No eval-time adaptation
  • All three artifacts under 16,000,000 bytes
  • Training run stays within the 10-minute 8xH100 submission budget

Notes

XSA telemetry is reported for completeness, but the submitted score is the fixed-model quantized_bpb result above.

Credits

@bigbag

bigbag commented Apr 13, 2026

BPB metric bug: space bytes double-counted (inherited from closed parent PR #1545)

The decompressed train_gpt.py in this PR contains the same build_sentencepiece_luts bug that @SPThole identified in PR #1545, and which @Abhishek8108 acknowledged when closing that PR ("The corrected BPB is ~1.18, not 1.028").

Bugged code in this PR (decompressed from the LZMA self-extractor):

# build_sentencepiece_luts, around line 217
if piece.startswith("▁"):
    has_space[i] = True
    base_bytes[i] = len(piece[1:].encode("utf-8")) + 1   # +1 adds the space byte

Then the eval loop adds the same space byte again:

tb += (has_leading_space_lut[tgt] & ~is_boundary_token_lut[prev]).to(torch.float64)

Reference implementation (train_gpt.py at repo root, lines 186–189):

piece = sp.id_to_piece(token_id)
if piece.startswith("▁"):
    has_leading_space_np[token_id] = True
    piece = piece[1:]                       # strip ▁
base_bytes_np[token_id] = len(piece.encode("utf-8"))   # NO +1 here

The reference counts the space byte exactly once (in the eval loop, conditioned on ~is_boundary_token_lut[prev]). The bugged version counts it in both places for every ▁-prefixed token, inflating the byte denominator and deflating the reported BPB.
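The effect is easy to see on a toy token stream. The pieces below and the "no preceding boundary token" assumption are hypothetical; the real LUTs are built from the SentencePiece model in `train_gpt.py`.

```python
pieces = ["▁the", "▁cat", "sat"]


def base_bytes_bugged(piece: str) -> int:
    # Bugged LUT: already folds the space byte into base_bytes
    # for every ▁-prefixed piece.
    if piece.startswith("▁"):
        return len(piece[1:].encode("utf-8")) + 1
    return len(piece.encode("utf-8"))


def base_bytes_reference(piece: str) -> int:
    # Reference LUT: strips ▁ and counts only the piece's own bytes;
    # the space byte is added once, later, in the eval loop.
    if piece.startswith("▁"):
        piece = piece[1:]
    return len(piece.encode("utf-8"))


def denominator(lut) -> int:
    # Eval loop: adds one space byte per leading-space token
    # (assume no boundary tokens, so the condition always fires).
    total = 0
    for p in pieces:
        total += lut(p)
        if p.startswith("▁"):
            total += 1
    return total


print(denominator(base_bytes_bugged))     # 13: both space bytes counted twice
print(denominator(base_bytes_reference))  # 11: each space byte counted once
```

Since BPB divides total bits by this byte count, the inflated denominator directly deflates the reported score.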

Running the parent PR's corrected LUT on the same checkpoint lands in the ~1.16–1.18 range (per @Abhishek8108's own correction on #1545), not 1.01671.

The bug was missed here because the training code is wrapped in an LZMA self-extractor, which hides it from standard review. I suggest the maintainers decompress and re-score before this shifts the leaderboard.
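One way to recover the plain source for review without running it: shadow `exec` in the stub's globals so the decompressed payload is captured instead of executed. This assumes the stub's final step is a top-level `exec(...)` call (an assumption about the packaging format; adjust if the stub differs).

```python
def recover_source(stub_code: str) -> str:
    """Run a self-extracting stub with exec() shadowed, capturing the
    payload it would have executed. Review-helper sketch only."""
    captured = []
    # Globals shadow the exec builtin inside the exec'd stub code.
    ns = {"exec": lambda code, *args: captured.append(code)}
    exec(stub_code, ns)
    payload = captured[-1]
    return payload.decode() if isinstance(payload, bytes) else payload
```

The recovered text can then be diffed against the repo-root `train_gpt.py` to check `build_sentencepiece_luts` directly.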

sunnypatneedi pushed a commit to sunnypatneedi/parameter-golf that referenced this pull request Apr 13, 2026
…ai#1586 per-layer GPTQ highest-EV

- PR openai#758 n-gram effectively dead: MatoTeziTanka (Apr 12) flagged XOR hash
  includes target token, same illegality as openai#727/openai#741
- GDN-Hybrid BPB bug confirmed: PR openai#1576 space-token double-count inflates
  denominator ~14%; actual score ~1.16-1.18, not 1.01671
- PR openai#1586 (dexhunter, 1.07493): Per-Layer Adaptive GPTQ MLP=12σ/Attn=13σ +
  int7 Emb (saves 530KB) + MLR=0.026; -0.0127 nats vs SOTA; implement now
- PR openai#1584: systems-only (fused Muon, batched EMA, loader prealloc) ~+20 steps
- Casefold Tokenizer (openai#1578/openai#1585): legality debated; await organizer ruling
- New paper: arXiv:2604.06169 In-Place TTT (Apr 7) NTP-aligned score-first TTT
- Merged SOTA 1.0810 unchanged (4-day stable streak); target ≤1.0760; 17 days

@joshkmartinez
Author

Good catch, thanks!
