Record: GatedDeltaNet FLA + Brotli (No TTT) — val_bpb 1.01902 (3-seed mean)#1712
aamodbhatt wants to merge 1 commit into openai:main
… mean) GatedDeltaNet linear attention (FLA) K_KVShare_Wider + brotli-11 compression. No TTT: pure fixed predictor (Track A). 3-seed mean: 1.01902 BPB (std 0.0017). All artifacts under 16 MB. Seeds: 1337 (1.01720), 42 (1.02054), 2025 (1.01933). Based on PR openai#1687 by @resouer.
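As a quick sanity check on the summary statistics, the three per-seed values quoted above can be recomputed directly. This is a minimal sketch; the variable names are illustrative, and it assumes the reported std is the sample (n-1) standard deviation, which is the variant that reproduces 0.0017 from these three values.

```python
import statistics

# Per-seed val_bpb values as quoted in the PR description.
seed_bpb = {1337: 1.01720, 42: 1.02054, 2025: 1.01933}
values = list(seed_bpb.values())

mean_bpb = statistics.mean(values)
# Assumption: the reported std uses the sample (n-1) formula.
std_bpb = statistics.stdev(values)

print(f"mean={mean_bpb:.5f} std={std_bpb:.4f}")  # prints: mean=1.01902 std=0.0017
```

Note that this only checks the arithmetic of the aggregation; it says nothing about whether the per-seed values themselves were computed with correct byte accounting.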
Thank you for the submission. I believe there's a byte-accounting bug that invalidates the reported score; flagging in case you weren't aware.

The issue

In LUT construction at

    if piece.startswith("\u2581"):
        has_space[i] = True
        base_bytes[i] = len(piece[1:].encode("utf-8")) + 1  # pre-credits +1

Then at eval accumulation at

    tb = base_bytes_lut[tgt].to(torch.float64)
    tb += (has_leading_space_lut[tgt] & ~is_boundary_token_lut[prev]).to(torch.float64)  # +1 again

For any …

Canonical reference

Merged PR #1019:

    if piece.startswith("\u2581"):
        has_leading_space_np[token_id] = True
        piece = piece[1:]  # strip ▁ first
    base_bytes_np[token_id] = len(piece.encode("utf-8"))  # no +1 in LUT
    # +1 applied once in eval via has_leading_space & ~is_boundary_token

Numerical impact (sp8192 val stream, 40,540,160 tokens)

Applying the canonical LUT to the …

Suggested fix

    - base_bytes[i] = len(piece[1:].encode("utf-8")) + 1
    + base_bytes[i] = len(piece[1:].encode("utf-8"))

After fixing, re-eval with the existing accumulator (which already adds …

Family note: the same LUT pattern appears in several upstream GDN-family PRs (#1576, #1632, #1687, #1698, #1711) through inherited …

Happy to help verify corrected numbers if useful.
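The double count described in this review can be reproduced on a toy vocabulary. The sketch below is not the submission's code: the four-piece vocabulary, the token stream, and the helper names `build_luts` and `count_bytes` are all hypothetical, and plain Python lists stand in for the torch/NumPy lookup tables. It only illustrates why crediting the space in the LUT and again in the accumulator inflates the byte total.

```python
SPACE = "\u2581"  # SentencePiece word-boundary marker

def build_luts(pieces, credit_space_in_lut):
    """Return (base_bytes, has_space) lists for a list of piece strings."""
    base_bytes, has_space = [], []
    for piece in pieces:
        leading = piece.startswith(SPACE)
        if leading:
            piece = piece[1:]  # strip the marker before measuring
        n = len(piece.encode("utf-8"))
        if leading and credit_space_in_lut:
            n += 1  # the extra credit flagged in the review
        base_bytes.append(n)
        has_space.append(leading)
    return base_bytes, has_space

def count_bytes(prev_ids, tgt_ids, base_bytes, has_space, is_boundary):
    """Eval-style accumulator: +1 for a leading space unless prev is a boundary."""
    total = 0
    for prev, tgt in zip(prev_ids, tgt_ids):
        total += base_bytes[tgt]
        if has_space[tgt] and not is_boundary[prev]:
            total += 1
    return total

# Hypothetical vocabulary: a boundary token plus three ordinary pieces.
pieces = ["<s>", SPACE + "the", SPACE + "cat", "s"]
is_boundary = [True, False, False, False]
stream = [0, 1, 2, 3]  # decodes to "the cats"
prev_ids, tgt_ids = stream[:-1], stream[1:]

canonical_total = count_bytes(prev_ids, tgt_ids, *build_luts(pieces, False), is_boundary)
buggy_total = count_bytes(prev_ids, tgt_ids, *build_luts(pieces, True), is_boundary)

print(canonical_total, buggy_total)  # prints: 8 10
```

In this toy case the canonical total matches the 8 UTF-8 bytes of "the cats", while the variant with `+ 1` in the LUT reports 10, one extra byte per leading-space target. Because bits-per-byte divides the total bits by the total byte count, an inflated byte count yields a lower reported score for the same model loss.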
Closing: same byte-counting bug as PR #1711. The build_sentencepiece_luts LUT double-counts the leading-space byte. Will fix and re-evaluate.
Record Summary
Final submitted score: val_bpb 1.01902 (std 0.0017)
Hardware: 8×H100 80GB SXM | Artifact: ~15.6 MB | Train: 600s wallclock | Pure fixed predictor (Track A)
What Changed
3-Seed Results
Submission Checklist
records/track_10min_16mb/
Metric Verification
final_int6_roundtrip_exact in each seed log
Credits