Record: 11L XSA4 + EMA + LoRA TTT + Partial RoPE + dim480 — val_bpb 1.13112 (3-seed) #1127
dentity007 wants to merge 2 commits into openai:main
Conversation
Community Review — Record: 11L XSA4 + EMA + LoRA TTT + Partial RoPE + dim480 — val_bpb 1.13112 (3-seed)

BPB: 1.13112 | Compliance: FLAG — Pre-Quant TTT runs multi-epoch on `val_tokens`.

What I found in the code at the head SHA: at line 936 the pre-quant TTT function takes `val_tokens` as its input. Per Issue #402 and Issue #677 (@valerio-oai, 2026-03-27), TTT is valid only if each token is scored BEFORE the adapter trains on it; multi-epoch TTT that scores only on the final pass is explicitly called out as invalid. This implementation matches the pattern that closed PR #1376 (stukenov) and was subsequently confirmed in #1485/#1487/#1488/#1489/#1517/#1539 — see the Issue #677 meta-comment from 2026-04-11, which lists the 6+ PRs in the cluster.

Contrast with the legal Pre-Quant TTT pattern (e.g. the PR #1416 / PR #1423 lineage): those train the adapter on a held-out slice of training data, not `val_tokens`.

CPU smoke test (CT2038 proteus-engine, 2026-04-11): import OK in 0.03s, dim=512, layers=11, vocab=1024, code=64073 B, SMOKE_TEST_PASS.

Verdict: COMPLIANCE FLAG — same pattern as the closed Pre-Quant TTT cluster. Recommendation to @cocohearts @valerio-oai @0hq @yuzhougu-oai @notapplica: CLOSE under the same ruling as #1376 and the rest of the cluster. A resubmission with the TTT function taking a training-data slice instead of `val_tokens` would be re-reviewed.

Reviewed by @MatoTeziTanka — The Agora. Classification via deterministic AST-based analysis.
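For readers arriving from the issue threads, a minimal sketch of the two shapes the review contrasts may help. All names below (`train_step`, `loss_fn`, the 4096 chunk size) are illustrative assumptions, not the PR's actual line-936 code:

```python
# Illustrative contrast between the flagged and the legal Pre-Quant TTT shapes.
# Every name here is an assumption for illustration only.
import torch

def train_step(adapter_opt, loss_fn, chunk):
    """One LoRA-adapter gradient step on `chunk` (stub for illustration)."""
    adapter_opt.zero_grad()
    loss_fn(chunk).backward()
    adapter_opt.step()

def flagged_ttt(adapter_opt, loss_fn, val_tokens, epochs=3):
    # FLAGGED pattern: multi-epoch adaptation on val_tokens, scored only on the
    # final pass, so every scored token has already been trained on.
    for _ in range(epochs):
        for chunk in val_tokens.split(4096):
            train_step(adapter_opt, loss_fn, chunk)   # gradient steps on val data
    with torch.no_grad():
        return loss_fn(val_tokens)                    # scores tokens already seen

def legal_pre_quant_ttt(adapter_opt, loss_fn, train_slice, val_tokens):
    # LEGAL pattern (per the #1416/#1423 lineage): adapt on a held-out slice of
    # TRAINING data, then score val_tokens exactly once, never training on them.
    for chunk in train_slice.split(4096):
        train_step(adapter_opt, loss_fn, chunk)       # gradients touch train data only
    with torch.no_grad():
        return loss_fn(val_tokens)                    # val_tokens never trained on
```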
…ka review pattern

Proactive self-flag before the Agora compliance review reaches this PR. Same illegal pattern as PR openai#1193 and PR openai#406: `ttt_adapt()` runs on `val_tokens` for 1 epoch with no score-first discipline before the final eval.

Changes:
- train_gpt.py: TTT_ENABLED default changed from "1" to "0". Added a comment explaining the fix and cross-referencing the flagged sibling PRs.
- submission.json: val_bpb set to null, val_bpb_retracted preserved for the record. Status set to "retracted".
- README.md: update notice at the top explaining the retraction, original summary struck through.

Unlike PR openai#406, which had clean DIAGNOSTIC post_swa numbers in the train logs, this submission has no pre-TTT diagnostic numbers preserved, so no clean substitute BPB is available.
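A sketch of what the described default flip might look like in train_gpt.py. Only the `TTT_ENABLED` name and the change from "1" to "0" come from the commit message; the surrounding code is assumed:

```python
# Sketch of the described fix; the helper below is illustrative, not the
# submission's actual code.
import os

# Default flipped from "1" to "0": TTT is now opt-in, so a plain run never
# adapts on val_tokens (same pattern flagged in openai#1193 / openai#406).
TTT_ENABLED = os.environ.get("TTT_ENABLED", "0") == "1"

def maybe_ttt(model, val_tokens):
    if not TTT_ENABLED:
        return model  # compliant default: score with the frozen model
    # (retracted path: would adapt on val_tokens, pending the #402/#677 ruling)
    ...
```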
Proactive compliance documentation while awaiting a maintainer ruling on hash-based eval-time n-gram caches per Issue openai#402, Issue openai#677, and PR openai#886. No code changes; the README documents:
- The open dispute (valerio-oai leaning legal; abaybektursun disputing via hash-collision density in openai#886; Robert-Sneiderman defending the Dirichlet formula's validity in openai#900)
- What this submission does (backward-looking causal n-gram cache with Dirichlet-Multinomial smoothing)
- What it does NOT do (no training on val_tokens, no backward passes, model frozen during eval)
- An explicit statement that I asked on Issue openai#402 on April 2 and will retract if ruled invalid

Distinct from the TTT-on-val class of violations I retracted in PR openai#1193, PR openai#406, and PR openai#1127.
Same approach as the PR openai#948 compliance note. This submission extends openai#948 with order-20 backoff but uses the same eval-time hash n-gram cache architecture under the same community dispute (Issue openai#402, Issue openai#677, PR openai#886, PR openai#900). No code changes; the README documents:
- The open dispute and relevant threads
- What this submission does (causal backward-looking cache, Dirichlet smoothing, model frozen)
- What it does NOT do (no training on val_tokens, no backward passes)
- That it is distinct from the TTT-on-val class I retracted in openai#1193, openai#406, and openai#1127
- That I will retract if maintainers rule the class invalid
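Since the two notes above reference the same disputed mechanism, a minimal sketch may help readers outside the Issue #402/#677 threads. Everything here is an assumption-laden illustration: the names and constants are invented, real submissions reportedly use rolling hashes and order-20 backoff (omitted here), and this is not code from any of the PRs:

```python
# Hypothetical sketch of the disputed eval-time cache. ALPHA, ORDER, and LAMBDA
# are assumed placeholder values, not any submission's settings.
from collections import defaultdict

import torch

ALPHA = 0.1   # Dirichlet concentration (assumed)
ORDER = 3     # context length; disputed submissions reportedly back off to order 20
LAMBDA = 0.3  # interpolation weight for the cache (assumed)

def cache_mixed_logprobs(model_logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Mix frozen-model probabilities with a backward-looking causal n-gram cache.

    The cache at position t is built ONLY from tokens[:t], so every token is
    scored before it enters the cache: no training on val_tokens, no backward
    passes, model frozen throughout.
    """
    V = model_logits.size(-1)
    counts: dict[tuple, torch.Tensor] = defaultdict(lambda: torch.zeros(V))
    model_probs = model_logits.softmax(-1)
    mixed = torch.empty_like(model_probs)
    for t in range(len(tokens)):
        ctx = tuple(tokens[max(0, t - ORDER):t].tolist())
        c = counts[ctx]
        # Dirichlet-Multinomial posterior predictive: (count + alpha) / (total + alpha * V)
        cache_probs = (c + ALPHA) / (c.sum() + ALPHA * V)
        mixed[t] = (1 - LAMBDA) * model_probs[t] + LAMBDA * cache_probs
        counts[ctx][tokens[t]] += 1  # update the cache AFTER scoring position t
    return mixed.log()
```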
Self-flag (proactive compliance fix, April 13, 2026)

Following the community review pattern @MatoTeziTanka applied to my PR #1193 and PR #406, I audited this PR on my end and confirmed the same illegal TTT-on-val pattern. What I found:
Fix pushed (commit 23dab4f):
No clean substitute BPB is available. Unlike PR #406, where the train logs preserved clean DIAGNOSTIC post_swa numbers, this submission has no pre-TTT diagnostic numbers in the records folder. If a legal no-TTT rerun becomes available in the future, I will update this PR; otherwise I am treating this as withdrawn from the record track. Other PRs audited on my end:
Thanks to @MatoTeziTanka and The Agora for the systematic compliance review. Self-flagging before the queue reaches a PR is faster than waiting for the audit.
11L XSA4 + EMA + LoRA TTT + Partial RoPE + GPTQ-lite (dim480)
val_bpb: 1.13112 (3-seed mean, std 0.00051) | ~15.5 MB | 8×H100 SXM, Reykjavík, Iceland
The PR #462 architecture, compressed to fit the 16 MB budget with MODEL_DIM=480.
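As a rough plausibility check on the size claim (my own back-of-envelope, not from the submission; the 4-bit GPTQ-lite packing, tied embeddings, and the vocab size from the smoke-test output are all assumptions), the standard 12·L·d² transformer estimate lands near the reported ~15.5 MB:

```python
# Back-of-envelope size check with assumed shapes, not the submission's config.
L, D, V = 11, 480, 1024           # layers, model dim, vocab (vocab from smoke test)

block_params = 12 * L * D * D     # attention (~4*d^2) + MLP (~8*d^2) per layer
embed_params = V * D              # token embedding (assume tied output head)
total = block_params + embed_params

bytes_4bit = total * 0.5          # GPTQ-lite assumed to pack weights at 4 bits
print(f"{total/1e6:.1f}M params -> ~{bytes_4bit/1e6:.1f} MB at 4-bit")
# ~30.9M params -> ~15.5 MB, consistent with the figure reported above
```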
3-seed validation
Key components
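Of the components named in the title, partial RoPE is the simplest to illustrate: rotary embeddings are applied to only a fraction of each head's dimensions, and the rest pass through unrotated. A minimal sketch, assuming the common even/odd pairing convention; the 0.5 rotation fraction is a placeholder, not this submission's setting:

```python
# Minimal partial-RoPE sketch; ROT_FRAC and the pairing convention are assumptions.
import torch

ROT_FRAC = 0.5  # fraction of head_dim that gets rotated (placeholder value)

def partial_rope(x: torch.Tensor, pos: torch.Tensor) -> torch.Tensor:
    """x: (..., seq, head_dim); pos: (seq,) integer positions."""
    head_dim = x.size(-1)
    rot_dim = int(head_dim * ROT_FRAC)
    x_rot, x_pass = x[..., :rot_dim], x[..., rot_dim:]

    # standard RoPE frequencies over the rotated slice only
    inv_freq = 1.0 / (10000 ** (torch.arange(0, rot_dim, 2) / rot_dim))
    theta = pos[:, None].float() * inv_freq[None, :]   # (seq, rot_dim/2)
    cos, sin = theta.cos(), theta.sin()

    x1, x2 = x_rot[..., 0::2], x_rot[..., 1::2]
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    rotated = rotated.flatten(-2)                      # interleave pairs back
    return torch.cat((rotated, x_pass), dim=-1)        # unrotated tail unchanged
```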
Compliance
Note on reproduction
Current `runpod/parameter-golf:latest` (PyTorch 2.9.1+cu128) requires a manual FA3 install:
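Whichever install route you take, a small probe can confirm FA3 is actually importable before launching a run. This assumes the FA3 (hopper) build exposes `flash_attn_interface`, which may differ by version:

```python
# Probe for FlashAttention-3 availability; note a fallback if it is absent.
# Assumes the FA3 (hopper) build exposes `flash_attn_interface`; adjust the
# module name if your build differs.
try:
    from flash_attn_interface import flash_attn_func  # FA3 entry point
    HAS_FA3 = True
except ImportError:
    HAS_FA3 = False

print(f"FA3 available: {HAS_FA3}")
if not HAS_FA3:
    print("Falling back to torch.nn.functional.scaled_dot_product_attention")
```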