Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,8 @@ Happy training!
| MuonEq-R + Depth Recurrence + WD=0.090 + All-Int6 GPTQ | 1.0912 | dexhunter | On PR #1285: MuonEq-R + layers 4-5 recurrence + higher weight decay + all-int6 GPTQ | 2026-04-03 | [info](records/track_10min_16mb/2026-04-03_MuonEqR_DepthRecurrence_WD090_AllInt6/README.md) |
| 4096-Vocab + Larger Model + High WD + Simplifications | 1.0979 | Kevin Clark | On PR #1218: SP4096 + 4x MLP + high weight decay, with TTT, hash embeddings, SmearGate, and value residuals removed | 2026-04-01 | [info](records/track_10min_16mb/2026-04-01_Vocab4096_MLPMult4_WD085/README.md) |
| Parallel Residuals + Mini Depth Recurrence | 1.1063 | Marko Sisovic | On PR #1204: mini recurrence on layers 4-5 + parallel attention/MLP residual lanes + AR self-generated GPTQ calibration | 2026-03-31 | [info](records/track_10min_16mb/2026-03-31_ParallelResiduals_MiniDepthRecurrence/README.md) |
| Rascal | 1.1099 | newjordan | On PR #1120: XSA-all + Parallel Muon + coprime loader + Bigram2048/RoPE16 + SWA/late QAT without GPTQ | 2026-03-30 | [info](https://github.com/openai/parameter-golf/pull/1120) |
| Coprime-Stride Loader + Full GPTQ + XSA-all | 1.1122 | dexhunter | On PR #1060: coprime multi-shard loader + Full Hessian GPTQ + XSA on all layers + BigramHash(2816x112) | 2026-03-29 | [info](records/track_10min_16mb/2026-03-29_Loader_FullGPTQ_XSA11_BigramHash2816/README.md) |
| 11L AR Self-Gen GPTQ + XSA | 1.1147 | abaybektursun | On PR #1019: Self-Generated GPTQ Calibration Data + all-layer XSA on the PR #549 stack | 2026-03-25 | [info](records/track_10min_16mb/2026-03-25_ValCalib_GPTQ_XSA_BigramHash3072/README.md) |
| LeakyReLU² + Legal Score-First TTT + Parallel Muon | 1.1194 | abaybektursun | On PR #549: LeakyReLU(0.5)^2 + TTT + Parallel Muon on the PR #414 stack | 2026-03-23 | [info](records/track_10min_16mb/2026-03-23_LeakyReLU_LegalTTT_ParallelMuon/README.md) |
| 11L EMA + GPTQ-lite + warmdown3500 | 1.1228 | signalrush | On PR #374: GPTQ-lite clip search + EMA, plus warmdown3500 and [email protected] | 2026-03-22 | [info](records/track_10min_16mb/2026-03-22_11L_EMA_GPTQ-lite_warmdown3500_QAT015_1.1233/README.md) |
Expand Down

This file was deleted.

This file was deleted.

Loading