Record: PR #1908 base + GPTQ module-damp + Asym Logit Rescale — val_bpb 1.06048 (3-seed mean) #2051

Closed
dexhunter wants to merge 1 commit into openai:main from dexhunter:2026-04-30-final-record

@dexhunter
Contributor

Record candidate: val_bpb 1.06048 (3-seed mean, std 0.00074)

Extends the PR #1908 native base (SparseAttnGate + AWQ-lite int8 + LQER asym rank-4 group-64 + BOS-masked SmearGate) with GPTQ per-module damping: GPTQ_DAMP_EMBED=0.005, GPTQ_DAMP_MLP=0.02, GPTQ_DAMP_ATTN=0.01 replacing uniform damp_frac=0.01. Composes with PR #1945 Asymmetric Logit Rescale.
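
As a rough illustration of what per-module damping means, the sketch below picks a damping fraction from the three environment variables named above and adds it to the diagonal of a layer's GPTQ Hessian. The name-based routing (`damp_frac_for`) and the helper `damp_hessian` are hypothetical and not taken from this PR's code. Scaling the fraction by the mean of the Hessian diagonal follows the usual GPTQ convention and is assumed here, not confirmed by the submission.

```python
import os

DEFAULT_DAMP = 0.01  # the uniform damp_frac that the per-module values replace


def damp_frac_for(module_name, env=os.environ):
    """Pick a damping fraction by module type (hypothetical name matching)."""
    name = module_name.lower()
    if "embed" in name:
        return float(env.get("GPTQ_DAMP_EMBED", DEFAULT_DAMP))
    if "mlp" in name:
        return float(env.get("GPTQ_DAMP_MLP", DEFAULT_DAMP))
    if "attn" in name:
        return float(env.get("GPTQ_DAMP_ATTN", DEFAULT_DAMP))
    return DEFAULT_DAMP


def damp_hessian(H, damp_frac):
    """Return a copy of H with damp_frac * mean(diag(H)) added to the diagonal.

    Assumption: damping is relative to the mean diagonal entry, as in the
    reference GPTQ implementation. H is a square list-of-lists.
    """
    n = len(H)
    mean_diag = sum(H[i][i] for i in range(n)) / n
    lam = damp_frac * mean_diag
    return [
        [H[i][j] + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


# Values from this PR description, passed explicitly instead of via os.environ.
pr_env = {
    "GPTQ_DAMP_EMBED": "0.005",
    "GPTQ_DAMP_MLP": "0.02",
    "GPTQ_DAMP_ATTN": "0.01",
}
toy_H = [[2.0, 1.0], [1.0, 4.0]]  # toy Hessian, not real calibration data
damped = damp_hessian(toy_H, damp_frac_for("blocks.0.mlp.fc", pr_env))
```

With these settings, MLP Hessians get twice the regularization of attention Hessians and embeddings get half, which is the only behavioral difference from the uniform setting in this sketch.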

Results (8×H100 80GB SXM, 600s train / 600s eval, Phased TTT)

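| Seed | val_bpb | eval_time_s | artifact_bytes |
|------|------------|--------|------------|
| 314 | 1.06006259 | 404.2 | 15,868,104 |
| 42 | 1.06003198 | 413.3 | 15,869,714 |
| 7 | 1.06133559 | 442.4 | 15,865,773 |
| Mean | 1.06047672 | 419.97 | 15,867,864 |
| Std | 0.00074396 | | |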
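
The Mean and Std rows can be rechecked from the three per-seed values. The snippet below is a small sanity check and not part of the submission. It assumes Std is the sample standard deviation (n - 1 in the denominator), which is what matches the reported figure.

```python
import statistics

# Per-seed val_bpb values copied from the results table.
val_bpb = {314: 1.06006259, 42: 1.06003198, 7: 1.06133559}

mean_bpb = statistics.fmean(val_bpb.values())
# Sample standard deviation (divides by n - 1); assumed to be the table's Std.
std_bpb = statistics.stdev(val_bpb.values())

print(f"mean={mean_bpb:.8f} std={std_bpb:.8f}")
```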

All gates pass: train ≤ 599.083s, eval ≤ 442.4s, artifact ≤ 15,869,714 B.
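
The gate check amounts to comparing the worst case across seeds against each limit. The sketch below uses the 600 s train and eval budgets from the results heading. The exact byte value of the 16M-byte cap is an assumption (16,000,000 here), and `gates_pass` is a hypothetical helper. Only the maximum train time is reported, so it is entered as a single number.

```python
TRAIN_LIMIT_S = 600.0
EVAL_LIMIT_S = 600.0
ARTIFACT_CAP_BYTES = 16_000_000  # assumption: "16M-byte cap" read as 16e6 bytes

# Per-seed (eval_time_s, artifact_bytes) copied from the results table.
runs = {
    314: (404.2, 15_868_104),
    42: (413.3, 15_869_714),
    7: (442.4, 15_865_773),
}
MAX_TRAIN_S = 599.083  # worst-case train time as reported in the PR


def gates_pass(runs, max_train_s):
    """Return True if the worst case across seeds is within every limit."""
    worst_eval = max(eval_s for eval_s, _ in runs.values())
    worst_bytes = max(size for _, size in runs.values())
    return (
        max_train_s <= TRAIN_LIMIT_S
        and worst_eval <= EVAL_LIMIT_S
        and worst_bytes <= ARTIFACT_CAP_BYTES
    )


all_ok = gates_pass(runs, MAX_TRAIN_S)
```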

Compliance

  • C1 strict-causal: flash-attention + cu_seqlens packed-doc, no future-token leak
  • C2 full normalized softmax over SP8192 alphabet at every scored position
  • C3 score-before-update: phased TTT scores each chunk before optimizer step; per-document LoRA reset
  • C4 single L→R pass: each val token contributes exactly one BPB term in quantized_ttt_phased
  • Section V: canonical BPB via sentencepiece piece table; full val shards; 16M-byte cap; no SLOT/PPM/n-gram cache/casefold/oracle/pre-quant TTT
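
To make C2 through C4 concrete, here is a toy sketch of score-before-update evaluation. It is not the submission's `quantized_ttt_phased` routine. An adaptive byte-level unigram model stands in for the network and its LoRA state, and the function name and chunking are hypothetical. Each chunk is scored with a fully normalized distribution while the state is frozen, the state is updated only afterwards, and every byte contributes exactly one term to the bits-per-byte total. State starts fresh on each call, loosely mirroring the per-document reset.

```python
import math
from collections import Counter


def score_then_update_bpb(chunks, alphabet_size=256):
    """Bits per byte for one document, scoring each chunk before adapting.

    Toy stand-in: a Laplace-smoothed unigram model over bytes replaces the
    real model, and a count update replaces the optimizer step.
    """
    counts = Counter()  # fresh state per document
    total = 0
    total_bits = 0.0
    n_bytes = 0
    for chunk in chunks:
        # 1) Score the whole chunk with the current, frozen state.
        #    The probabilities sum to 1 over the full alphabet.
        for b in chunk:
            p = (counts[b] + 1) / (total + alphabet_size)
            total_bits += -math.log2(p)
        n_bytes += len(chunk)
        # 2) Only now adapt on the chunk that was just scored.
        counts.update(chunk)
        total += len(chunk)
    return total_bits / n_bytes


# One chunk: nothing is learned before scoring, so every byte costs 8 bits.
single = score_then_update_bpb([b"aaaa"])
# Two chunks: the second chunk benefits from adapting on the first.
phased = score_then_update_bpb([b"aa", b"aa"])
```

Swapping steps 1 and 2 would let the model see a chunk before scoring it, which is exactly the leak that C3 rules out.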

Lineage

See records/track_10min_16mb/2026-04-30_HEV20_07_module_damp_3seed/ for full artifacts.

@dexhunter
Contributor Author

Withdrawing: this submission was filed at 1.06048 (3-seed mean), which is worse (higher val_bpb) than the current open clean leader PR #1953 (1.05855) and well behind the new open clean leader PR #2014 (1.05759). It doesn't clear the empirical merge floor. Apologies for the noise. This was filed in error during the deadline crunch, when our preferred candidate (single-seed best 1.05767) couldn't complete 3-seed verification in time.

@dexhunter closed this on May 1, 2026