
Non-record: SDClip-matched FakeQuantize — reduces quant degradation from +0.17 to +0.044#1773

Open
Amanbig wants to merge 1 commit into openai:main from Amanbig:submission/v11-sdclip-fakequant

Conversation


@Amanbig Amanbig commented Apr 22, 2026

Non-record submission

Documenting a QAT/quantizer mismatch fix.

Key finding

When the QAT FakeQuantize uses a different clipping formula than the save-time quantizer, the model learns to rely on weight patterns that the export-time grid destroys, so the post-quant degradation is far larger than the training loss suggests:

| Version | Pre-quant BPB | Post-quant BPB | Degradation |
|---|---|---|---|
| v8 (naive absmax FakeQuantize) | 1.1387 | 1.3103 | +0.17 💀 |
| v11 (SDClip-matched) | 1.1872 | 1.2313 | +0.044 |
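The fix amounts to making the QAT fake-quantizer apply the exact clipping rule the save-time quantizer applies. A minimal scalar sketch of the failure mode (the SDClip formula itself is not given in this PR, so the `clip` argument stands in for whatever the exporter computes; `fake_quantize` is an illustrative name, not the PR's code):

```python
def fake_quantize(xs, num_bits=8, clip=None):
    # Snap each value to the integer grid the exporter will use.
    # `clip` must mirror the save-time quantizer's clipping rule;
    # the default here is naive absmax, i.e. the v8 behaviour.
    if clip is None:
        clip = max(abs(v) for v in xs)
    qmax = 2 ** (num_bits - 1) - 1
    scale = clip / qmax
    return [min(max(round(v / scale), -qmax - 1), qmax) * scale for v in xs]

weights = [0.01, -0.3, 0.5, 4.0]            # one outlier stretches absmax
train_view = fake_quantize(weights, num_bits=5)             # what QAT sees
export_view = fake_quantize(weights, num_bits=5, clip=0.6)  # tighter save-time clip
# With mismatched clips, the small weights land on different grid points
# during training than after export, so patterns learned under the
# training grid vanish in the saved model.
```

In a real QAT loop this runs in the forward pass with a straight-through estimator so gradients flow past the rounding; the point here is only that `clip` must come from the same formula at train time and save time.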

Stack

  • SP8192, 11L × 512, 40.5M params
  • GQA 8H/4KV + Partial RoPE 16/64 + QK-Gain 5.25
  • MuonEq-R WD=0.095, matrix_lr=0.022
  • Parallel Residuals (layers 7+)
  • 3-Layer Depth Recurrence (L3,4,5) @ 35%
  • BigramHash + SmearGate + Value Embeddings
  • EMA 0.9965 from 50%, Warmdown 72%
  • SDClip-matched FakeQuantize from 80% QAT
  • Legal Score-First TTT (SGD lr=0.005, mom=0.9, 3 cosine epochs)
  • Mixed int5 MLP / int6 Attn / int8 Embed (k=12.85/20.0)

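The mixed-precision entry above assigns a bit width per module family. A hedged sketch of one way such a mapping might be expressed (the module naming and substring matching are hypothetical; the PR does not show its code):

```python
# Hypothetical bit-width map for the mixed int5 MLP / int6 Attn /
# int8 Embed scheme listed above; parameter names are illustrative.
BITS = {"mlp": 5, "attn": 6, "embed": 8}

def bits_for(param_name: str, default: int = 8) -> int:
    # Fall back to 8-bit for anything unclassified (norms, gains, etc.).
    for key, width in BITS.items():
        if key in param_name:
            return width
    return default

print(bits_for("layers.3.mlp.w_in"))     # 5
print(bits_for("layers.0.attn.q_proj"))  # 6
```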
Compute

1×H100 on Kaggle, 4000 steps, single seed (1337). Not a record: the gap to SOTA (1.0810) is compute, not architecture. Submitted to document the QAT fix.

Credits

Builds on PR #1394 (@clarkkev), PR #1412 (@Robby955), PR #1493 (@bigbag).

Note

Final BPB numbers reflect the training trajectory through step 3500, plus an estimated post-quant/TTT adjustment based on v10's measured +0.044 degradation. Happy to re-run with scaled compute for verification.

