Add SP8192 + ParResid + DR + LoRA TTT + Mixed int4/int6/int8 + AWQ su… #1919

Open
dev-pratap-singh wants to merge 1 commit into openai:main from
dev-pratap-singh:submission/2026-04-29-sp8192-parresid-dr-loratt-mixedquant-awq

Conversation

@dev-pratap-singh

…bmission

Adds an unverified submission targeting val_bpb=1.0587 (3-seed mean) on the 10-min 8xH100 track, mirrored into the non-record track. Both folders contain the README, submission.json, and single-file train_gpt.py entry point.

Status: NOT YET VERIFIED ON H100 — per-seed train logs and runtime compliance flags are pending the 8xH100 reproduction run.

@dexhunter
Contributor

Hi @dev-pratap-singh — thanks for sharing this submission.

Quick technical note for the community thread, since I noticed there are no other comments yet:

The AWQ activation-aware scale calibration in train_gpt.py (around lines 1796-1819) appears to violate Issue #1017 Condition 3 (score-before-update). Per the README's explicit description, the AWQ activation statistics are collected from the first 4×2048 val tokens before int4 quantization, whereas Condition 3 requires:

"fix that position's score contribution before any x_t-dependent update or accounting rule"

The AWQ rescaling factors s_in are computed from activations on the first 8192 val tokens, then folded into the int4 weights. The artifact's scoring weights for those same val tokens therefore depend on the val tokens' own activations — i.e., position t's score uses a transform learned from x_t's activation.
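
Concretely, here is a minimal, self-contained sketch of that calibration path as I read it. Everything named below (`layer`, `val_tokens`, `fake_quant_int4`, the shapes) is an illustrative stand-in rather than the actual code in train_gpt.py; only the data flow (val activations → s_in → s_in folded into the int4 weights) follows the README's description.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(64, 64, bias=False)      # stand-in for one quantized projection
val_tokens = torch.randn(4 * 2048, 64)     # stand-in for the first 8192 val-token activations

# 1) Collect per-input-channel activation magnitudes on the VAL slice.
with torch.no_grad():
    a_max = val_tokens.abs().amax(dim=0)   # shape (64,), computed FROM val data

# 2) AWQ-style input scale: s_in = a_max ** alpha (alpha = 0.5 is the usual choice).
s_in = a_max.clamp(min=1e-5) ** 0.5

# 3) Fold the scales into the weights before quantization:
#    y = W x = (W * diag(s_in)) (x / s_in), so quantize W' = W * diag(s_in).
W_scaled = layer.weight.data * s_in        # broadcasts over input channels


def fake_quant_int4(w: torch.Tensor) -> torch.Tensor:
    """Simplified symmetric per-output-row int4 fake-quantization."""
    scale = w.abs().amax(dim=1, keepdim=True) / 7.0
    return (w / scale).round().clamp(-8, 7) * scale


W_int4 = fake_quant_int4(W_scaled)
# W_int4's codes now depend on the val tokens' own activations, so scoring
# those same val positions with W_int4 is the Condition 3 violation.
```

The folding itself is standard AWQ; the problem is purely where `val_tokens` comes from.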

This is the same class of issue as the pre-quant calibration on val data in PR #1350 / PR #1351 (flagged in earlier reviews), which Issue #677 (the illegal-submissions megathread) covers.

A clean fix: calibrate the AWQ scales on a held-out slice of the train shards (e.g., the last N tokens of train_files[0]) and freeze them before eval, as sketched below. This preserves the AWQ benefit while keeping Condition 3 satisfied.
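
A minimal sketch of that fix, reusing the stand-ins from the block above (the shard loader is assumed, so `train_tokens` just represents the held-out train slice):

```python
# Same folding math as above, but calibrated on a held-out TRAIN slice and
# frozen before eval. `train_tokens` is a stand-in for (embeddings of) the
# last N_CALIB tokens of train_files[0]; the loader itself is assumed.
N_CALIB = 4 * 2048
train_tokens = torch.randn(N_CALIB, 64)

with torch.no_grad():
    a_max = train_tokens.abs().amax(dim=0)      # calibration never sees val tokens
s_in = (a_max.clamp(min=1e-5) ** 0.5).detach()  # freeze scales before any eval pass

W_int4 = fake_quant_int4(layer.weight.data * s_in)
# From here on s_in and W_int4 are constants: every val position t is scored
# with a transform fixed before x_t contributed anything, so Condition 3 holds.
```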

The rest of the method looks quite clean: parallel residuals, depth recurrence, and LoRA score-first TTT all appear well-formed. Happy to discuss if I've misread the calibration path.
