
Add BOS-masked SmearGate record #1869

Open
ZanePeycke wants to merge 1 commit into openai:main from ZanePeycke:codex/autozany-bos-smear-submission

ZanePeycke commented on Apr 27, 2026

Adds a clean 10min/16MB record candidate based on the PR #1787/#1797 CaseOps + SparseAttnGate + SmearGate + asymmetric LQER + phased score-before-update TTT lineage.

Key result:

  • 3-seed mean: 1.06654079 BPB
  • Sample std: 0.00122292 BPB
  • Seeds: 42, 314, 1234
  • Max artifact: 15,950,966 bytes
  • Train time: 599.579s-599.652s
  • TTT eval time: 272.1s-356.7s

Results:

| Seed | Steps | Train shards | Pre-quant BPB | Quantized BPB | Post-TTT BPB | Artifact bytes | Train time | TTT eval time |
|---|---|---|---|---|---|---|---|---|
| 42 | 5047 | 240 | 1.06944908 | 1.07891097 | 1.06642781 | 15,950,222 | 599.600s | 356.7s |
| 314 | 5036 | 240 | 1.06882706 | 1.07790190 | 1.06537828 | 15,950,966 | 599.579s | 322.1s |
| 1234 | 5026 | 240 | 1.07098567 | 1.08042471 | 1.06781627 | 15,950,455 | 599.652s | 272.1s |
| Mean | | | 1.06975394 | 1.07907919 | 1.06654079 | 15,950,548 | 599.610s | 317.0s |
| Std | | | 0.00111113 | 0.00126979 | 0.00122292 | 381 | 0.038s | 42.5s |
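
As a quick cross-check of the Mean and Std rows, the following sketch recomputes the post-TTT BPB summary from the three per-seed values in the table, and compares the largest artifact against the 16,000,000-byte cap. It assumes "Std" means the sample standard deviation (n - 1 denominator), as the "Sample std" label in the key results suggests; the variable names are illustrative only and are not taken from the submission.

```python
from statistics import mean, stdev

# Per-seed values copied from the results table.
post_ttt_bpb = {42: 1.06642781, 314: 1.06537828, 1234: 1.06781627}
artifact_bytes = {42: 15_950_222, 314: 15_950_966, 1234: 15_950_455}

# Decimal byte cap, as stated in the compliance notes.
ARTIFACT_CAP_BYTES = 16_000_000

bpb_mean = mean(post_ttt_bpb.values())
bpb_std = stdev(post_ttt_bpb.values())  # sample std, n - 1 denominator
max_artifact = max(artifact_bytes.values())

print(f"post-TTT BPB mean: {bpb_mean:.8f}")
print(f"post-TTT BPB sample std: {bpb_std:.8f}")
print(f"max artifact: {max_artifact:,} bytes, "
      f"headroom: {ARTIFACT_CAP_BYTES - max_artifact:,} bytes")
```

The same pattern applies to the other columns of the table.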

Compliance notes:

  • Includes train_seed42.log, train_seed314.log, train_seed1234.log.
  • Artifact stays under the decimal 16,000,000-byte cap.
  • Training and evaluation stay under the 600s budgets on 8xH100 SXM.
  • TTT remains score-before-update.
  • CaseOps BPB uses the original UTF-8 byte sidecar.
  • SmearGate is BOS-masked in both normal and TTT forward paths to prevent cross-document residual carry into a new document.
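
To illustrate the BOS masking in the last note, here is a minimal sketch in plain Python. This PR description does not spell out SmearGate's exact form, so the sketch assumes a simple gated blend of the previous position into the current one, `y[t] = x[t] + gate[t] * x[t - 1]`. The function name, the `BOS_ID` value, and the list-based layout are all hypothetical and are not the submission's code. The only point shown is that the carry is forced to zero wherever the current token is BOS, so nothing from the previous document leaks into the first position of the next one.

```python
from typing import List, Sequence

BOS_ID = 0  # hypothetical BOS token id; the real id depends on the tokenizer


def smear_with_bos_mask(
    token_ids: Sequence[int],
    x: Sequence[Sequence[float]],
    gate: Sequence[float],
) -> List[List[float]]:
    """Blend each position with its predecessor, except at BOS positions.

    Assumed form (an illustration, not the PR's implementation):
        y[t] = x[t] + gate[t] * x[t - 1]
    The carry is zeroed at t == 0 and wherever token_ids[t] == BOS_ID, so a
    new document never receives a contribution from the previous document.
    """
    out: List[List[float]] = []
    for t, (tok, vec) in enumerate(zip(token_ids, x)):
        masked = t == 0 or tok == BOS_ID
        carry = 0.0 if masked else gate[t]
        prev = x[t - 1] if t > 0 else vec  # unused when carry == 0.0
        out.append([v + carry * p for v, p in zip(vec, prev)])
    return out


# Two packed documents: [BOS, 5, 7] and [BOS, 9], with 1-dim vectors.
ids = [BOS_ID, 5, 7, BOS_ID, 9]
x = [[1.0], [2.0], [3.0], [4.0], [5.0]]
gate = [0.5] * len(ids)
y = smear_with_bos_mask(ids, x, gate)
print(y)  # position 3 (the second BOS) is left unchanged at [4.0]
```

Under this assumption the same mask would need to be applied in both the normal forward path and the TTT forward path, which is what the note above states the submission does.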

This is intended as a clean/reproducible submission candidate and explicitly addresses the document-boundary SmearGate issue discussed in the #1797 audit thread.

ZanePeycke changed the title from "Add Autozany BOS-masked SmearGate record" to "Add BOS-masked SmearGate record" on Apr 27, 2026