
Add LR0.85 prefix2750 legal TTT record #2047

Open

ZanePeycke wants to merge 1 commit into openai:main from ZanePeycke:codex/final-day-lrmult085-prefix2750

Conversation


@ZanePeycke commented Apr 30, 2026

Record submission: AutoZany LR0.85 + prefix2750 legal phased TTT

This PR adds one new 10min/16MB record folder:

records/track_10min_16mb/2026-04-30_AutoZany_LR085_Prefix2750_LegalTTT_1.05908/

Result

3-seed mean val_bpb: 1.05907559
Population std: 0.00041335
Mean val_loss: 2.31764997
Hardware: 8x H100 SXM
Track: 10min_16mb

| Seed | Stop step | Train ms | Pre-quant BPB | Quant no-TTT BPB | Final BPB  | Eval ms | Artifact bytes |
| ---- | --------- | -------- | ------------- | ---------------- | ---------- | ------- | -------------- |
| 42   | 4889      | 596,017  | 1.06193713    | 1.07022525       | 1.05849788 | 473,829 | 15,976,870     |
| 0    | 4888      | 596,020  | 1.06282344    | 1.07113858       | 1.05928718 | 464,754 | 15,980,787     |
| 1234 | 4906      | 596,115  | 1.06264173    | 1.07122159       | 1.05944171 | 468,273 | 15,984,508     |
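
The headline numbers follow directly from the per-seed final BPBs in this table; a quick sanity check in plain Python (no dependencies):

```python
# Verify the reported 3-seed mean and population std from the final BPBs.
final_bpbs = [1.05849788, 1.05928718, 1.05944171]  # seeds 42, 0, 1234

mean = sum(final_bpbs) / len(final_bpbs)
# Population std (divide by N, not N-1), matching "Population std" above.
pop_std = (sum((x - mean) ** 2 for x in final_bpbs) / len(final_bpbs)) ** 0.5

print(f"mean val_bpb: {mean:.8f}")       # 1.05907559
print(f"population std: {pop_std:.8f}")  # 0.00041335
```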

What changed

This is a conservative final-day variant on the public PR #1953 / PR #1945 lineage. It keeps the same legal score-first phased TTT path and changes the final TTT evaluation neighborhood:

TTT_LOCAL_LR_MULT=0.85
PHASED_TTT_PREFIX_DOCS=2750
PHASED_TTT_NUM_PHASES=3
EVAL_SEQ_LEN=2560
TTT_EVAL_SEQ_LEN=2560
TTT_MASK=no_qv
TTT_Q_LORA=0
TTT_V_LORA=0
QK_GAIN_INIT=5.25
ASYM_LOGIT_RESCALE=1
AWQ_LITE_ENABLED=1
COMPRESSOR=pergroup
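
For orientation, these knobs are plain environment variables. A minimal parsing sketch follows; the helper names are assumptions and the fallback values are simply this record's settings, not the actual defaults in the submitted train_gpt.py, which remains authoritative:

```python
import os

# Hypothetical parsing sketch only; the real train_gpt.py from the
# PR #1953 stack defines the true names, defaults, and semantics.
def env_float(name: str, default: float) -> float:
    return float(os.environ.get(name, default))

def env_int(name: str, default: int) -> int:
    return int(os.environ.get(name, default))

TTT_LOCAL_LR_MULT = env_float("TTT_LOCAL_LR_MULT", 0.85)
PHASED_TTT_PREFIX_DOCS = env_int("PHASED_TTT_PREFIX_DOCS", 2750)
PHASED_TTT_NUM_PHASES = env_int("PHASED_TTT_NUM_PHASES", 3)
TTT_EVAL_SEQ_LEN = env_int("TTT_EVAL_SEQ_LEN", 2560)
TTT_MASK = os.environ.get("TTT_MASK", "no_qv")
```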

The submitted train_gpt.py is the PR #1953 stack source used for the verified runs. The final BPBs above come from TTT_EVAL_ONLY=1 re-evaluations of the saved artifacts with PHASED_TTT_PREFIX_DOCS=2750.

Compliance checklist

  • Adds only one new folder under records/track_10min_16mb/.
  • Includes README.md.
  • Includes submission.json with author, GitHub ID, score, seeds, and metadata.
  • Includes train_gpt.py that runs from inside the record folder.
  • Includes full training logs for seeds 42, 0, and 1234.
  • Includes final prefix2750 TTT eval logs for seeds 42, 0, and 1234.
  • Training is strictly under 600 s for all seeds. Max observed: 596,115 ms.
  • Evaluation is under 600 s for all seeds. Max observed: 473,829 ms.
  • Artifact is under the decimal 16 MB cap (16,000,000 bytes) for all seeds. Max observed: 15,984,508 bytes.
  • Full fixed validation set with CaseOps byte sidecar.
  • Score-first phased TTT only: validation chunks are scored before adaptation updates (see the sketch after this list).
  • No n-gram cache, PPM mixture, SLOT, validation pretraining, pre-quant TTT, validation lookahead, external data, or network access.
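
To make the score-first constraint concrete, here is a minimal sketch of the legality rule. This is a paraphrase, not the submitted implementation; the names model, chunks, score_bits, and adapt are hypothetical. The point is that each chunk contributes to the score with the weights as they stood before that chunk is ever used for adaptation:

```python
# Hypothetical sketch of score-first TTT (not the actual train_gpt.py code):
# every chunk is scored BEFORE the model adapts on it, so no chunk's own
# bytes can leak into its own score.
def score_first_ttt(model, chunks, adapt):
    total_bits, total_bytes = 0.0, 0
    for chunk in chunks:
        total_bits += model.score_bits(chunk)  # score with current weights
        adapt(model, chunk)                    # only then update on the chunk
        total_bytes += len(chunk.raw_bytes)
    return total_bits / total_bytes            # bits per byte (bpb)
```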

Files included

  • README.md: method summary, results table, compliance notes, reproduction command, lineage.
  • submission.json: structured metadata and per-seed results.
  • train_gpt.py: executable training/eval script.
  • train_seed42.log, train_seed0.log, train_seed1234.log: full train + quantization logs.
  • ttt_prefix2750_seed42.log, ttt_prefix2750_seed0.log, ttt_prefix2750_seed1234.log: final TTT_EVAL_ONLY=1 prefix2750 eval logs.

Reproduction

Run the script once per seed with the config shown in the record README. To reproduce the final reported score from a saved artifact, rerun with:

TTT_EVAL_ONLY=1 \
PHASED_TTT_PREFIX_DOCS=2750 \
TTT_LOCAL_LR_MULT=0.85 \
TTT_MASK=no_qv \
TTT_Q_LORA=0 \
TTT_V_LORA=0 \
TTT_EVAL_SEQ_LEN=2560 \
TTT_CHUNK_SIZE=48 \
COMPRESSOR=pergroup \
torchrun --standalone --nproc_per_node=8 train_gpt.py

Lineage

Built on the public PR #1953 / PR #1945 / PR #1855 lineage: AWQ-lite, Asymmetric Logit Rescale, CaseOps tokenizer, SparseAttnGate, SmearGate, LQER, QK gain, and legal score-first phased TTT. This PR contributes the final-day TTT_LOCAL_LR_MULT=0.85 + PHASED_TTT_PREFIX_DOCS=2750 legal eval selection and 3-seed verification under the hard time and artifact limits.

@ZanePeycke changed the title from "Add AutoZany LR0.85 prefix2750 legal TTT record" to "Add LR0.85 prefix2750 legal TTT record" on May 1, 2026.
