Non-record: PR1953 K+O-only TTT + QK_GAIN_INIT=5.35 #2119

Open
dexhunter wants to merge 3 commits into openai:main from dexhunter:non-record-pr1953-ko-qk535

@dexhunter

Non-record submission documenting a clean PR #1953-family follow-up.

This is not a record claim. Seed 42 was promising, but the 3-seed mean did not justify a record submission.

Results:

  • seed 42: 1.05767136 val_bpb, 15,978,954 bytes, 434.877s TTT eval
  • seed 314: 1.05854316 val_bpb, 15,983,413 bytes, 390.025s TTT eval
  • seed 7: 1.05951958 val_bpb, 15,978,698 bytes, 391.064s TTT eval
  • 3-seed mean: 1.05857803 val_bpb, sample std 0.00092460
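
The 3-seed summary can be rechecked from the per-seed numbers above. This is a minimal sketch, not part of the submission: the variable names are illustrative, and it assumes "sample std" means the n - 1 (Bessel-corrected) estimator.

```python
from statistics import mean, stdev

# val_bpb per seed, copied from the results list above.
val_bpb = {42: 1.05767136, 314: 1.05854316, 7: 1.05951958}

scores = list(val_bpb.values())
seed_mean = mean(scores)
# Assumption: "sample std" uses the n - 1 denominator, which is what stdev() computes.
seed_std = stdev(scores)

print(f"3-seed mean: {seed_mean:.8f}")
print(f"sample std:  {seed_std:.8f}")
```

Under that assumption, the printed values should agree with the reported mean and sample std to eight decimal places.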

Mechanism summary:

Verification run locally:

  • uv run python -m json.tool submission.json
  • uv run python -m py_compile train_gpt.py run.py
  • tools/verify_rules.py <seed log> -s train_gpt.py --strict, run once per seed log (seeds 42, 314, and 7)

Strict verifier notes: the script audit passed with 17 pass / 0 fail / 0 warn for each seed. The log audit passed the cap checks and the score-first TTT check. It raised only the expected legacy-log warnings: the old log format has no VERIFY_JSON blocks, and validation split coverage cannot be proven directly from it.

Refs #1017 and #677.

@andrewbaggio1

cool work

@simon-marcus

Nice, dex. You've consistently produced really high quality material through the whole competition.
