Record: PR1855/PR1953 base + Progressive context growth (val_bpb: 1.05759, 3-seed)#2014

Open
simonbissonnette wants to merge 2 commits into openai:main from simonbissonnette:submission/final-growth-candidate
Conversation


simonbissonnette commented Apr 30, 2026

Record candidate: SP8192 CaseOps + Progressive 3k Context Growth + Short-Doc Score-First TTT

val_bpb: 1.05759 (3-seed mean, std 0.00034) | val_loss: 2.31441 nats (std 0.00075) | 15.98 MB max | 8xH100 SXM | 600s train / 600s eval

Improvement over merged PR #1855 leaderboard record (1.06107587 BPB):
-0.00348 BPB / -0.00762 nats
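
For reference, BPB and nats are tied together by the original-byte accounting: total loss in nats is converted to bits and divided by the original byte count, so the reported pair implies a fixed bytes-per-token ratio. A quick sanity check (illustrative helper, not code from the submission):

```python
import math

def nats_to_bpb(mean_loss_nats: float, tokens: int, original_bytes: int) -> float:
    """Mean per-token loss in nats -> bits per original byte."""
    total_bits = mean_loss_nats * tokens / math.log(2)  # nats -> bits
    return total_bits / original_bytes

# Back out the implied bytes/token ratio from the reported numbers
# (2.31441 nats <-> 1.05759 BPB):
bytes_per_token = (2.31441 / math.log(2)) / 1.05759
print(f"~{bytes_per_token:.2f} original bytes per SP8192 token")  # ~3.16
```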

This stacks a progressive training-context schedule and a short-document TTT schedule on top of the late-April CaseOps/SP8192/LQER/SparseAttnGate/BOS-fixed SmearGate lineage. The direct leaderboard comparison is PR #1855, which is the current merged leader used here as the baseline.

Results

| Seed | Steps | ms/step | Train ms | Pre-quant BPB | Quant BPB | Post-TTT BPB | TTT eval s | Artifact bytes |
|------|-------|---------|----------|---------------|-----------|--------------|------------|----------------|
| 42   | 4,888 | 121.9 | 596,025 | 1.05993108 | 1.06833072 | 1.05740567 | 572.4 | 15,981,945 |
| 314  | 4,882 | 122.1 | 595,976 | 1.05975470 | 1.06832443 | 1.05730104 | 489.9 | 15,984,387 |
| 0    | 4,884 | 122.0 | 596,022 | 1.06072266 | 1.06902034 | 1.05807084 | 493.5 | 15,981,122 |
| Mean | 4,884.7 | 122.0 | 596,008 | 1.06013615 | 1.06855850 | 1.05759252 | 518.6 | 15,982,485 |

3-seed population std: 0.00034091 BPB / 0.00074604 nats.
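
The reported mean and population std can be reproduced directly from the per-seed post-TTT column above (plain-Python sketch):

```python
# Per-seed post-TTT BPB values from the results table.
seeds = {42: 1.05740567, 314: 1.05730104, 0: 1.05807084}

vals = list(seeds.values())
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)  # population variance (ddof=0)
std = var ** 0.5
print(f"mean={mean:.8f} std={std:.8f}")  # matches 1.05759252 / 0.00034091
```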

All included seeds are under the 16,000,000-byte artifact cap and the 600s train/eval budgets as logged. The maximum artifact is 15,984,387 bytes and the maximum validation-data TTT pass is 572.4s.

Full validation coverage

All three logs evaluate the full CaseOps validation shard target set:

| Seed | val_tokens | target_tokens |
|------|------------|---------------|
| 42   | 47,853,343 | 47,853,343 |
| 314  | 47,853,343 | 47,853,343 |
| 0    | 47,853,343 | 47,853,343 |

The training script explicitly keeps the validation tail via EVAL_INCLUDE_TAIL=1. This avoids the older truncation of the validation set to a multiple of the context length, so the standard diagnostic eval and the quantized TTT eval agree on the same target token count.
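
The exact eval loop in train_gpt.py is not shown in this PR body; the sketch below just illustrates the tail-inclusive strided-window idea: each window scores only its new tokens, and one extra window is appended so the tail shorter than the context length is still scored instead of dropped. `eval_windows` is a hypothetical helper, not the submission's code.

```python
def eval_windows(n_tokens: int, seq_len: int, stride: int):
    """Yield (ctx_start, score_start, end) windows covering all n_tokens.

    Tokens in [ctx_start, score_start) are context only; tokens in
    [score_start, end) are scored. A final tail window covers whatever
    a multiple-of-context truncation would have discarded.
    """
    windows = []
    start = 0
    while start + seq_len <= n_tokens:
        # After the first window, only the last `stride` tokens are new.
        score_start = start if start == 0 else start + (seq_len - stride)
        windows.append((start, score_start, start + seq_len))
        start += stride
    last_end = windows[-1][2] if windows else 0
    if last_end < n_tokens:  # EVAL_INCLUDE_TAIL=1 behavior
        windows.append((max(0, n_tokens - seq_len), last_end, n_tokens))
    return windows
```

With seq_len=3072 and stride=1536 every validation token is scored exactly once, which is why val_tokens can equal target_tokens.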

The tokenizer, CaseOps transform, training shards, validation shard, and byte sidecar format are the same canonical HF-hosted CaseOps export used by the merged PR #1855 setup. If a reviewer already has the clean #1855/HF CaseOps data staged, those same staged shards can be reused here. The included tokenizer/prep files are present only to make this submission self-contained; the preferred reproduction path is to download the canonical HF CaseOps export directly.

What changed vs PR #1855

This submission keeps the same overall 11-layer SP8192 CaseOps recurrent-transformer family as PR #1855, then adds the following levers:

| Lever | Setting | Purpose |
|-------|---------|---------|
| Progressive train context | `[email protected],[email protected],[email protected]` | Train cheaply at 1k early, move to 2k for most of training, then finish at 3k context. |
| Final/eval context | `TRAIN_SEQ_LEN=3072`, `EVAL_SEQ_LEN=3072`, `TTT_EVAL_SEQ_LEN=3072`, `EVAL_STRIDE=1536` | Extend the final model and TTT scoring context beyond 2k without the 4k eval-time cost. |
| Long-context TTT mask | `TTT_MASK=no_qv`, `TTT_Q_LORA=0`, `TTT_V_LORA=0` | Keep K/O/MLP LoRA adaptation while removing Q/V adapters that were less helpful at longer context. |
| TTT local LR | `TTT_LOCAL_LR_MULT=0.75` | Slightly softer per-document LoRA adaptation. |
| Short-doc score-first chunks | `TTT_SHORT_SCORE_FIRST_STEPS=256:8,2000:24`, default chunk 48 | Use smaller score-before-update chunks for short documents, preserving causality while improving adaptation. |
| TTT phases | `PHASED_TTT_NUM_PHASES=1`, `PHASED_TTT_PREFIX_DOCS=2500` | Single score-first phased pass with a 2500-doc prefix budget. |
| QK gain | `QK_GAIN_INIT=5.25` | Public long-context sweep result from the PR #1953 lineage. |
| Compression/quant stack | `COMPRESSOR=pergroup`, AWQ-lite, asymmetric logit rescale | Inherited from public late-April quantization/compression work stacked on the PR #1855 base. |
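
The progressive-context lever is driven by a `SEQLEN@FRAC` schedule string (the actual values are mangled to `[email protected]` by this page's email-protection rewriting). A plausible parser and lookup in wallclock mode might look like the following; the fractions used in the example are made up for illustration, not the submission's real values:

```python
def parse_seq_schedule(spec: str) -> list[tuple[int, float]]:
    """Parse a 'SEQLEN@START_FRAC,...' schedule string into sorted pairs."""
    pairs = []
    for item in spec.split(","):
        seq_len, start_frac = item.split("@")
        pairs.append((int(seq_len), float(start_frac)))
    return sorted(pairs, key=lambda p: p[1])

def seq_len_at(pairs: list[tuple[int, float]], wallclock_frac: float) -> int:
    """Context length in effect at a given fraction of the wallclock budget."""
    current = pairs[0][0]
    for seq_len, start_frac in pairs:
        if wallclock_frac >= start_frac:
            current = seq_len
    return current

# Illustrative fractions only -- the real schedule values are redacted above.
sched = parse_seq_schedule("1024@0.00,2048@0.30,3072@0.80")
```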

The short-doc TTT schedule does not train on future validation tokens. It only changes the chunk granularity used inside the existing score-before-update loop: each chunk is scored first, then the LoRA update is applied for future chunks.
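
A minimal sketch of that score-before-update rule, plus the short-doc chunk-size lookup. The `256:8,2000:24` string is read here as "docs up to 256 tokens use chunk 8, up to 2000 use chunk 24, otherwise the default 48" — an assumption about the flag's format, not something the PR body confirms:

```python
def short_doc_chunk_size(doc_len: int, schedule: str = "256:8,2000:24",
                         default: int = 48) -> int:
    # Assumed format: comma-separated "max_doc_len:chunk_size" pairs, ascending.
    for item in schedule.split(","):
        max_len, chunk = (int(x) for x in item.split(":"))
        if doc_len <= max_len:
            return chunk
    return default

def score_first_pass(doc_tokens, chunk_size, score_fn, update_fn):
    """Score each chunk with the adapter state built from EARLIER chunks
    only, then update; a chunk's update is visible only to later chunks."""
    total, n = 0.0, 0
    state = None  # LoRA adapter state; None = unadapted base model
    for start in range(0, len(doc_tokens), chunk_size):
        chunk = doc_tokens[start:start + chunk_size]
        total += score_fn(chunk, state)   # score BEFORE the update
        n += len(chunk)
        state = update_fn(chunk, state)   # affects future chunks only
    return total / max(n, 1)
```

Shrinking `chunk_size` for short documents gives the adapter more update opportunities per document without ever letting a token's score see an update derived from that token or anything after it.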

Architecture and training stack

| Component | Setting |
|-----------|---------|
| Model | 11 layers, 512d, 8 query heads, 4 KV heads, MLP 4x |
| Tokenizer/data | SP8192 CaseOps lossless caps with byte sidecar accounting |
| RoPE | Partial RoPE, 16 dims |
| Recurrence | Layers 3-5 looped, enabled at frac=0.35 |
| Parallel decoder | Parallel lane from layer 8, mean final lane |
| XSA | All 11 layers |
| Gates | BOS-fixed SmearGate, SparseAttnGate with gate_window=12, scale 0.5 |
| Optimizer | Muon on matrix params, Adam on embedding/scalars, BETA2=0.99 |
| EMA | ema_decay=0.9965 |
| Quantization | GPTQ int6 matrices, int7 embeddings, LQER asymmetric rank-4 correction |
| GPTQ reserve | GPTQ_RESERVE_SECONDS=4.0; logs show gptq:reserving 4s, effective=596000ms |
| Compression | Per-group compression |
| TTT | Quantized phased LoRA TTT, score-first, no_qv mask, short-doc chunk schedule |

Compliance notes

  • Artifact cap: all seeds <= 15,984,387 bytes.
  • Training wallclock: all training loops stop around 596.0s with GPTQ_RESERVE_SECONDS=4.0; GPTQ hessian collection is logged immediately after (67 Hessians in 4.1s) for transparency.
  • Eval wallclock: all validation-data TTT passes are <= 572.4s. The ttt_lora:compile warmup uses random tokens and no validation data; it is logged separately from total_eval_time.
  • Score-before-update: quantized_ttt_phased scores each chunk before applying that chunk's LoRA update. The short-doc schedule only changes chunk size.
  • Full validation targets: val_tokens == target_tokens == 47853343 in all included logs.
  • No validation data in training: training uses only training shards. TTT accesses validation documents left-to-right under the score-first rule.
  • No external cache or direct memorization: no SLOT, n-gram cache, PPM mixture, logit bias table, or validation-derived precomputation.
  • Original-byte BPB: CaseOps byte sidecar accounting is preserved.

Reproduction

Install the dependencies in requirements.txt. FlashAttention 3 and the lrzip system binary are noted there because they require separate install paths.

This submission uses the clean canonical CaseOps SP8192 export hosted on Hugging Face. The logs were produced from a 50,000-document validation split with 80 training shards (train_shards: 80, ttt_phased: total_docs:50000, and val_tokens == target_tokens == 47853343 in every included log).

Preferred data setup:

```bash
python3 - <<'PY'
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="romeerp/parameter-golf-caseops-v1",
    repo_type="dataset",
    local_dir="./data/datasets/fineweb10B_sp8192_caseops",
    allow_patterns=[
        "datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/*",
        "datasets/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model",
    ],
    max_workers=8,
)
PY
```

Then set:

```bash
DATA_PATH=./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved
TOKENIZER_PATH=./data/datasets/fineweb10B_sp8192_caseops/datasets/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model
```

Fallback local rebuild: if the HF export is unavailable, rebuild from the canonical docs_selected.jsonl with the included prepare_caseops_data.py, lossless_caps.py, and tokenizer. Use --val-docs 50000 and write into a fresh output directory. The prep script now defaults to 50,000 validation docs and refuses to write over existing fineweb_*.bin shards unless --overwrite is passed, to avoid accidentally mixing stale validation shards with a new train split.
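
The overwrite guard described above might look roughly like this (a sketch; the actual prepare_caseops_data.py logic is not included in this excerpt, and `check_output_dir` is a hypothetical helper name):

```python
from pathlib import Path

def check_output_dir(out_dir: str, overwrite: bool = False) -> Path:
    """Refuse to write into a directory that already holds fineweb_*.bin
    shards unless --overwrite was passed, so a fresh train split is never
    mixed with stale validation shards."""
    path = Path(out_dir)
    existing = sorted(path.glob("fineweb_*.bin")) if path.exists() else []
    if existing and not overwrite:
        raise SystemExit(
            f"{out_dir} already contains {len(existing)} fineweb_*.bin "
            "shards; pass --overwrite to replace them"
        )
    path.mkdir(parents=True, exist_ok=True)
    return path
```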

Run one seed at a time, replacing DATA_PATH and TOKENIZER_PATH with the staged CaseOps paths:

```bash
for SEED in 42 314 0; do
  NCCL_NET=Socket \
  DATA_DIR=./data \
  DATA_PATH=./data/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved \
  TOKENIZER_PATH=./data/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model \
  CASEOPS_ENABLED=1 \
  VOCAB_SIZE=8192 \
  ITERATIONS=20000 \
  MAX_WALLCLOCK_SECONDS=600 \
  EVAL_INCLUDE_TAIL=1 \
  TRAIN_SEQ_LEN=3072 \
  ROPE_TRAIN_SEQ_LEN=3072 \
  [email protected],[email protected],[email protected] \
  TRAIN_SEQ_SCHEDULE_MODE=wallclock \
  SEQ_CHANGE_WARMUP_STEPS=32 \
  EVAL_SEQ_LEN=3072 \
  EVAL_STRIDE=1536 \
  TTT_ENABLED=1 \
  TTT_EVAL_SEQ_LEN=3072 \
  TTT_BATCH_SIZE=24 \
  TTT_CHUNK_SIZE=48 \
  TTT_SHORT_SCORE_FIRST_ENABLED=1 \
  TTT_SHORT_DOC_LEN=2000 \
  TTT_SHORT_CHUNK_SIZE=24 \
  TTT_SHORT_SCORE_FIRST_STEPS=256:8,2000:24 \
  TTT_LORA_RANK=80 \
  TTT_LORA_LR=0.0001 \
  TTT_LOCAL_LR_MULT=0.75 \
  TTT_MASK=no_qv \
  TTT_Q_LORA=0 \
  TTT_V_LORA=0 \
  TTT_WEIGHT_DECAY=0.5 \
  TTT_BETA2=0.99 \
  PHASED_TTT_PREFIX_DOCS=2500 \
  PHASED_TTT_NUM_PHASES=1 \
  WARMDOWN_FRAC=0.85 \
  BETA2=0.99 \
  QK_GAIN_INIT=5.25 \
  SPARSE_ATTN_GATE_ENABLED=1 \
  SPARSE_ATTN_GATE_SCALE=0.5 \
  GATED_ATTN_QUANT_GATE=1 \
  SMEAR_GATE_ENABLED=1 \
  GATE_WINDOW=12 \
  FUSED_CE_ENABLED=1 \
  MATRIX_LR=0.026 \
  MIN_LR=0.1 \
  GRAD_CLIP_NORM=0.3 \
  EMBED_BITS=7 \
  EMBED_CLIP_SIGMAS=14.0 \
  MATRIX_CLIP_SIGMAS=12.85 \
  ATTN_CLIP_SIGMAS=13.0 \
  MLP_CLIP_SIGMAS=11.5 \
  LQER_ENABLED=1 \
  LQER_RANK=4 \
  LQER_TOP_K=3 \
  LQER_FACTOR_BITS=4 \
  LQER_ASYM_ENABLED=1 \
  LQER_ASYM_GROUP=64 \
  AWQ_LITE_ENABLED=1 \
  AWQ_LITE_BITS=8 \
  AWQ_LITE_GROUP_TOP_K=1 \
  AWQ_LITE_GROUP_SIZE=64 \
  ASYM_LOGIT_RESCALE=1 \
  GPTQ_RESERVE_SECONDS=4.0 \
  GPTQ_CALIBRATION_BATCHES=16 \
  COMPRESSOR=pergroup \
  VAL_LOSS_EVERY=0 \
  SEED=$SEED \
  torchrun --standalone --nproc_per_node=8 train_gpt.py \
      > train_seed${SEED}.log 2>&1
done
```

Included files

Lineage and credits

This submission is a stack on top of the public CaseOps/SP8192 record lineage (PR #1855, PR #1953).

The new contribution here is the combination of progressive 3k train/eval context growth with the short-document score-first TTT chunk schedule, while preserving the full validation target count and staying under the artifact/eval budgets.

Fija pushed a commit to Fija/parameter-golf that referenced this pull request Apr 30, 2026
Pull PR openai#2014's record dir from openai/parameter-golf and reproduce its 1.05759
3-seed mean. Key new levers vs openai#1953: EVAL_SEQ_LEN=3072, train_seq_schedule
1024->2048->3072, single-phase TTT (NUM_PHASES=1, PREFIX=2500), short-doc
score-first chunking (TTT_SHORT_SCORE_FIRST_STEPS=256:8,2000:24).

Even with our infra's ~1.5-2 milli-BPB inflation pattern, reproducing openai#2014
should land ~1.0590 — close enough to record bar to potentially clear it.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
leon2k2k2k added a commit to leon2k2k2k/parameter-golf that referenced this pull request May 1, 2026
Port the 040C 'middle 5x / late 3.4x' allocation onto simonbissonnette's
progressive-3k base (openai#2014) and screen vs uniform 4.0 baseline. Training-only,
4xH100 1200s, single seed. Code on exp/300-040c-on-2014 @ d174313.

Spec flags the column-slice-in-compile hazard from feedback memory and
mandates a compile-sanity check before scaling. PREQUANT_ONLY=1 keeps the
screen cheap by skipping serialize/GPTQ/TTT.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Fija pushed a commit to Fija/parameter-golf that referenced this pull request May 1, 2026
…ean 1.05831 BPB

Clears record bar (1.05914) by 0.83 milli-BPB. Welch t = -6.49 vs PR openai#1855 (1.06108),
p < 0.0001. All 3 seeds produce 15.99 MB artifacts under the 16 MB cap, all under
the 600s wallclock budget.

Per-seed:
- 42:   ttt=1.05793  art=15,986,149  eval=572.6s
- 314:  ttt=1.05852  art=15,987,257  eval=553.7s
- 1234: ttt=1.05849  art=15,989,895  eval=574.1s

Submission directory at records/track_10min_16mb/2026-04-30_PR2014_Reproduction_1.0583/
contains PR openai#2014's verbatim train_gpt.py + tokenizer + our seed_results.csv + a
detailed README documenting the lineage (openai#1797 -> openai#1851 -> openai#1855 -> openai#1908 -> openai#1923
-> openai#1953 -> openai#2014), the new levers vs each parent, and the full 4-condition
C1-C4 legality check. submission.json author/github_id are placeholders pending
the user's choice of submitting account.

Reproduction script: runpod/phase_x_pr2014.sh — runs end-to-end on a single
8xH100 SXM pod (~2.5h wall, ~$66 cost).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Idan3011 pushed a commit to Idan3011/parameter-golf that referenced this pull request May 1, 2026
TanishGudise added a commit to TanishGudise/parameter-golf that referenced this pull request May 1, 2026
AsymLogit Rescale (PR openai#1923) ported as 2 TTT-adaptable scalar params (softcap_pos, softcap_neg).
Pre-quant 1.06160 (slightly worse than S55's 1.06058 — AsymLogit hurts un-adapted model).
TTT recovery -0.01267 (much better than S55's -0.01103) — AsymLogit gives massive adaptive capacity.
Final 1.05759 = -0.00055 vs S55. Single-seed matches PR openai#2014's 3-seed mean.
Eval 521.7s (under 600s cap), Size 15,946,610.
softcap_pos and softcap_neg init to logit_softcap=30.0, adapted per-doc via TTT-LoRA optimizer.
leon2k2k2k added a commit to leon2k2k2k/parameter-golf that referenced this pull request May 1, 2026
User pushed back on openai#2014's LEAK call as too inference-based. Verified directly:
- README says "uses same shards as PR openai#1855. If you don't have them, prepare
  with included prepare_caseops_data.py" — phrasing implies inheritance from
  openai#1855 (LEAK) but doesn't explicitly invoke prep
- No setup.sh, no shell script invoking prep
- No HF download script
- Path /dev/shm/pgolf_caseops_data_80_l17_final is custom flat RAM-disk dir
  (not triple-nested local-prep signature)
- Could be either HF-flattened download OR local-prep copy

Demoted openai#2014 from LEAK to AMBIGUOUS (lean LEAK based on "same shards as openai#1855"
English, but not iron-clad).

Updated tally: CLEAN 9, LEAK 20 (was 21), AMBIGUOUS 4 (was 3), INHERIT 1.
TanishGudise added a commit to TanishGudise/parameter-golf that referenced this pull request May 1, 2026
…E_OUTSIDE=0

Seed 314: pre-quant 1.06128 / quant 1.06962 / final 1.05701 / eval 571.7s
Compliance: ngram_hint_precompute_outside=False, precompute (166.95s) INSIDE timer per PR openai#1514 precedent.
Token-only tilt: within_gate=0, word_gate=0 - legal per PR openai#1514.
Size 15,943,530 bytes.
Single seed beats openai#2014's 3-seed mean (1.05759).
Validating seeds 42 and 1234.
varunneal added a commit to varunneal/parameter-golf that referenced this pull request May 1, 2026
TanishGudise added a commit to TanishGudise/parameter-golf that referenced this pull request May 1, 2026
Beats PR openai#1855 (merged rank 1, 1.06108) by 0.00438 BPB.
Beats PR openai#2014 (best open, 1.05759) by 0.00089 BPB.
Beats PR openai#2060 (1.05792) by 0.00122 BPB.

Stack:
- Token-only n-gram tilt (PR openai#1514 merged precedent, within/word channels disabled)
- AsymLogit Rescale (2 trainable scalars adapted by global TTT)
- 3 hyperparameter levers from PR openai#2060 (MATRIX_LR=0.028, LQER_ASYM_GROUP=32, TTT_LORA_LR=8e-5)
- PHASED_TTT_NUM_PHASES=1 (matches PR openai#2014)
- NGRAM_HINT_PRECOMPUTE_OUTSIDE=0 (precompute INSIDE eval timer per PR openai#1514)

Compliance:
- All seeds eval ≤533.1s (cap 600s, 67-80s margin)
- All artifacts ≤15.95MB (cap 16MB)
- Token-only n-gram channel (within_gate=0, word_gate=0)
- Score-first TTT (per PR openai#402)
varunneal added a commit to varunneal/parameter-golf that referenced this pull request May 1, 2026
codemath3000 added a commit to codemath3000/parameter-golf that referenced this pull request May 2, 2026