openai · yangguohao · Apr 12, 2026 · Apr 14, 2026 · Apr 16, 2026 · Apr 17, 2026
diff --git a/records/track_non_record_16mb/2026-04-21_RandomLinearMaps/README.md b/records/track_non_record_16mb/2026-04-21_RandomLinearMaps/README.md
@@ -0,0 +1,34 @@
+# Non-Record Run: RandomLinearMaps (Random Subspace Optimization)
+
+This folder contains a non-record experiment snapshot. This README documents only information that can be verified directly from the files in this directory.
+
+## Hardware and Precision
+
+- Device: `4x Quadro RTX 8000`
+- GPU architecture: `SM 7.5` (Turing)
+- Training precision: `float32`
+- Reason: FP32 is used for compatibility and stability on this GPU architecture.
+
+## Files in This Directory
+
+- `train_gpt.py`: training script with RSOAdamW
+- `train_log.txt`: full run output (script echo, config, training progress, and final metrics)
+- `train.sh`: example launch command
+
+## Training Configuration (from `train_log.txt`)
+
+- `world_size=4`, `grad_accum_steps=2`
+- `train_batch_tokens=524288`, `train_seq_len=1024`
+- `iterations=2000`, `warmup_steps=0`, `max_wallclock_seconds=0.0` (no wallclock early stop)
+- Model parameters: `20,893,768`
+- Tokenizer: `fineweb_1024_bpe.model`
+
+## Key Results (this run)
+
+- Final validation at `step 2000/2000`: `val_loss=2.5683`, `val_bpb=1.5211`
+- Quantized round-trip eval: `final_int8_zlib_roundtrip_exact val_bpb=1.53014584`
+- Peak memory allocated: `38824 MiB`
+- Submission size:
+  - `Total submission size = 68323975 bytes`
+  - `Total submission size int8+zlib = 12358817 bytes`
+
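The configuration and result figures in the README hunk above can be cross-checked with a little arithmetic. The sketch below is only a sanity check under two assumptions that the README does not confirm: that `train_batch_tokens` is the global token count per optimizer step, split evenly across ranks and accumulation steps, and that `val_loss` is a mean cross-entropy in nats per token. The function names are hypothetical and do not come from `train_gpt.py`.

```python
import math


def micro_batch_layout(train_batch_tokens, world_size, grad_accum_steps, seq_len):
    """Tokens and sequences per rank per micro-step (assumes an even split)."""
    tokens_per_micro = train_batch_tokens // (world_size * grad_accum_steps)
    return tokens_per_micro, tokens_per_micro // seq_len


def implied_bytes_per_token(val_loss_nats, val_bpb):
    """Bytes per token implied by bpb = (loss / ln 2) / bytes_per_token."""
    bits_per_token = val_loss_nats / math.log(2)
    return bits_per_token / val_bpb


# Values copied from the README in this diff.
tokens, seqs = micro_batch_layout(524288, 4, 2, 1024)
print(tokens, seqs)  # 65536 64

ratio = implied_bytes_per_token(2.5683, 1.5211)
print(round(ratio, 2))  # 2.44

# Gap between the quantized round-trip bpb and the pre-quantization bpb.
quant_gap = 1.53014584 - 1.5211
print(round(quant_gap, 4))  # 0.009
```

If those assumptions hold, each rank processes 64 sequences of 1024 tokens per micro-step, the tokenizer averages roughly 2.44 bytes per token on the validation data, and int8+zlib quantization costs about 0.009 bpb.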
diff --git a/records/track_non_record_16mb/2026-04-21_RandomLinearMaps/submission.json b/records/track_non_record_16mb/2026-04-21_RandomLinearMaps/submission.json
@@ -0,0 +1,10 @@
+{
+  "track": "non_record_16mb",
+  "date": "2026-04-21",
+  "name": "RandomLinearMaps (RSO + Muon)",
+  "author": "GUOHAO YANG",
+  "github_id": "yangguohao",
+  "val_bpb": 1.5211,
+  "val_loss": 2.5683,
+  "bytes_total": 12409638
+}
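The fields in `submission.json` can be compared mechanically against the figures quoted in the README. The following is a minimal sketch, not part of the submission: `check_submission` is a hypothetical helper, and the 16,000,000-byte cap is an assumption about what the `16mb` track name means rather than something stated in this diff.

```python
import json

# Figures copied from the README hunk in this diff.
README_VALUES = {"val_bpb": 1.5211, "val_loss": 2.5683, "int8_zlib_bytes": 12358817}

# The relevant fields of submission.json, copied from this diff.
SUBMISSION_JSON = """
{
  "track": "non_record_16mb",
  "val_bpb": 1.5211,
  "val_loss": 2.5683,
  "bytes_total": 12409638
}
"""

# Assumption: the track caps the artifact at 16,000,000 bytes (decimal MB).
ASSUMED_LIMIT_BYTES = 16_000_000


def check_submission(sub, readme, limit_bytes):
    """Compare submission.json fields with the README figures."""
    return {
        "val_bpb_matches": sub["val_bpb"] == readme["val_bpb"],
        "val_loss_matches": sub["val_loss"] == readme["val_loss"],
        "bytes_delta": sub["bytes_total"] - readme["int8_zlib_bytes"],
        "under_limit": sub["bytes_total"] <= limit_bytes,
    }


report = check_submission(json.loads(SUBMISSION_JSON), README_VALUES, ASSUMED_LIMIT_BYTES)
print(report)
```

With the values in this diff, both metrics match, and `bytes_total` exceeds the README's `int8+zlib` size by 50821 bytes. Nothing in the diff explains that difference.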
diff --git a/records/track_non_record_16mb/2026-04-21_RandomLinearMaps/train.sh b/records/track_non_record_16mb/2026-04-21_RandomLinearMaps/train.sh
@@ -0,0 +1 @@
+ITERATIONS=2000 TRAIN_LOG_EVERY=20 WARMUP_STEPS=0 MAX_WALLCLOCK_SECONDS=0 torchrun --standalone --nproc_per_node=4 train_gpt.py
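The launch line passes its settings as environment variables placed before `torchrun`. As an illustration of how a training script might consume such overrides, here is a minimal sketch. The helper `env_override` and the default values are hypothetical and are not taken from `train_gpt.py`; only the variable names and values come from `train.sh`.

```python
import os


def env_override(name, default, cast, env=None):
    """Return cast(env[name]) when the variable is set, else the default."""
    env = os.environ if env is None else env
    raw = env.get(name)
    return default if raw is None else cast(raw)


# The variables set by train.sh, as the process would see them (all strings).
launch_env = {
    "ITERATIONS": "2000",
    "TRAIN_LOG_EVERY": "20",
    "WARMUP_STEPS": "0",
    "MAX_WALLCLOCK_SECONDS": "0",
}

# Defaults here are placeholders chosen for illustration only.
config = {
    "iterations": env_override("ITERATIONS", 1000, int, launch_env),
    "train_log_every": env_override("TRAIN_LOG_EVERY", 100, int, launch_env),
    "warmup_steps": env_override("WARMUP_STEPS", 10, int, launch_env),
    "max_wallclock_seconds": env_override("MAX_WALLCLOCK_SECONDS", 600.0, float, launch_env),
}
print(config)
```

Parsing `MAX_WALLCLOCK_SECONDS=0` as a float yields `0.0`, which is consistent with the `max_wallclock_seconds=0.0` entry the README reports from `train_log.txt`.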