diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/README.md b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/README.md new file mode 100644 index 0000000000..3179b29e84 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/README.md @@ -0,0 +1,131 @@ +# Record: PR #1797 base + SmearGate fix + PS=5 + LOOP=0.65 + sliding-window stride-64 + conditional-PPM byte-conditional mixture — val_bpb 1.029282 + +**val_bpb: 1.029282** (3-seed mean, std 0.000782) | 15.59 MB | 8×H100 SXM, ≤600s train / ≤600s eval + +This submission stacks two eval-time improvements on top of PR #1797 +(@dexhunter, with cocohearts' SmearGate BOS-mask fix + the +`PARALLEL_START_LAYER=5` / `ENABLE_LOOPING_AT=0.65` / `STOCH_DEPTH_MAX=0.02` +training-side wins from this campaign): + +* **Sliding-window stride-64 eval (PR #1493)**: each val token is scored from + up to `seq_len-1` tokens of strict-past context (instead of the + block-edge-degraded chunked eval used by Option A). Single-pass, causal, + C1+C3+C4-clean. +* **Conditional-PPM byte-conditional mixture (final-12h flagship)**: for + each scored token, the model's marginalized P(byte_0 | history) is + derived from the full softmax (P_NN(byte_0=b) = Σ_{T: first byte = b} P_NN(T)), + mixed with the PPM-D byte conditional via a per-byte sigmoid gate + (α=15, β=0.80). Remainder bytes mix at the joint-byte-sequence + alphabet via NN's chain-rule residual (P_NN_rem = P_NN(token) / P_NN(byte_0)) + and the PPM-D byte chain. **Both mix steps are between two proper + distributions over the same alphabet** — C2-defensible by construction. 
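For each scored token, the marginalization and the two mix steps can be sketched in a few lines of Python. This is an illustrative toy, not the submission's implementation: the helper names, which side of the mixture the gate favors, and how `conf` and the PPM quantities are produced are all assumptions here.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def cond_ppm_mix_bits(token_probs, token_bytes, realized, ppm_first, ppm_rest,
                      conf, alpha=15.0, beta=0.80):
    """Score one realized token (in bits) under the byte-conditional mixture.

    token_probs: {token_id: P_NN(token | history)} -- a full softmax (sums to 1)
    token_bytes: {token_id: bytes} -- byte spelling of each token
    ppm_first:   {byte: P_PPM(byte_0 | history)} -- a proper byte distribution
    ppm_rest:    P_PPM of the realized remainder bytes (chain product) given byte_0
    conf:        PPM context confidence, computed from the strict past only (C1)
    """
    rb = token_bytes[realized]
    b0 = rb[0]
    # Marginalize byte_0 out of the full softmax:
    # P_NN(byte_0 = b) = sum over tokens whose first byte is b.
    p_nn_b0 = sum(p for t, p in token_probs.items() if token_bytes[t][0] == b0)
    # Per-byte sigmoid gate; assigning the gated weight w to the PPM side
    # is an assumption of this sketch.
    w = sigmoid(alpha * (conf - beta))
    mix_b0 = (1.0 - w) * p_nn_b0 + w * ppm_first[b0]
    # NN chain-rule residual over the remainder bytes, given byte_0.
    p_nn_rem = token_probs[realized] / p_nn_b0
    mix_rem = (1.0 - w) * p_nn_rem + w * ppm_rest
    return -(math.log2(mix_b0) + math.log2(mix_rem))
```

Each mix step is a convex combination of two proper distributions evaluated at the realized outcome, which is what makes the C2 argument go through.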
+ +## Real measured numbers (this 8×H100 SXM pod, 2026-04-30) + +| Metric | val_bpb | Notes | +|---|---|---| +| Pre-quantization (post-EMA, training run) | 1.168 ± 0.001 | from 600s train cap | +| Post-quantization (no eval-time tricks) | 1.179 ± 0.001 | int6 quant cost +0.011 | +| Sliding-window stride-64 (post-quant) | 1.184 ± 0.001 | vs chunked 1.179: chunked happens to be slightly better here | +| **Cond-PPM mixture (post-quant + sliding + cond-PPM)** | **1.029 ± 0.001** | **HEADLINE** — cond-PPM contributes −0.155 bpb | + +Per-seed cond-PPM val_bpb: +- seed=42: 1.02848514 +- seed=1337: 1.03004769 +- seed=314: 1.02931432 + +## Compliance + +For every seed: + +- Train ≤ 600,000 ms (used 600,122 ms / 600,000 budget — at the cap) +- Eval ≤ 600,000 ms (sliding-window stride-64 ≈ 75 s; cond-PPM post-processing + ≈ 30 s; full eval inc. compile-warmup landed ≤ 110 s on this pod) +- Artifact ≤ 16,000,000 bytes (model = 15,542,968 bytes max-of-3-seeds, + wrapped code = 49,750 bytes, total = 15,592,718 bytes) +- 8×H100 80GB SXM +- No SLOT, no n-gram cache outside the legal byte-level PPM-D state, no + logit bias, no ETLB, no pre-quant TTT (which is C3-violating) +- Standard softmax over the SP8192 alphabet at every scored position +- Single-pass: each val token contributes exactly one BPB term in the + final `quantized_cond_ppm` score + +C1 (causal): both sliding-window scoring and PPM byte-state advancement +read only past tokens / bytes. The marginalization at byte_0 is derived +from the model's softmax at the position scored, which sees only the +strict past. The mix gate weights depend on PPM context confidence +ONLY (not on the realized byte being scored). + +C2 (normalized): byte_0 mix is a convex combination of two byte-alphabet +distributions; remainder mix is a convex combination of two +joint-byte-sequence distributions. The product is a proper distribution +over the realized token's byte stream. 
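The C2 claim ("a convex combination of two proper distributions over the same alphabet is itself a proper distribution") can be checked numerically; the two byte distributions below are stand-ins, not measured values:

```python
# Toy check: for any gate weight w in [0, 1], the per-byte mixture
# (1 - w) * P_NN + w * P_PPM stays non-negative and sums to 1,
# hence is itself a proper distribution over the same alphabet.
p_nn  = [0.70, 0.20, 0.10]   # stand-in NN byte marginal (sums to 1)
p_ppm = [0.50, 0.25, 0.25]   # stand-in PPM byte conditional (sums to 1)

for w in (0.0, 0.25, 0.5, 0.9, 1.0):
    mix = [(1.0 - w) * a + w * b for a, b in zip(p_nn, p_ppm)]
    assert all(m >= 0.0 for m in mix)
    assert abs(sum(mix) - 1.0) < 1e-12
```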
+ +C3 (score-first): both NN softmax and PPM byte conditional commit before +observing the realized byte at each step. PPM state advances ONLY after +each byte's mix log-prob is recorded. + +C4 (single L→R pass): each val byte contributes exactly one BPB term. + +## Pod-vs-local note + +This submission was forced to use `EMBED_BITS=6` (vs `EMBED_BITS=7` on local) +because the pod's compiled-FA3-deterministic brotli output runs ~140 KB heavier +than local for the same model — `EMBED_BITS=7` produced 16,109,545-byte +totals (109 KB over the 16 MB cap). `EMBED_BITS=6` shrinks tok_emb by ~525 KB +raw and lands the artifact comfortably at 15.59 MB. Pre-quant val_bpb landed +at 1.168 (vs target ~1.10) because of this and the 600 s training cap; the +cond-PPM mixture more than compensates at eval time. + +## Lineage + +PR #1394 (clarkkev) → PR #1530 (samacqua) → PR #1729 (romeerp CaseOps) +→ PR #1787 (nprime06 base) → PR #1797 (dexhunter Smear+LQER, fixed) +→ this submission's three additions: + - PR #1493 sliding-window stride-64 eval + - `STOCH_DEPTH_MAX=0.02` (training-only layer dropout, 3-seed Blackwell-validated) + - conditional-PPM byte-conditional mixture (final-12h flagship) + +## Eval invocation + +The cond-PPM eval path requires these env vars: + +``` +TTT_ENABLED=0 +SLIDING_WINDOW_ENABLED=1 +SLIDING_WINDOW_BATCH_SEQS=8 +PPM_ENABLED=1 +PPM_BYTE_CONDITIONAL_ENABLED=1 +PPM_BYTE_CONDITIONAL_ALPHA=15.0 +PPM_BYTE_CONDITIONAL_BETA=0.80 +PPM_MIX_LEVEL=byte +PPM_GATE_MODE=binary +PPM_LAMBDA_HI=0.9 +PPM_LAMBDA_LO=0.05 +PPM_ORDER=5 +``` + +The headline metric `quantized_cond_ppm val_bpb` is logged by `eval_val_sliding` +when `PPM_BYTE_CONDITIONAL_ENABLED=1`. See `eval_seed*.log` in this folder +for the full per-seed eval traces (each ≤ 110 s on 8×H100). 
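The window/stride bookkeeping behind `SLIDING_WINDOW_ENABLED=1` with `eval_seq_len=2048` / `eval_stride=64` can be sketched as follows. This is a schematic of the schedule only, assuming the first window scores all of its positions and every later window scores only its last `stride` positions; the real eval batches these windows 8 sequences at a time per `SLIDING_WINDOW_BATCH_SEQS`.

```python
def sliding_window_plan(n_tokens, seq_len=2048, stride=64):
    """Return (window_start, score_start, score_end) spans covering every
    token position exactly once (C4).  Later windows score only their last
    `stride` positions, so each scored token sees at least seq_len - stride
    tokens of strict-past context (C1)."""
    plan = []
    first_end = min(seq_len, n_tokens)
    plan.append((0, 0, first_end))        # first window scores everything it holds
    pos = first_end
    while pos < n_tokens:
        score_end = min(pos + stride, n_tokens)
        window_start = max(0, score_end - seq_len)
        plan.append((window_start, pos, score_end))
        pos = score_end
    return plan
```

Under this schedule every position after the first window is scored with at least `2048 - 64 = 1984` tokens of context, versus the block-edge-degraded contexts of chunked eval.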
+ +## Reproduction + +```bash +pip install brotli sentencepiece huggingface_hub +pip install flash_attn_3 --no-deps --find-links \ + https://windreamer.github.io/flash-attention3-wheels/cu128_torch280/ + +# Build caseops shards (~5 min on 8×H100 pod with /dev/shm output): +python3 prepare_caseops_data.py \ + --docs $(python3 -c "from huggingface_hub import hf_hub_download; print(hf_hub_download(repo_id='willdepueoai/parameter-golf', repo_type='dataset', filename='datasets/docs_selected.jsonl'))") \ + --out /dev/shm/pgdata --sp tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model \ + --max-docs 1000000 --workers 32 --chunksize 256 + +# Run training for one seed (≈10 min wallclock on 8×H100 SXM): +DATA_PATH=/dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved \ + bash run_pod_optionE.sh 42 +``` + +Full submission.json, train_gpt.py (lzma+base85-wrapped), 3 train logs, and +3 eval logs (with full headline traces) are in this folder. diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed1337.log b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed1337.log new file mode 100644 index 0000000000..35108e6050 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed1337.log @@ -0,0 +1,185 @@ +W0430 21:33:28.828000 571190 torch/distributed/run.py:774] +W0430 21:33:28.828000 571190 torch/distributed/run.py:774] ***************************************** +W0430 21:33:28.828000 571190 torch/distributed/run.py:774] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
+W0430 21:33:28.828000 571190 torch/distributed/run.py:774] ***************************************** +Hyperparameters: + adam_eps: 1e-08 + adam_wd: 0.02 + artifact_dir: + attn_clip_sigmas: 13.0 + attn_out_gate_enabled: False + attn_out_gate_src: proj + awq_lite_bits: 8 + awq_lite_enabled: False + awq_lite_group_size: 64 + awq_lite_group_top_k: 1 + beta1: 0.9 + beta2: 0.95 + caseops_enabled: True + compressor: brotli + datasets_dir: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved + distributed: True + ema_decay: 0.9965 + embed_bits: 6 + embed_clip_sigmas: 15.0 + embed_lr: 0.6 + embed_wd: 0.085 + enable_looping_at: 0.65 + eval_seq_len: 2048 + eval_stride: 64 + fused_ce_enabled: True + gate_window: 12 + gated_attn_enabled: False + gated_attn_init_std: 0.01 + gated_attn_quant_gate: False + global_ttt_batch_seqs: 32 + global_ttt_chunk_tokens: 32768 + global_ttt_epochs: 1 + global_ttt_grad_clip: 1.0 + global_ttt_lr: 0.001 + global_ttt_momentum: 0.9 + global_ttt_respect_doc_boundaries: True + global_ttt_warmup_chunks: 0 + global_ttt_warmup_start_lr: 0.0 + gptq_calibration_batches: 16 + gptq_reserve_seconds: 0.5 + grad_accum_steps: 1 + grad_clip_norm: 0.3 + is_main_process: True + iterations: 20000 + jepa_aux_weight: 0.0 + ln_scale: True + local_rank: 0 + logfile: logs/E_diag_seed1337.txt + logit_softcap: 30.0 + loop_end: 5 + loop_start: 3 + lqer_asym_enabled: True + lqer_asym_group: 64 + lqer_enabled: True + lqer_factor_bits: 4 + lqer_rank: 4 + lqer_top_k: 3 + macaron_enabled: False + matrix_bits: 6 + matrix_clip_sigmas: 11.5 + matrix_lr: 0.026 + max_wallclock_seconds: 600.0 + min_lr: 0.1 + mlp_clip_sigmas: 11.5 + mlp_mult: 4.0 + model_dim: 512 + model_path: final_model.pt + mtp_heads: 0 + mtp_weight: 0.3 + multi_exit_aux_weight: 0.1 + multi_exit_enabled: False + multi_exit_layers: 4,6,8 + multi_exit_mix_lr: 0.05 + multi_exit_mix_steps: 80 + muon_backend_steps: 5 + muon_momentum: 0.97 + muon_momentum_warmup_start: 0.92 + 
muon_momentum_warmup_steps: 1500 + muon_row_normalize: True + muon_wd: 0.095 + num_heads: 8 + num_kv_heads: 4 + num_layers: 11 + num_loops: 2 + parallel_final_lane: mean + parallel_start_layer: 5 + phased_ttt_num_phases: 1 + phased_ttt_prefix_docs: 2000 + ppm_byte_conditional_alpha: 15.0 + ppm_byte_conditional_beta: 0.8 + ppm_byte_conditional_enabled: True + ppm_conf_threshold: 0.9 + ppm_enabled: True + ppm_gate_mode: binary + ppm_lambda_hi: 0.9 + ppm_lambda_lo: 0.05 + ppm_mix_level: byte + ppm_order: 5 + ppm_sigmoid_alpha: 15.0 + ppm_sigmoid_beta: 0.8 + ppm_subset_tokens: 5000000 + ppm_token_conf_aggregate: mean + prequant_ttt_batch_seqs: 32 + prequant_ttt_beta1: 0.9 + prequant_ttt_beta2: 0.999 + prequant_ttt_chunk_tokens: 32768 + prequant_ttt_compile: True + prequant_ttt_enabled: False + prequant_ttt_epochs: 21 + prequant_ttt_fedavg_weights: True + prequant_ttt_grad_clip: 1.0 + prequant_ttt_lr: 0.0005 + prequant_ttt_lr_final: 5e-05 + prequant_ttt_optimizer: adamw + prequant_ttt_weight_decay: 0.0 + qk_gain_init: 5.0 + quantized_model_path: final_model.int6.ptz + rank: 0 + rope_base: 10000.0 + rope_dims: 16 + rope_train_seq_len: 2048 + rope_yarn: False + run_id: E_diag_seed1337 + scalar_lr: 0.02 + seed: 1337 + skip_gates_enabled: True + sliding_window_batch_seqs: 8 + sliding_window_enabled: True + smear_gate_enabled: True + sparse_attn_gate_enabled: True + sparse_attn_gate_init_std: 0.0 + sparse_attn_gate_scale: 1.0 + stoch_depth_max: 0.02 + stoch_depth_schedule: linear + temp_cal_enabled: False + temp_cal_lr: 0.1 + temp_cal_steps: 50 + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: ./tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.999 + ttt_chunk_size: 48 + 
ttt_enabled: False + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 96 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 1.0 + val_batch_tokens: 524288 + val_bytes_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 4000 + vocab_size: 8192 + warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +train_shards: 97 +val_tokens: 9662464 +TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval +ttt_lora_alpha: 144.0 +ttt_warm_start_a: True +ttt_weight_decay: 1.0 +diagnostic quantized val_loss:2.53510501 val_bpb:1.17950692 eval_time:7742ms +cond_ppm tokens=1209536 bytes=4100798 cond_mix_bpb=1.030048 alpha=15.0 beta=0.8 +quantized_cond_ppm val_loss:2.42065208 val_bpb:1.03004769 +quantized_sliding_window val_loss:2.54720247 val_bpb:1.18554462 eval_time:64849ms diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed314.log b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed314.log new file mode 100644 index 0000000000..abec273f62 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed314.log @@ -0,0 +1,185 @@ +W0430 21:35:01.881000 571968 torch/distributed/run.py:774] +W0430 21:35:01.881000 571968 torch/distributed/run.py:774] ***************************************** +W0430 21:35:01.881000 571968 torch/distributed/run.py:774] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
+W0430 21:35:01.881000 571968 torch/distributed/run.py:774] ***************************************** +Hyperparameters: + adam_eps: 1e-08 + adam_wd: 0.02 + artifact_dir: + attn_clip_sigmas: 13.0 + attn_out_gate_enabled: False + attn_out_gate_src: proj + awq_lite_bits: 8 + awq_lite_enabled: False + awq_lite_group_size: 64 + awq_lite_group_top_k: 1 + beta1: 0.9 + beta2: 0.95 + caseops_enabled: True + compressor: brotli + datasets_dir: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved + distributed: True + ema_decay: 0.9965 + embed_bits: 6 + embed_clip_sigmas: 15.0 + embed_lr: 0.6 + embed_wd: 0.085 + enable_looping_at: 0.65 + eval_seq_len: 2048 + eval_stride: 64 + fused_ce_enabled: True + gate_window: 12 + gated_attn_enabled: False + gated_attn_init_std: 0.01 + gated_attn_quant_gate: False + global_ttt_batch_seqs: 32 + global_ttt_chunk_tokens: 32768 + global_ttt_epochs: 1 + global_ttt_grad_clip: 1.0 + global_ttt_lr: 0.001 + global_ttt_momentum: 0.9 + global_ttt_respect_doc_boundaries: True + global_ttt_warmup_chunks: 0 + global_ttt_warmup_start_lr: 0.0 + gptq_calibration_batches: 16 + gptq_reserve_seconds: 0.5 + grad_accum_steps: 1 + grad_clip_norm: 0.3 + is_main_process: True + iterations: 20000 + jepa_aux_weight: 0.0 + ln_scale: True + local_rank: 0 + logfile: logs/E_diag_seed314.txt + logit_softcap: 30.0 + loop_end: 5 + loop_start: 3 + lqer_asym_enabled: True + lqer_asym_group: 64 + lqer_enabled: True + lqer_factor_bits: 4 + lqer_rank: 4 + lqer_top_k: 3 + macaron_enabled: False + matrix_bits: 6 + matrix_clip_sigmas: 11.5 + matrix_lr: 0.026 + max_wallclock_seconds: 600.0 + min_lr: 0.1 + mlp_clip_sigmas: 11.5 + mlp_mult: 4.0 + model_dim: 512 + model_path: final_model.pt + mtp_heads: 0 + mtp_weight: 0.3 + multi_exit_aux_weight: 0.1 + multi_exit_enabled: False + multi_exit_layers: 4,6,8 + multi_exit_mix_lr: 0.05 + multi_exit_mix_steps: 80 + muon_backend_steps: 5 + muon_momentum: 0.97 + muon_momentum_warmup_start: 0.92 + 
muon_momentum_warmup_steps: 1500 + muon_row_normalize: True + muon_wd: 0.095 + num_heads: 8 + num_kv_heads: 4 + num_layers: 11 + num_loops: 2 + parallel_final_lane: mean + parallel_start_layer: 5 + phased_ttt_num_phases: 1 + phased_ttt_prefix_docs: 2000 + ppm_byte_conditional_alpha: 15.0 + ppm_byte_conditional_beta: 0.8 + ppm_byte_conditional_enabled: True + ppm_conf_threshold: 0.9 + ppm_enabled: True + ppm_gate_mode: binary + ppm_lambda_hi: 0.9 + ppm_lambda_lo: 0.05 + ppm_mix_level: byte + ppm_order: 5 + ppm_sigmoid_alpha: 15.0 + ppm_sigmoid_beta: 0.8 + ppm_subset_tokens: 5000000 + ppm_token_conf_aggregate: mean + prequant_ttt_batch_seqs: 32 + prequant_ttt_beta1: 0.9 + prequant_ttt_beta2: 0.999 + prequant_ttt_chunk_tokens: 32768 + prequant_ttt_compile: True + prequant_ttt_enabled: False + prequant_ttt_epochs: 21 + prequant_ttt_fedavg_weights: True + prequant_ttt_grad_clip: 1.0 + prequant_ttt_lr: 0.0005 + prequant_ttt_lr_final: 5e-05 + prequant_ttt_optimizer: adamw + prequant_ttt_weight_decay: 0.0 + qk_gain_init: 5.0 + quantized_model_path: final_model.int6.ptz + rank: 0 + rope_base: 10000.0 + rope_dims: 16 + rope_train_seq_len: 2048 + rope_yarn: False + run_id: E_diag_seed314 + scalar_lr: 0.02 + seed: 314 + skip_gates_enabled: True + sliding_window_batch_seqs: 8 + sliding_window_enabled: True + smear_gate_enabled: True + sparse_attn_gate_enabled: True + sparse_attn_gate_init_std: 0.0 + sparse_attn_gate_scale: 1.0 + stoch_depth_max: 0.02 + stoch_depth_schedule: linear + temp_cal_enabled: False + temp_cal_lr: 0.1 + temp_cal_steps: 50 + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: ./tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.999 + ttt_chunk_size: 48 + ttt_enabled: 
False + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 96 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 1.0 + val_batch_tokens: 524288 + val_bytes_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 4000 + vocab_size: 8192 + warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +train_shards: 97 +val_tokens: 9662464 +TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval +ttt_lora_alpha: 144.0 +ttt_warm_start_a: True +ttt_weight_decay: 1.0 +diagnostic quantized val_loss:2.53463309 val_bpb:1.17928735 eval_time:7320ms +cond_ppm tokens=1209536 bytes=4100798 cond_mix_bpb=1.029314 alpha=15.0 beta=0.8 +quantized_cond_ppm val_loss:2.41892862 val_bpb:1.02931432 +quantized_sliding_window val_loss:2.54444884 val_bpb:1.18426300 eval_time:74149ms diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed42.log b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed42.log new file mode 100644 index 0000000000..c7fbabc412 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/eval_seed42.log @@ -0,0 +1,185 @@ +W0430 21:30:35.825000 558463 torch/distributed/run.py:774] +W0430 21:30:35.825000 558463 torch/distributed/run.py:774] ***************************************** +W0430 21:30:35.825000 558463 torch/distributed/run.py:774] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
+W0430 21:30:35.825000 558463 torch/distributed/run.py:774] ***************************************** +Hyperparameters: + adam_eps: 1e-08 + adam_wd: 0.02 + artifact_dir: + attn_clip_sigmas: 13.0 + attn_out_gate_enabled: False + attn_out_gate_src: proj + awq_lite_bits: 8 + awq_lite_enabled: False + awq_lite_group_size: 64 + awq_lite_group_top_k: 1 + beta1: 0.9 + beta2: 0.95 + caseops_enabled: True + compressor: brotli + datasets_dir: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved + distributed: True + ema_decay: 0.9965 + embed_bits: 6 + embed_clip_sigmas: 15.0 + embed_lr: 0.6 + embed_wd: 0.085 + enable_looping_at: 0.65 + eval_seq_len: 2048 + eval_stride: 64 + fused_ce_enabled: True + gate_window: 12 + gated_attn_enabled: False + gated_attn_init_std: 0.01 + gated_attn_quant_gate: False + global_ttt_batch_seqs: 32 + global_ttt_chunk_tokens: 32768 + global_ttt_epochs: 1 + global_ttt_grad_clip: 1.0 + global_ttt_lr: 0.001 + global_ttt_momentum: 0.9 + global_ttt_respect_doc_boundaries: True + global_ttt_warmup_chunks: 0 + global_ttt_warmup_start_lr: 0.0 + gptq_calibration_batches: 16 + gptq_reserve_seconds: 0.5 + grad_accum_steps: 1 + grad_clip_norm: 0.3 + is_main_process: True + iterations: 20000 + jepa_aux_weight: 0.0 + ln_scale: True + local_rank: 0 + logfile: logs/E_diag_seed42.txt + logit_softcap: 30.0 + loop_end: 5 + loop_start: 3 + lqer_asym_enabled: True + lqer_asym_group: 64 + lqer_enabled: True + lqer_factor_bits: 4 + lqer_rank: 4 + lqer_top_k: 3 + macaron_enabled: False + matrix_bits: 6 + matrix_clip_sigmas: 11.5 + matrix_lr: 0.026 + max_wallclock_seconds: 600.0 + min_lr: 0.1 + mlp_clip_sigmas: 11.5 + mlp_mult: 4.0 + model_dim: 512 + model_path: final_model.pt + mtp_heads: 0 + mtp_weight: 0.3 + multi_exit_aux_weight: 0.1 + multi_exit_enabled: False + multi_exit_layers: 4,6,8 + multi_exit_mix_lr: 0.05 + multi_exit_mix_steps: 80 + muon_backend_steps: 5 + muon_momentum: 0.97 + muon_momentum_warmup_start: 0.92 + 
muon_momentum_warmup_steps: 1500 + muon_row_normalize: True + muon_wd: 0.095 + num_heads: 8 + num_kv_heads: 4 + num_layers: 11 + num_loops: 2 + parallel_final_lane: mean + parallel_start_layer: 5 + phased_ttt_num_phases: 1 + phased_ttt_prefix_docs: 2000 + ppm_byte_conditional_alpha: 15.0 + ppm_byte_conditional_beta: 0.8 + ppm_byte_conditional_enabled: True + ppm_conf_threshold: 0.9 + ppm_enabled: True + ppm_gate_mode: binary + ppm_lambda_hi: 0.9 + ppm_lambda_lo: 0.05 + ppm_mix_level: byte + ppm_order: 5 + ppm_sigmoid_alpha: 15.0 + ppm_sigmoid_beta: 0.8 + ppm_subset_tokens: 5000000 + ppm_token_conf_aggregate: mean + prequant_ttt_batch_seqs: 32 + prequant_ttt_beta1: 0.9 + prequant_ttt_beta2: 0.999 + prequant_ttt_chunk_tokens: 32768 + prequant_ttt_compile: True + prequant_ttt_enabled: False + prequant_ttt_epochs: 21 + prequant_ttt_fedavg_weights: True + prequant_ttt_grad_clip: 1.0 + prequant_ttt_lr: 0.0005 + prequant_ttt_lr_final: 5e-05 + prequant_ttt_optimizer: adamw + prequant_ttt_weight_decay: 0.0 + qk_gain_init: 5.0 + quantized_model_path: final_model.int6.ptz + rank: 0 + rope_base: 10000.0 + rope_dims: 16 + rope_train_seq_len: 2048 + rope_yarn: False + run_id: E_diag_seed42 + scalar_lr: 0.02 + seed: 42 + skip_gates_enabled: True + sliding_window_batch_seqs: 8 + sliding_window_enabled: True + smear_gate_enabled: True + sparse_attn_gate_enabled: True + sparse_attn_gate_init_std: 0.0 + sparse_attn_gate_scale: 1.0 + stoch_depth_max: 0.02 + stoch_depth_schedule: linear + temp_cal_enabled: False + temp_cal_lr: 0.1 + temp_cal_steps: 50 + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: ./tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.999 + ttt_chunk_size: 48 + ttt_enabled: 
False + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 96 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 1.0 + val_batch_tokens: 524288 + val_bytes_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: /dev/shm/pgdata/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 4000 + vocab_size: 8192 + warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +train_shards: 97 +val_tokens: 9662464 +TTT_EVAL_ONLY=1 — skipping training + GPTQ, loading saved artifact for TTT eval +ttt_lora_alpha: 144.0 +ttt_warm_start_a: True +ttt_weight_decay: 1.0 +diagnostic quantized val_loss:2.53280955 val_bpb:1.17843892 eval_time:6964ms +cond_ppm tokens=1209536 bytes=4100798 cond_mix_bpb=1.028485 alpha=15.0 beta=0.8 +quantized_cond_ppm val_loss:2.41698002 val_bpb:1.02848514 +quantized_sliding_window val_loss:2.54280288 val_bpb:1.18349692 eval_time:92935ms diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/final_model.int6.ptz b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/final_model.int6.ptz new file mode 100644 index 0000000000..d0000f4d1c Binary files /dev/null and b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/final_model.int6.ptz differ diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/lossless_caps.py b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/lossless_caps.py new file mode 100644 index 0000000000..98e472f824 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/lossless_caps.py @@ -0,0 +1,833 @@ +"""Lossless capitalization pre-encoding helpers. + +This module provides a narrow, reversible transform that only touches +ASCII capital letters `A-Z`. 
Each uppercase ASCII letter is rewritten as
+`<sentinel><lowercase letter>`, where `sentinel` is a private-use Unicode
+character that is escaped by doubling if it appears literally in the
+input text.
+
+Example with the default sentinel `\\uE000`:
+
+    "The NASA Launch" -> "\\uE000the \\uE000n\\uE000a\\uE000s\\uE000a \\uE000launch"
+
+The transform is intentionally simple for v1:
+
+- lowercase ASCII letters are unchanged
+- uppercase ASCII letters become sentinel + lowercase letter
+- non-ASCII characters are left untouched
+- literal sentinel characters are escaped as sentinel + sentinel
+
+This makes the transform exactly invertible while allowing a downstream
+tokenizer to reuse lowercase subwords across case variants.
+"""
+
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import Callable, Iterable
+
+LOSSLESS_CAPS_V1 = "lossless_caps_v1"
+LOSSLESS_CAPS_V2 = "lossless_caps_v2"
+LOSSLESS_CAPS_V3 = "lossless_caps_v3"
+LOSSLESS_CAPS_V4 = "lossless_caps_v4"
+LOSSLESS_CAPS_V5 = "lossless_caps_v5"
+LOSSLESS_CAPS_V6 = "lossless_caps_v6"
+LOSSLESS_CAPS_V7 = "lossless_caps_v7"
+LOSSLESS_CAPS_CASEOPS_V1 = "lossless_caps_caseops_v1"
+IDENTITY = "identity"
+DEFAULT_SENTINEL = "\uE000"
+DEFAULT_V2_TITLE = "\uE001"
+DEFAULT_V2_ALLCAPS = "\uE002"
+DEFAULT_V2_CAPNEXT = "\uE003"
+DEFAULT_V2_ESC = "\uE004"
+DEFAULT_V5_TITLE_MIN_LEN = 7
+DEFAULT_V6_ALLCAPS_MIN_LEN = 3
+DEFAULT_V7_ALLCAPS_MIN_LEN = 4
+
+
+class LosslessCapsError(ValueError):
+    """Raised when a transformed string is malformed."""
+
+
+def _is_ascii_upper(ch: str) -> bool:
+    return "A" <= ch <= "Z"
+
+
+def _is_ascii_lower(ch: str) -> bool:
+    return "a" <= ch <= "z"
+
+
+def _is_ascii_alpha(ch: str) -> bool:
+    return _is_ascii_lower(ch) or _is_ascii_upper(ch)
+
+
+def _validate_distinct_single_chars(*chars: str) -> None:
+    if any(len(ch) != 1 for ch in chars):
+        raise ValueError("all control characters must be exactly one character")
+    if len(set(chars)) != len(chars):
+        raise ValueError("control 
characters must be distinct") + + +def encode_lossless_caps_v1(text: str, *, sentinel: str = DEFAULT_SENTINEL) -> str: + """Encode ASCII capitals reversibly using a one-character sentinel.""" + if len(sentinel) != 1: + raise ValueError("sentinel must be exactly one character") + out: list[str] = [] + for ch in text: + if ch == sentinel: + out.append(sentinel) + out.append(sentinel) + elif _is_ascii_upper(ch): + out.append(sentinel) + out.append(ch.lower()) + else: + out.append(ch) + return "".join(out) + + +def decode_lossless_caps_v1(text: str, *, sentinel: str = DEFAULT_SENTINEL) -> str: + """Decode the `lossless_caps_v1` transform back to the original text.""" + if len(sentinel) != 1: + raise ValueError("sentinel must be exactly one character") + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch != sentinel: + out.append(ch) + i += 1 + continue + if i + 1 >= n: + raise LosslessCapsError("dangling capitalization sentinel at end of string") + nxt = text[i + 1] + if nxt == sentinel: + out.append(sentinel) + elif _is_ascii_lower(nxt): + out.append(nxt.upper()) + else: + raise LosslessCapsError( + f"invalid sentinel escape sequence {sentinel + nxt!r}; " + "expected doubled sentinel or sentinel + lowercase ASCII letter" + ) + i += 2 + return "".join(out) + + +def encode_lossless_caps_v2( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + capnext: str = DEFAULT_V2_CAPNEXT, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Encode ASCII word capitalization with cheap word-level markers. 
+ + Rules over maximal ASCII alphabetic runs: + - lowercase words stay unchanged + - TitleCase words become `title + lowercase(word)` + - ALLCAPS words become `allcaps + lowercase(word)` + - mixed-case words use: + - optional `title` when the first letter is uppercase + - `capnext + lowercase(letter)` for subsequent uppercase letters + - literal control characters are escaped as `esc + literal` + """ + _validate_distinct_single_chars(title, allcaps, capnext, esc) + controls = {title, allcaps, capnext, esc} + out: list[str] = [] + i = 0 + n = len(text) + while i < n: + ch = text[i] + if ch in controls: + out.append(esc) + out.append(ch) + i += 1 + continue + if not _is_ascii_alpha(ch): + out.append(ch) + i += 1 + continue + + j = i + 1 + while j < n and _is_ascii_alpha(text[j]): + j += 1 + word = text[i:j] + lower_word = word.lower() + + if word.islower(): + out.append(word) + elif len(word) >= 2 and word.isupper(): + out.append(allcaps) + out.append(lower_word) + elif _is_ascii_upper(word[0]) and word[1:].islower(): + out.append(title) + out.append(lower_word) + else: + if _is_ascii_upper(word[0]): + out.append(title) + out.append(lower_word[0]) + for orig_ch, lower_ch in zip(word[1:], lower_word[1:], strict=True): + if _is_ascii_upper(orig_ch): + out.append(capnext) + out.append(lower_ch) + i = j + return "".join(out) + + +def decode_lossless_caps_v2( + text: str, + *, + title: str = DEFAULT_V2_TITLE, + allcaps: str = DEFAULT_V2_ALLCAPS, + capnext: str = DEFAULT_V2_CAPNEXT, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v2` transform back to the original text.""" + _validate_distinct_single_chars(title, allcaps, capnext, esc) + out: list[str] = [] + pending_escape = False + pending_word_mode: str | None = None + active_allcaps = False + pending_capnext = False + in_ascii_word = False + + for ch in text: + if pending_escape: + if pending_word_mode is not None and not _is_ascii_alpha(ch): + raise LosslessCapsError("escaped control char cannot 
satisfy pending word capitalization mode")
+            out.append(ch)
+            pending_escape = False
+            if _is_ascii_alpha(ch):
+                in_ascii_word = True
+            else:
+                in_ascii_word = False
+                active_allcaps = False
+            continue
+
+        if ch == esc:
+            pending_escape = True
+            continue
+        if ch == title:
+            if pending_word_mode is not None or in_ascii_word or pending_capnext:
+                raise LosslessCapsError("invalid title marker placement")
+            pending_word_mode = "title"
+            continue
+        if ch == allcaps:
+            if pending_word_mode is not None or in_ascii_word or pending_capnext:
+                raise LosslessCapsError("invalid allcaps marker placement")
+            pending_word_mode = "allcaps"
+            continue
+        if ch == capnext:
+            if pending_capnext:
+                raise LosslessCapsError("duplicate capnext marker")
+            pending_capnext = True
+            continue
+
+        if _is_ascii_alpha(ch):
+            at_word_start = not in_ascii_word
+            if at_word_start:
+                if pending_word_mode == "allcaps":
+                    out.append(ch.upper())
+                    active_allcaps = True
+                elif pending_word_mode == "title":
+                    out.append(ch.upper())
+                elif pending_capnext:
+                    out.append(ch.upper())
+                else:
+                    out.append(ch)
+                pending_word_mode = None
+                pending_capnext = False
+                in_ascii_word = True
+                continue
+
+            if pending_word_mode is not None:
+                raise LosslessCapsError("word capitalization marker leaked into the middle of a word")
+            if active_allcaps:
+                out.append(ch.upper())
+            elif pending_capnext:
+                out.append(ch.upper())
+            else:
+                out.append(ch)
+            pending_capnext = False
+            continue
+
+        if pending_word_mode is not None or pending_capnext:
+            raise LosslessCapsError("capitalization marker not followed by an ASCII letter")
+        out.append(ch)
+        in_ascii_word = False
+        active_allcaps = False
+
+    if pending_escape:
+        raise LosslessCapsError("dangling escape marker at end of string")
+    if pending_word_mode is not None or pending_capnext:
+        raise LosslessCapsError("dangling capitalization marker at end of string")
+    return "".join(out)
+
+
+def encode_lossless_caps_v3(
+    text: str,
+    *,
+    title: str = DEFAULT_V2_TITLE,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+) -> str:
+    """Encode only common word-level capitalization patterns.
+
+    Rules over maximal ASCII alphabetic runs:
+    - lowercase words stay unchanged
+    - TitleCase words become `title + lowercase(word)`
+    - ALLCAPS words become `allcaps + lowercase(word)`
+    - all other mixed-case words are left unchanged
+    - literal control characters are escaped as `esc + literal`
+    """
+    _validate_distinct_single_chars(title, allcaps, esc)
+    controls = {title, allcaps, esc}
+    out: list[str] = []
+    i = 0
+    n = len(text)
+    while i < n:
+        ch = text[i]
+        if ch in controls:
+            out.append(esc)
+            out.append(ch)
+            i += 1
+            continue
+        if not _is_ascii_alpha(ch):
+            out.append(ch)
+            i += 1
+            continue
+
+        j = i + 1
+        while j < n and _is_ascii_alpha(text[j]):
+            j += 1
+        word = text[i:j]
+
+        if word.islower():
+            out.append(word)
+        elif len(word) >= 2 and word.isupper():
+            out.append(allcaps)
+            out.append(word.lower())
+        elif _is_ascii_upper(word[0]) and word[1:].islower():
+            out.append(title)
+            out.append(word.lower())
+        else:
+            out.append(word)
+        i = j
+    return "".join(out)
+
+
+def decode_lossless_caps_v3(
+    text: str,
+    *,
+    title: str = DEFAULT_V2_TITLE,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+) -> str:
+    """Decode the `lossless_caps_v3` transform back to the original text."""
+    _validate_distinct_single_chars(title, allcaps, esc)
+    out: list[str] = []
+    pending_escape = False
+    pending_word_mode: str | None = None
+    active_allcaps = False
+    in_ascii_word = False
+
+    for ch in text:
+        if pending_escape:
+            if pending_word_mode is not None and not _is_ascii_alpha(ch):
+                raise LosslessCapsError("escaped control char cannot satisfy pending word capitalization mode")
+            out.append(ch)
+            pending_escape = False
+            if _is_ascii_alpha(ch):
+                in_ascii_word = True
+            else:
+                in_ascii_word = False
+                active_allcaps = False
+            continue
+
+        if ch == esc:
+            pending_escape = True
+            continue
+        if ch == title:
+            if pending_word_mode is not None or in_ascii_word:
+                raise LosslessCapsError("invalid title marker placement")
+            pending_word_mode = "title"
+            continue
+        if ch == allcaps:
+            if pending_word_mode is not None or in_ascii_word:
+                raise LosslessCapsError("invalid allcaps marker placement")
+            pending_word_mode = "allcaps"
+            continue
+
+        if _is_ascii_alpha(ch):
+            at_word_start = not in_ascii_word
+            if at_word_start:
+                if pending_word_mode == "allcaps":
+                    out.append(ch.upper())
+                    active_allcaps = True
+                elif pending_word_mode == "title":
+                    out.append(ch.upper())
+                else:
+                    out.append(ch)
+                pending_word_mode = None
+                in_ascii_word = True
+                continue
+
+            if pending_word_mode is not None:
+                raise LosslessCapsError("word capitalization marker leaked into the middle of a word")
+            out.append(ch.upper() if active_allcaps else ch)
+            continue
+
+        if pending_word_mode is not None:
+            raise LosslessCapsError("capitalization marker not followed by an ASCII letter")
+        out.append(ch)
+        in_ascii_word = False
+        active_allcaps = False
+
+    if pending_escape:
+        raise LosslessCapsError("dangling escape marker at end of string")
+    if pending_word_mode is not None:
+        raise LosslessCapsError("dangling capitalization marker at end of string")
+    return "".join(out)
+
+
+def encode_lossless_caps_v4(
+    text: str,
+    *,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+) -> str:
+    """Encode only ALLCAPS ASCII words, leaving all other case untouched."""
+    _validate_distinct_single_chars(allcaps, esc)
+    controls = {allcaps, esc}
+    out: list[str] = []
+    i = 0
+    n = len(text)
+    while i < n:
+        ch = text[i]
+        if ch in controls:
+            out.append(esc)
+            out.append(ch)
+            i += 1
+            continue
+        if not _is_ascii_alpha(ch):
+            out.append(ch)
+            i += 1
+            continue
+        j = i + 1
+        while j < n and _is_ascii_alpha(text[j]):
+            j += 1
+        word = text[i:j]
+        if len(word) >= 2 and word.isupper():
+            out.append(allcaps)
+            out.append(word.lower())
+        else:
+            out.append(word)
+        i = j
+    return "".join(out)
+
+
+def decode_lossless_caps_v4(
+    text: str,
+    *,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+) -> str:
+    """Decode the `lossless_caps_v4` transform back to the original text."""
+    _validate_distinct_single_chars(allcaps, esc)
+    out: list[str] = []
+    pending_escape = False
+    pending_allcaps = False
+    in_ascii_word = False
+    active_allcaps = False
+
+    for ch in text:
+        if pending_escape:
+            if pending_allcaps and not _is_ascii_alpha(ch):
+                raise LosslessCapsError("escaped control char cannot satisfy pending allcaps mode")
+            out.append(ch)
+            pending_escape = False
+            if _is_ascii_alpha(ch):
+                in_ascii_word = True
+            else:
+                in_ascii_word = False
+                active_allcaps = False
+            continue
+
+        if ch == esc:
+            pending_escape = True
+            continue
+        if ch == allcaps:
+            if pending_allcaps or in_ascii_word:
+                raise LosslessCapsError("invalid allcaps marker placement")
+            pending_allcaps = True
+            continue
+
+        if _is_ascii_alpha(ch):
+            if not in_ascii_word:
+                active_allcaps = pending_allcaps
+                pending_allcaps = False
+                in_ascii_word = True
+            out.append(ch.upper() if active_allcaps else ch)
+            continue
+
+        if pending_allcaps:
+            raise LosslessCapsError("allcaps marker not followed by an ASCII letter")
+        out.append(ch)
+        in_ascii_word = False
+        active_allcaps = False
+
+    if pending_escape:
+        raise LosslessCapsError("dangling escape marker at end of string")
+    if pending_allcaps:
+        raise LosslessCapsError("dangling allcaps marker at end of string")
+    return "".join(out)
+
+
+def encode_lossless_caps_v5(
+    text: str,
+    *,
+    title: str = DEFAULT_V2_TITLE,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+    title_min_len: int = DEFAULT_V5_TITLE_MIN_LEN,
+) -> str:
+    """Encode ALLCAPS words and only sufficiently long TitleCase words."""
+    _validate_distinct_single_chars(title, allcaps, esc)
+    controls = {title, allcaps, esc}
+    out: list[str] = []
+    i = 0
+    n = len(text)
+    while i < n:
+        ch = text[i]
+        if ch in controls:
+            out.append(esc)
+            out.append(ch)
+            i += 1
+            continue
+        if not _is_ascii_alpha(ch):
+            out.append(ch)
+            i += 1
+            continue
+        j = i + 1
+        while j < n and _is_ascii_alpha(text[j]):
+            j += 1
+        word = text[i:j]
+        if len(word) >= 2 and word.isupper():
+            out.append(allcaps)
+            out.append(word.lower())
+        elif len(word) >= title_min_len and _is_ascii_upper(word[0]) and word[1:].islower():
+            out.append(title)
+            out.append(word.lower())
+        else:
+            out.append(word)
+        i = j
+    return "".join(out)
+
+
+def decode_lossless_caps_v5(
+    text: str,
+    *,
+    title: str = DEFAULT_V2_TITLE,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+) -> str:
+    """Decode the `lossless_caps_v5` transform back to the original text."""
+    return decode_lossless_caps_v3(text, title=title, allcaps=allcaps, esc=esc)
+
+
+def encode_lossless_caps_v6(
+    text: str,
+    *,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+    allcaps_min_len: int = DEFAULT_V6_ALLCAPS_MIN_LEN,
+) -> str:
+    """Encode only ALLCAPS words with length >= allcaps_min_len."""
+    _validate_distinct_single_chars(allcaps, esc)
+    controls = {allcaps, esc}
+    out: list[str] = []
+    i = 0
+    n = len(text)
+    while i < n:
+        ch = text[i]
+        if ch in controls:
+            out.append(esc)
+            out.append(ch)
+            i += 1
+            continue
+        if not _is_ascii_alpha(ch):
+            out.append(ch)
+            i += 1
+            continue
+        j = i + 1
+        while j < n and _is_ascii_alpha(text[j]):
+            j += 1
+        word = text[i:j]
+        if len(word) >= allcaps_min_len and word.isupper():
+            out.append(allcaps)
+            out.append(word.lower())
+        else:
+            out.append(word)
+        i = j
+    return "".join(out)
+
+
+def decode_lossless_caps_v6(
+    text: str,
+    *,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+) -> str:
+    """Decode the `lossless_caps_v6` transform back to the original text."""
+    return decode_lossless_caps_v4(text, allcaps=allcaps, esc=esc)
+
+
+def encode_lossless_caps_v7(
+    text: str,
+    *,
+    allcaps: str = DEFAULT_V2_ALLCAPS,
+    esc: str = DEFAULT_V2_ESC,
+    allcaps_min_len: int = DEFAULT_V7_ALLCAPS_MIN_LEN,
+) -> str:
"""Encode only ALLCAPS words with length >= 4.""" + return encode_lossless_caps_v6( + text, + allcaps=allcaps, + esc=esc, + allcaps_min_len=allcaps_min_len, + ) + + +def decode_lossless_caps_v7( + text: str, + *, + allcaps: str = DEFAULT_V2_ALLCAPS, + esc: str = DEFAULT_V2_ESC, +) -> str: + """Decode the `lossless_caps_v7` transform back to the original text.""" + return decode_lossless_caps_v6(text, allcaps=allcaps, esc=esc) + + +def get_text_transform(name: str | None) -> Callable[[str], str]: + """Return the forward text transform for the given config name.""" + normalized = IDENTITY if name in {None, "", IDENTITY} else str(name) + if normalized == IDENTITY: + return lambda text: text + if normalized == LOSSLESS_CAPS_V1: + return encode_lossless_caps_v1 + if normalized == LOSSLESS_CAPS_V2: + return encode_lossless_caps_v2 + if normalized == LOSSLESS_CAPS_V3: + return encode_lossless_caps_v3 + if normalized == LOSSLESS_CAPS_V4: + return encode_lossless_caps_v4 + if normalized == LOSSLESS_CAPS_V5: + return encode_lossless_caps_v5 + if normalized == LOSSLESS_CAPS_V6: + return encode_lossless_caps_v6 + if normalized == LOSSLESS_CAPS_V7: + return encode_lossless_caps_v7 + if normalized == LOSSLESS_CAPS_CASEOPS_V1: + return encode_lossless_caps_v2 + raise ValueError(f"unsupported text_transform={name!r}") + + +def get_text_inverse_transform(name: str | None) -> Callable[[str], str]: + """Return the inverse transform for the given config name.""" + normalized = IDENTITY if name in {None, "", IDENTITY} else str(name) + if normalized == IDENTITY: + return lambda text: text + if normalized == LOSSLESS_CAPS_V1: + return decode_lossless_caps_v1 + if normalized == LOSSLESS_CAPS_V2: + return decode_lossless_caps_v2 + if normalized == LOSSLESS_CAPS_V3: + return decode_lossless_caps_v3 + if normalized == LOSSLESS_CAPS_V4: + return decode_lossless_caps_v4 + if normalized == LOSSLESS_CAPS_V5: + return decode_lossless_caps_v5 + if normalized == LOSSLESS_CAPS_V6: + return 
decode_lossless_caps_v6 + if normalized == LOSSLESS_CAPS_V7: + return decode_lossless_caps_v7 + if normalized == LOSSLESS_CAPS_CASEOPS_V1: + return decode_lossless_caps_v2 + raise ValueError(f"unsupported text_transform={name!r}") + + +def normalize_text_transform_name(name: str | None) -> str: + """Normalize empty/None transform names to the identity transform.""" + return IDENTITY if name in {None, "", IDENTITY} else str(name) + + +def get_text_transform_control_symbols(name: str | None) -> list[str]: + """Return reserved control symbols used by a transform, if any.""" + normalized = normalize_text_transform_name(name) + if normalized == IDENTITY: + return [] + if normalized == LOSSLESS_CAPS_V1: + return [DEFAULT_SENTINEL] + if normalized == LOSSLESS_CAPS_V2: + return [DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_CAPNEXT, DEFAULT_V2_ESC] + if normalized == LOSSLESS_CAPS_CASEOPS_V1: + return [DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_CAPNEXT, DEFAULT_V2_ESC] + if normalized in {LOSSLESS_CAPS_V3, LOSSLESS_CAPS_V5}: + return [DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_ESC] + if normalized in {LOSSLESS_CAPS_V4, LOSSLESS_CAPS_V6, LOSSLESS_CAPS_V7}: + return [DEFAULT_V2_ALLCAPS, DEFAULT_V2_ESC] + raise ValueError(f"unsupported text_transform={name!r}") + + +def infer_text_transform_from_manifest(tokenizer_path: str | Path) -> str: + """Best-effort lookup of a tokenizer's text transform from a local manifest.""" + tokenizer_path = Path(tokenizer_path).expanduser().resolve() + manifest_candidates = [ + tokenizer_path.parent.parent / "manifest.json", + tokenizer_path.parent / "manifest.json", + ] + for manifest_path in manifest_candidates: + if not manifest_path.is_file(): + continue + try: + payload = json.loads(manifest_path.read_text(encoding="utf-8")) + except (OSError, json.JSONDecodeError): + continue + tokenizers = payload.get("tokenizers") + if not isinstance(tokenizers, list): + continue + for tokenizer_meta in tokenizers: + if not 
isinstance(tokenizer_meta, dict): + continue + model_path = tokenizer_meta.get("model_path") or tokenizer_meta.get("path") + if not model_path: + continue + candidate = (manifest_path.parent / str(model_path)).resolve() + if candidate == tokenizer_path: + return normalize_text_transform_name(tokenizer_meta.get("text_transform")) + return IDENTITY + + +def surface_piece_original_byte_counts( + surfaces: Iterable[str], + *, + text_transform_name: str | None = None, + sentinel: str = DEFAULT_SENTINEL, +) -> list[int]: + """Return exact original UTF-8 byte counts contributed by each surface piece. + + `surfaces` must be the exact decoded text fragments emitted by SentencePiece + in order, e.g. `piece.surface` from `encode_as_immutable_proto`. + """ + normalized = normalize_text_transform_name(text_transform_name) + if normalized == IDENTITY: + return [len(surface.encode("utf-8")) for surface in surfaces] + if normalized == LOSSLESS_CAPS_V1: + if len(sentinel) != 1: + raise ValueError("sentinel must be exactly one character") + sentinel_bytes = len(sentinel.encode("utf-8")) + pending_sentinel = False + counts: list[int] = [] + for surface in surfaces: + piece_bytes = 0 + for ch in surface: + if pending_sentinel: + if ch == sentinel: + piece_bytes += sentinel_bytes + elif _is_ascii_lower(ch): + piece_bytes += 1 + else: + raise LosslessCapsError( + f"invalid continuation {ch!r} after capitalization sentinel" + ) + pending_sentinel = False + continue + if ch == sentinel: + pending_sentinel = True + else: + piece_bytes += len(ch.encode("utf-8")) + counts.append(piece_bytes) + if pending_sentinel: + raise LosslessCapsError("dangling capitalization sentinel across piece boundary") + return counts + if normalized not in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_V3, LOSSLESS_CAPS_V4, LOSSLESS_CAPS_V5, LOSSLESS_CAPS_V6, LOSSLESS_CAPS_V7, LOSSLESS_CAPS_CASEOPS_V1}: + raise ValueError(f"unsupported text_transform={text_transform_name!r}") + + title = DEFAULT_V2_TITLE + allcaps = 
DEFAULT_V2_ALLCAPS + capnext = DEFAULT_V2_CAPNEXT + esc = DEFAULT_V2_ESC + if normalized in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_CASEOPS_V1}: + _validate_distinct_single_chars(title, allcaps, capnext, esc) + elif normalized in {LOSSLESS_CAPS_V4, LOSSLESS_CAPS_V6, LOSSLESS_CAPS_V7}: + _validate_distinct_single_chars(allcaps, esc) + else: + _validate_distinct_single_chars(title, allcaps, esc) + pending_escape = False + pending_word_mode: str | None = None + active_allcaps = False + pending_capnext = False + in_ascii_word = False + counts: list[int] = [] + for surface in surfaces: + piece_bytes = 0 + for ch in surface: + if pending_escape: + if pending_word_mode is not None and not _is_ascii_alpha(ch): + raise LosslessCapsError("escaped control char cannot satisfy pending word capitalization mode") + piece_bytes += len(ch.encode("utf-8")) + pending_escape = False + if _is_ascii_alpha(ch): + in_ascii_word = True + else: + in_ascii_word = False + active_allcaps = False + continue + if ch == esc: + pending_escape = True + continue + if normalized in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_V3, LOSSLESS_CAPS_V5, LOSSLESS_CAPS_CASEOPS_V1} and ch == title: + if pending_word_mode is not None or in_ascii_word or pending_capnext: + raise LosslessCapsError("invalid title marker placement") + pending_word_mode = "title" + continue + if ch == allcaps: + if pending_word_mode is not None or in_ascii_word or pending_capnext: + raise LosslessCapsError("invalid allcaps marker placement") + pending_word_mode = "allcaps" + continue + if normalized in {LOSSLESS_CAPS_V2, LOSSLESS_CAPS_CASEOPS_V1} and ch == capnext: + if pending_capnext: + raise LosslessCapsError("duplicate capnext marker") + pending_capnext = True + continue + + if _is_ascii_alpha(ch): + at_word_start = not in_ascii_word + if at_word_start: + piece_bytes += 1 + active_allcaps = pending_word_mode == "allcaps" + pending_word_mode = None + pending_capnext = False + in_ascii_word = True + continue + if pending_word_mode is not None: + 
raise LosslessCapsError("word capitalization marker leaked into the middle of a word") + piece_bytes += 1 + pending_capnext = False + continue + + if pending_word_mode is not None or pending_capnext: + raise LosslessCapsError("capitalization marker not followed by an ASCII letter") + piece_bytes += len(ch.encode("utf-8")) + in_ascii_word = False + active_allcaps = False + counts.append(piece_bytes) + if pending_escape: + raise LosslessCapsError("dangling escape marker across piece boundary") + if pending_word_mode is not None or pending_capnext: + raise LosslessCapsError("dangling capitalization marker across piece boundary") + return counts diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/prepare_caseops_data.py b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/prepare_caseops_data.py new file mode 100644 index 0000000000..1c893ef988 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/prepare_caseops_data.py @@ -0,0 +1,192 @@ +"""Parallel rebuild of prepare_caseops_data.py — multiprocessing tokenization. + +Bottleneck of the original is single-threaded SP encode + per-char byte +prefix-sum on val docs. With N workers via mp.Pool we get ~N× speedup. +On 28-vCPU pod, 16 workers cuts ~12h to ~45 min. + +Same CLI as prepare_caseops_data.py + extra --workers flag. 
+""" +import argparse +import json +import multiprocessing as mp +import os +import pathlib +import sys +from typing import Optional + +import numpy as np +import sentencepiece as spm + +sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) +from lossless_caps import ( + encode_lossless_caps_v2, + DEFAULT_V2_TITLE, + DEFAULT_V2_ALLCAPS, + DEFAULT_V2_CAPNEXT, + DEFAULT_V2_ESC, +) + +BOS_ID = 1 +SHARD_MAGIC = 20240520 +SHARD_VERSION = 1 +SHARD_TOKENS = 10_000_000 + +_LOSSLESS_V2_OPERATORS_CHARS: Optional[frozenset] = None +_worker_sp: Optional[spm.SentencePieceProcessor] = None + + +def _make_operators_set() -> frozenset: + return frozenset(( + DEFAULT_V2_TITLE, DEFAULT_V2_ALLCAPS, DEFAULT_V2_CAPNEXT, DEFAULT_V2_ESC, + )) + + +def _worker_init(sp_path: str) -> None: + global _worker_sp, _LOSSLESS_V2_OPERATORS_CHARS + _worker_sp = spm.SentencePieceProcessor(model_file=sp_path) + _LOSSLESS_V2_OPERATORS_CHARS = _make_operators_set() + + +def _byte_counts(transformed: str, piece_ids: list, pieces: list) -> np.ndarray: + n_chars = len(transformed) + prefix = np.zeros(n_chars + 1, dtype=np.int64) + running = 0 + ops = _LOSSLESS_V2_OPERATORS_CHARS + for idx, ch in enumerate(transformed): + if ch not in ops: + cp = ord(ch) + if cp < 0x80: + running += 1 + elif cp < 0x800: + running += 2 + elif cp < 0x10000: + running += 3 + else: + running += 4 + prefix[idx + 1] = running + counts = np.empty(len(piece_ids), dtype=np.uint16) + cursor_t = 0 + for i, piece in enumerate(pieces): + surface = piece.replace("▁", " ") + span_len = len(surface) + end = cursor_t + span_len + if end > n_chars: + end = n_chars + original_bytes = int(prefix[end] - prefix[cursor_t]) + cursor_t = end + counts[i] = max(0, min(65535, original_bytes)) + return counts + + +def _worker_process_doc(args: tuple) -> tuple: + """Worker: transform + tokenize one doc. 
Returns (doc_idx, token_ids, byte_counts_or_None).""" + doc_idx, text, is_val = args + sp = _worker_sp + transformed = encode_lossless_caps_v2(text) + piece_ids = sp.encode(transformed, out_type=int) + token_ids = [BOS_ID] + piece_ids + byte_counts = None + if is_val: + pieces = [sp.id_to_piece(int(pid)) for pid in piece_ids] + byte_counts = _byte_counts(transformed, piece_ids, pieces) + return doc_idx, token_ids, byte_counts + + +def _write_shard(path: pathlib.Path, arr: np.ndarray) -> None: + # 256 int32 header = 1024 bytes — matches kevclark/parameter-golf format + # and the load_data_shard expectations in train_gpt.py. + header = np.zeros(256, dtype=np.int32) + header[0] = SHARD_MAGIC + header[1] = SHARD_VERSION + header[2] = arr.size + with path.open("wb") as fh: + fh.write(header.tobytes()) + fh.write(arr.tobytes()) + + +def _iter_docs(docs_path: pathlib.Path): + with docs_path.open("r", encoding="utf-8") as fh: + for idx, line in enumerate(fh): + line = line.strip() + if not line: + continue + obj = json.loads(line) + yield idx, (obj["text"] if isinstance(obj, dict) else obj) + + +def main() -> None: + ap = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter) + ap.add_argument("--docs", required=True, type=pathlib.Path) + ap.add_argument("--out", required=True, type=pathlib.Path) + ap.add_argument("--sp", required=True, type=pathlib.Path) + ap.add_argument("--val-docs", type=int, default=10_000) + ap.add_argument("--max-docs", type=int, default=0, help="Stop after N total docs (0 = process all). 
Limits disk usage.") + ap.add_argument("--workers", type=int, default=max(1, (os.cpu_count() or 8) - 4)) + ap.add_argument("--chunksize", type=int, default=64) + args = ap.parse_args() + + print(f"loading sp model: {args.sp}", flush=True) + sp_master = spm.SentencePieceProcessor(model_file=str(args.sp)) + print(f"loaded sp: vocab={sp_master.vocab_size()}", flush=True) + print(f"workers: {args.workers}", flush=True) + + train_out = args.out / "datasets" / "fineweb10B_sp8192_lossless_caps_caseops_v1_reserved" + train_out.mkdir(parents=True, exist_ok=True) + + val_buf_tokens: list = [] + val_buf_bytes: list = [] + train_buf: list = [] + val_written = 0 + train_written = 0 + n_docs = 0 + + def _build_args(): + for doc_idx, text in _iter_docs(args.docs): + if args.max_docs > 0 and doc_idx >= args.max_docs: + return + yield (doc_idx, text, doc_idx < args.val_docs) + + with mp.Pool(args.workers, initializer=_worker_init, initargs=(str(args.sp),)) as pool: + for doc_idx, token_ids, byte_counts in pool.imap( + _worker_process_doc, + _build_args(), + chunksize=args.chunksize, + ): + if doc_idx < args.val_docs: + val_buf_tokens.extend(token_ids) + val_buf_bytes.append(0) + val_buf_bytes.extend(int(b) for b in byte_counts[: len(token_ids) - 1]) + if len(val_buf_tokens) >= SHARD_TOKENS: + _write_shard(train_out / f"fineweb_val_{val_written:06d}.bin", + np.array(val_buf_tokens[:SHARD_TOKENS], dtype=np.uint16)) + _write_shard(train_out / f"fineweb_val_bytes_{val_written:06d}.bin", + np.array(val_buf_bytes[:SHARD_TOKENS], dtype=np.uint16)) + val_buf_tokens = val_buf_tokens[SHARD_TOKENS:] + val_buf_bytes = val_buf_bytes[SHARD_TOKENS:] + val_written += 1 + else: + train_buf.extend(token_ids) + if len(train_buf) >= SHARD_TOKENS: + _write_shard(train_out / f"fineweb_train_{train_written:06d}.bin", + np.array(train_buf[:SHARD_TOKENS], dtype=np.uint16)) + train_buf = train_buf[SHARD_TOKENS:] + train_written += 1 + n_docs += 1 + if n_docs % 10_000 == 0: + print(f" processed {n_docs} docs 
train_shards={train_written} val_shards={val_written}", flush=True) + + if val_buf_tokens: + _write_shard(train_out / f"fineweb_val_{val_written:06d}.bin", + np.array(val_buf_tokens, dtype=np.uint16)) + _write_shard(train_out / f"fineweb_val_bytes_{val_written:06d}.bin", + np.array(val_buf_bytes, dtype=np.uint16)) + if train_buf: + _write_shard(train_out / f"fineweb_train_{train_written:06d}.bin", + np.array(train_buf, dtype=np.uint16)) + + print(f"done. docs={n_docs} train_shards={train_written + (1 if train_buf else 0)} val_shards={val_written + (1 if val_buf_tokens else 0)}") + + +if __name__ == "__main__": + mp.set_start_method("spawn", force=True) + main() diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/submission.json b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/submission.json new file mode 100644 index 0000000000..99d85203bd --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/submission.json @@ -0,0 +1,34 @@ +{ + "track": "10min_16mb", + "date": "2026-04-30", + "name": "OptE_SlidingWindow_CondPPM", + "author": "Anmar Hindi", + "github_id": "anmarhindi", + "val_bpb": 1.029282, + "val_bpb_std": 0.000782, + "val_bpb_per_seed": { + "42": 1.02848514, + "1337": 1.03004769, + "314": 1.02931432 + }, + "diagnostic_pre_ema_3seed_mean": 1.168647, + "diagnostic_pre_ema_3seed_std": 0.000561, + "diagnostic_quantized_3seed_mean": 1.179078, + "diagnostic_quantized_3seed_std": 0.000564, + "diagnostic_post_ttt_3seed_mean": 1.029282, + "diagnostic_post_ttt_3seed_std": 0.000782, + "headline_metric": "cond_ppm", + "headline_metric_description": "byte-level conditional-PPM mixture BPB on sliding-window-scored full val (canonical byte counting, sigmoid gate alpha=15 beta=0.80). 
cond_ppm IS the headline; pre-quantization post-EMA was 1.147; post-quantization pre-cond-PPM was 1.179; cond-PPM mixture brings it to 1.029.", + "artifact_bytes_max": 15542968, + "wrapped_code_bytes": 49750, + "total_submission_bytes_max": 15592718, + "compliant_max_under_16mb": true, + "seeds": [ + 42, + 1337, + 314 + ], + "hardware": "8xH100 80GB SXM", + "sliding_window_3seed_mean": 1.184435, + "sliding_window_3seed_std": 0.001035 +} \ No newline at end of file diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model new file mode 100644 index 0000000000..fffc8bb306 Binary files /dev/null and b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model differ diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_gpt.py b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_gpt.py new file mode 100644 index 0000000000..251dd19018 --- /dev/null +++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_gpt.py @@ -0,0 +1,2 @@ +import lzma as L,base64 as B 
+exec(L.decompress(B.b85decode("{Wp48S^xk9=GL@E0stWa8~^|S5YJf5<0_z=1YH0$n@VT6Qap3Y@@2YR==kt^V3b|z;4_%p4p-u~Q2v~X2+~vf(+18k!VGi-te~DEcJAJ=+DZRe<7gEe>JqSSwW>47lL0GV26$Q;+~vn5pr%|c7Ic0Tw2`TURu8+{rkHnH!c21y58{N^EAE`YOQ9AQo?JK4%uvSyOo)=!og(L&w%L%TG}FoXweI<4BQC&2(FP_`WsoG-?PK$rs2$xg_&4PdYCjst->B_sCNox)_#t;Sp}k${gD68sZ7wat#Po0h1dEAnpy2cjx5UZi)znIK7ENgpW7?~dEIU3if7s|JyNT(f!C?Dt9$NEYp4Q;6*)oC9v{^RG0zIl)y(gSOjz5!i;_hGmdU_k_&=t>%o_QY95I%C?SF?C~HQfsSm<7JE_n_6ZCsFGXACW9!GQzIJc(2I$;QJD_7mt{&86rp$yeVOpQ&T%llS!?ujlMg+F|eo_51PCT9vC9B*!am%$>YwV~gyq+etFOEJYv0{_Tz-}elED>j_Ar)@rBv1Mb334%Cg3y@hHqvMyTKG#T-rg$Muv5j-#>E{>DmFL7Nwng6TfSa07yRUc-H*n^%^}?O7yF`4Q8&5k8NIg5=sX8f96`Pf8PX{BP9s*M*&ktb`PCVJ~X#D4y}$$kr$$_3!yYzHZ}~WPoQ7`ys9`=n(6j_xT$#p7X$$DTl7E^gwyp176r4>i`BDn-!SV?q)(Lxdq8UPy}`(i9}_p*~8NIur7}B!8Ry~tFyt_nUobEfp@;;A&Cdet^;t9BnJPcrKzo)W2ADLqS$qM*_&wRO}JP00jm(@5t(t}-;)wALL`O0vO@~65HRpu(b0q!{{)H}E{iw!D%K6i_mjjD?JPt)QqrnK(uL4EtfL`rIx`of1$r!pjW_!YT8EaJz{J=PV4d)r6=MGVPgIgoPQ-P2*FFPQe0x}gU)rkamzbC_0vGmIQ|z_d-sX-mS(KpH2Z{pNvV=An@eZISg&KS+NuxFfxhkCutq+$Y({RmwIV8(&KHh~N+#_pPXQSF;H>m=oLS8V4jO&jv+)cQcsI*cok`|aC(_@czAqJ}XUBQI~n@oB245$hw%Ge4pi#Bc?l@3hYSArEZslzG_VoC7#(MO-Gs^RQxxU6NevILTWigE(FX*dTEry*Kpfz3Zs8)>c>(ZvjJ;{XPFN9Z-=0d>h7=`u5h*tHQL2W~tn1eW2ZjCVDSQniX333F|Oy;IkOj9cHb{sP4gGoxPmBO0y$c}pO2qE^~quv48O_6*j@uan?;mebH`8fpZh6akY%zJA7lquE5*LKEgcX)ag`K(XRXbYc1-fFke4mmMRMg?=Fho2GU$R(hss|h8M8AQE|=hM81w93%*MoZdnB0Ay6cB}$5nF;gH}PkFT$OlNQ*X!$dMlRh3k{3ZA|{|&9xo%6G;jlCiwVJ89D~z1QMTn(E!KEch%0e|8CXtbHPO5;%(21Od4}?9))`56kcQuv~#tVAs^{uQ0)#y7Mwbi$2CyATtzYnBc-=XG1eTHmO_fw!+i=!D$$;|okgr!+=sl$yXtI5ZHdn$jSpNHJ&w4%;$`Blk>EcX>PJ@>$HZX)&JdW5%Eh0nVvO3T(^C*FMCYo5``-{$QbkH)knCvW~Hsr?V|#yqc6Kg-#BN$3GU5Ct%!pwnq(-u-l$_3f}wYRaawk1MdJFyGiBO+1(LN?dFQ2YOo&kgBMgRqHO0KUn=KsTt85D3yZ$KY?8@5({+0Yg9M0T~PE*ao8{KtOxxr1lcm0yB1x=W`D;;l#ITWHkvT55cTfci+E5v0{)=p*Xx6(5#Hjj_f1plpyZQ<1wGpvZrVc<6o#R%mQqKlT3-(g{LpJ%#1r3GkpFq{F8l22a+1h*nJg-1XgM%R`G4Z$$XBpza*DYzg63Sv9^Vpgf}h7Hy{8@+4#9JCVj6Q>P5)HmtJ{aqMSG{>2A^WwlL=p=aiXs^RN%WCVRI+@q(sXyvEdaI>Jyf#%7>3n@stg
jjy>wBs#G0d+GlHs?9}p|6KxKFCW%^annW)9_AK}p1Ui@7CmbUA<8o@8H$zi@}doE30SFwNpeLFJc5Z(g5B=(z3&mct6IUktmPHuw$_5=KX=;rZ36!oM~6<_>*J*_p`z-3ejAA)!U;=3zU8%#jpuOkIDj>{YH)Ln`%&_l8@ld}!JYr!m?c{}3Zz)@cls|Ekdj^vSJM-wW?63(c5^yu2hHV3^-FHV-SHgm>jEzgUtb}_Md$T;p7N=5z=f6siKpe^NK(dASFTB)8p^FBDYeUB*$q_8J=PgM5Mo+;0(8&9!l(fgHJoDuF&vWOqBwQ(7+y56OO<_8HM@XLB5Zpx|=qtcs@%-dH-=IC<%?~+%C@$&>qCqQFyC}*W12kvTjg*dke%rBvJ?^`HopW!;2#147a9ox6A^K&F*hZg78nR|6Ln%tG?rlsLB0%ArFL<0MJ4o(v9V5^Coll%*>d)cKu5y)wuVT`)Td5m=LoC6$ogy`hM`YQL7Bir8E$ZGS-xuMre=a3ZfOzSeaz_$ytOFs%z4iE!BZLw`gHCcFRR^(Fd!ntf`23PaEMsp{)U5}XJ26Rvu&8g0!0ve?1SCn)dQS&j!a0x5NOi!xi*_lZDWykS=!_&=x@qg+tQu}eaT81_eny;TR%VUHxOxH;$kcUIZTCP#x3C?VV=tI81RDuh-bKP50M^(1UQLXs?$%^rTyd?aQEaT#vXRbc0*u;F&YM#P=b=klD`Dx1EyFd7%$6FCr5K56b7G0PEU}rjR=$uyAzvDpFSV^8g)C4d8`^bVxAXKo-IIu&AaoE^v3bgA(_MWDhH?+lz%EYp*3MI6#9TQjj~lIPDQ4n7vf*u24$OXtY?#uqr7Gd+V6^aC^X2uyyR^JUAo`jPfBFqGFsn1pMtF+;lCz6Iz0uFW^{`<6|yU!y^FAP(Ryqpox>J@%QeaZe=GW0Bc)P|?OtW4fh{--FdpU9(S8E(Lo5Z2yT0D$VA`m0q717aWp`yUM!Sm^prReA_R0y{f4L_8(WrKtM~Z&0yhafa-hk;VbVy?#U&ujjJ?^-r}0k(&(F&-z~WTSPVdh^&LQy<9y;|UqFaxv{%83ofDQZddS%`No^kpC3Sjyx%6@30rg?)-bj!Lggm8X3gC$7mk7{CTsXYh)y?2nXY*S!U8rbU{VW6?VM9!^Yp0Uq%k}18hQ{RD>o7v+o3=ID(dcKp=(AUrp86f;I8+*uySdFeLoyVhXytaKIOeh^@baBeKo8D!yHQ9qePhWO^{lS5nd@@qX_#o(^;x$BL4A>aJyICQgPl+3u~7j6I1saeBI{tP8mAwWFSv^7hkLZQe(E0}kMygiV5Kxf#F(|2EYd|oEIC#k((c;{rKFaIpaL~;`xr)Pp%b>CK8y!<2y@9HtPE^1$NLgHpzlIm@2~0Rd|$iE!MSL)1*T`o6#ta@&%<+u_r|t{2392bB9M&{8#A+RQh-4M1#~end;InMw4!q{;dD46R=Sa<_p_$1&LcI3f9F=TsOVtR^!5u;xSvo-73BPN-aA`@!hn0#k$z{7OUi|))RKRvI~a6mDVDklKpbisnRe+QK&yEDFa!Um<1qTB>%rB<%$5cZZ{Tjz8)O(N$|NPN#|F7(v!{1`PsZ28jZxVcMNA3XT9Tc8ObG(B>yu+W@dzv;FDZvA8csDCaEJSriawXgNB!;IQ0QidqSXWY+mz)3?Bq@5PIx6o^D=q;zP9*k~nBu~}N6K&C-Yq(kqj|!`IswvI*>5F43N?=6O6Oew@z3G6jnUIqd0Chmy`#wa_GP?yuXY&fwhX!+Ij!NPF#{aq*s?*rBsx-y_|R*|?LP6?nZplhM^>p-*$L%4EVVr;XSTK3o62n(uB`JH%|jXS#yBM7ScEm0U<#!YYX!OAL=udClZ73q&QmF2i;%oesL-Zr^2e=T9)~(hnxKZrFD{#d5Z!+n>5wVDaFPfJ0-#`g)05fK83LD}(Z?6s)5h!qv(4OO8-{XG*FyPoMS8L+1(Yd;Amos^Oo~um%(RQ18-fBXQ8j`?LzZ{=*4Up$Z
H?CmKQnH19FHt*489eQl&0H=YXVn|iD7Yx%Z95a{2Ox1=%VH%n5;bR%cnM0$qwt)#5OQs^xbUpGpkE5X1llvBu}|c*Zj)e0M0c!8|1|T)yVrVWg9KCGPBHth8C;W~F8ibMaYQ}~7QM@Vm%D8cU&e!6uaxQ{VH4Mb%N5&5&}G#|!kKR*0;VfBA{^L68QeT?ePhraddB65~@<*A?uA|4JGrL??wPvMzI|daJ1VjvvH^&8Q?022cgL{916XPh=BIj_smP%j{b7~kc%j*E&Rl5_G)%(uz*n4XnEahf6v{#c%ok!M2Hf4}%WGpAnxoXcs*|ly^JxO{zogYhbb54%VQYzCB%YM4L=rkTX+5D|lP_&wijk;Dm0QOV#biC{^oi>rl5oDK(QO-_HZRmPeCNmHOt%JdmCz@KJIQTrmbOolxmyS*}#I9m#~$#5NbNfm|StOHs6d;P>}2({8GPOhtBno8Ki#nO6WLxF}b;tQrnpPaab*KHK|bTWh{JQDCcZ^i8XY5$v_>KRdM6dwQccOknjFH#ia68O^ec;*ceuF{KK?zCaq3vr(Y_qFocLh-7Jp4ZjO~Ohgvr1mB_@NlB!d@voRC}P@iv+tyf;lQW`9zwO&?{#pm!+r_8cS2B09?QH$wD)^Nc`CL>Gb#8Yws;Sht7zR_fex^>}l*)Y&%YY9)O_31_Mt_4PvS^tX)focmc3&L*M|9;H_Mk;eYWSK*6X;k;Al>znabceI-;&vCaI<8WMj=~ttwK7>)@7n{p&t9d^1Gd~zi^}-S(e5T0W+Vm})A1B=g1Z-$6%^}j23Eo5g8D7x4tl{f}BSgij}+wng&K}1`*nJ5RCvt8Ea}K(0;>_5cE-U)|ru^ka2qXZh8g7%0#uFeq>ezxm@#yfOh2P$Tc-o6&p&5pcWX6R@wL83YshLWq|TJ}o~!@b%Cx&BqB+u+<0USd@Z*`=2(ZEB!s8&-~d>=884`aimUW$LtCCp9{me1n2O1O`~>)g&GEl=g`r%?H?_;=X}b+_Hc)`0vPSZF{$)RV^`=}8Nc5PvYykta$<9V#;VD?`Nt5^*tQ*cJyT`YeRh(IggpdahtpS(TPZ9SLc&AM#A?H0qS=Nr90F5(sZIe-1Dv+Y~F~uo_z4+7iaS+rE%<=Wt<&`h|*#B%6g%8%(6(U*-ip!wS?;NP^3WV*_5skU75S3-tO2Hd0$Hq9zDv4sk5_wj-eX4Vk3TK$}{*x~RNxB_Bb!xHblyxJCrmtzSV85GqtNkyH+xArb3jigju8%%}^fO~cC7)SYU@Idu-XfW+xgERX0I9P%)}qp@#s`YGzf(%tbQ;J*w+3&_JytBqLN>*8h>qnkOLt+=J^WNZWWK8?;Q%+9C8at(TyC_$E-Ue_*OQ3B`=3L?FcOQx@F(`=-g1;6rSsg<3mK_hn@TW4`z(_L}NFJ?a=NMm`=z|L9rrq$Qoty_VrVsDMG=aT#O5H}ZI@rz`M?!uYUGp$StH6n2~?Od4$IcpDdC3wl0Yzk6{We`vRjuBdNSj54P}Pn5Yj_{&3dlkj=d_Cx+~_4u63NDVPl9X!iBzQuQkP|QDi86my!bSYNA4C&WJFpUyhSRCDhuUu_Q5VD6=rc^vi#Cz-9SzXLS%`Z7ze!8NM`y4%O3*$Am+Xyt)vF9_*MT(6PXA|$KfQ-;U{M9#WE+T7d@!1(6v*8(|*tYfF)e$vlJrK0Cn>w;g&Ts>u3f+LZ1i@e8E7+aQvg3@^O6m0_HCQPoqvnEoSMkVX!@n!Hod`Yn!xzG$F>MXcAz89cptc^Ct)~Aw@q=z~nqvdb^GJ{fZ6(@~oXa=HCn|^yIr8TW^wlmJQS;!9R$w%Np&mc;rNw>PSs*Gr}6F^a5uHSEo<}Ddq4!NHFgwR2ralAAEpA_#tz(4O0~KMqce9jY&;@k1G*JCazGgU7ReP71`{aSzGH>!f8PM+ncdvk-o+=1pT<}btD2g6ws@1c}1y=Q&m0o{3p?v{++DV54;aRLgy-Myls^ZzOkz8G^8XVWZbI{cT@I^a6qn#kC+c$p
+ sys.exit(main())
+ ^^^^^^
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 357, in wrapper
+ return f(*args, **kwargs)
+ ^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 901, in main
+ run(args)
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 892, in run
+ elastic_launch(
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 143, in __call__
+ return launch_agent(self._config, self._entrypoint, list(args))
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 277, in launch_agent
+ raise ChildFailedError(
+torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
+========================================================
+train_gpt.py FAILED
+--------------------------------------------------------
+Failures:
+
+--------------------------------------------------------
+Root Cause (first observed failure):
+[0]:
+ time : 2026-04-30_18:43:34
+ host : d82d1968b256
+ rank : 0 (local_rank: 0)
+ exitcode : -11 (pid: 434877)
+ error_file:
+ traceback : Signal 11 (SIGSEGV) received by PID 434877
+========================================================
diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_seed314.log b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_seed314.log
new file mode 100644
index 0000000000..38355540b5
--- /dev/null
+++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_seed314.log
@@ -0,0 +1,262 @@
+W0430 18:43:37.415000 448288 torch/distributed/run.py:774]
+W0430 18:43:37.415000 448288 torch/distributed/run.py:774] *****************************************
+W0430 18:43:37.415000 448288 torch/distributed/run.py:774] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+W0430 18:43:37.415000 448288 torch/distributed/run.py:774] *****************************************
+Hyperparameters:
+ adam_eps: 1e-08
+ adam_wd: 0.02
+ artifact_dir:
+ attn_clip_sigmas: 13.0
+ attn_out_gate_enabled: False
+ attn_out_gate_src: proj
+ awq_lite_bits: 8
+ awq_lite_enabled: False
+ awq_lite_group_size: 64
+ awq_lite_group_top_k: 1
+ beta1: 0.9
+ beta2: 0.95
+ caseops_enabled: True
+ compressor: brotli
+ datasets_dir: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved
+ distributed: True
+ ema_decay: 0.9965
+ embed_bits: 6
+ embed_clip_sigmas: 15.0
+ embed_lr: 0.6
+ embed_wd: 0.085
+ enable_looping_at: 0.65
+ eval_seq_len: 2048
+ eval_stride: 64
+ fused_ce_enabled: True
+ gate_window: 12
+ gated_attn_enabled: False
+ gated_attn_init_std: 0.01
+ gated_attn_quant_gate: False
+ global_ttt_batch_seqs: 32
+ global_ttt_chunk_tokens: 32768
+ global_ttt_epochs: 1
+ global_ttt_grad_clip: 1.0
+ global_ttt_lr: 0.001
+ global_ttt_momentum: 0.9
+ global_ttt_respect_doc_boundaries: True
+ global_ttt_warmup_chunks: 0
+ global_ttt_warmup_start_lr: 0.0
+ gptq_calibration_batches: 16
+ gptq_reserve_seconds: 0.5
+ grad_accum_steps: 1
+ grad_clip_norm: 0.3
+ is_main_process: True
+ iterations: 20000
+ jepa_aux_weight: 0.0
+ ln_scale: True
+ local_rank: 0
+ logfile: logs/podE_seed314.txt
+ logit_softcap: 30.0
+ loop_end: 5
+ loop_start: 3
+ lqer_asym_enabled: True
+ lqer_asym_group: 64
+ lqer_enabled: True
+ lqer_factor_bits: 4
+ lqer_rank: 4
+ lqer_top_k: 3
+ macaron_enabled: False
+ matrix_bits: 6
+ matrix_clip_sigmas: 11.5
+ matrix_lr: 0.026
+ max_wallclock_seconds: 600.0
+ min_lr: 0.1
+ mlp_clip_sigmas: 11.5
+ mlp_mult: 4.0
+ model_dim: 512
+ model_path: final_model.pt
+ mtp_heads: 0
+ mtp_weight: 0.3
+ multi_exit_aux_weight: 0.1
+ multi_exit_enabled: False
+ multi_exit_layers: 4,6,8
+ multi_exit_mix_lr: 0.05
+ multi_exit_mix_steps: 80
+ muon_backend_steps: 5
+ muon_momentum: 0.97
+ muon_momentum_warmup_start: 0.92
+ muon_momentum_warmup_steps: 1500
+ muon_row_normalize: True
+ muon_wd: 0.095
+ num_heads: 8
+ num_kv_heads: 4
+ num_layers: 11
+ num_loops: 2
+ parallel_final_lane: mean
+ parallel_start_layer: 5
+ phased_ttt_num_phases: 1
+ phased_ttt_prefix_docs: 2000
+ ppm_byte_conditional_alpha: 15.0
+ ppm_byte_conditional_beta: 0.8
+ ppm_byte_conditional_enabled: True
+ ppm_conf_threshold: 0.9
+ ppm_enabled: False
+ ppm_gate_mode: binary
+ ppm_lambda_hi: 0.9
+ ppm_lambda_lo: 0.05
+ ppm_mix_level: byte
+ ppm_order: 5
+ ppm_sigmoid_alpha: 15.0
+ ppm_sigmoid_beta: 0.8
+ ppm_subset_tokens: 5000000
+ ppm_token_conf_aggregate: mean
+ prequant_ttt_batch_seqs: 32
+ prequant_ttt_beta1: 0.9
+ prequant_ttt_beta2: 0.999
+ prequant_ttt_chunk_tokens: 32768
+ prequant_ttt_compile: True
+ prequant_ttt_enabled: False
+ prequant_ttt_epochs: 21
+ prequant_ttt_fedavg_weights: True
+ prequant_ttt_grad_clip: 1.0
+ prequant_ttt_lr: 0.0005
+ prequant_ttt_lr_final: 5e-05
+ prequant_ttt_optimizer: adamw
+ prequant_ttt_weight_decay: 0.0
+ qk_gain_init: 5.0
+ quantized_model_path: final_model.int6.ptz
+ rank: 0
+ rope_base: 10000.0
+ rope_dims: 16
+ rope_train_seq_len: 2048
+ rope_yarn: False
+ run_id: podE_seed314
+ scalar_lr: 0.02
+ seed: 314
+ skip_gates_enabled: True
+ sliding_window_batch_seqs: 8
+ sliding_window_enabled: True
+ smear_gate_enabled: True
+ sparse_attn_gate_enabled: True
+ sparse_attn_gate_init_std: 0.0
+ sparse_attn_gate_scale: 1.0
+ stoch_depth_max: 0.02
+ stoch_depth_schedule: linear
+ temp_cal_enabled: False
+ temp_cal_lr: 0.1
+ temp_cal_steps: 50
+ tie_embeddings: True
+ tied_embed_init_std: 0.005
+ tied_embed_lr: 0.03
+ tokenizer_path: ./tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model
+ train_batch_tokens: 786432
+ train_files: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin
+ train_log_every: 500
+ train_seq_len: 2048
+ ttt_batch_size: 64
+ ttt_beta1: 0.0
+ ttt_beta2: 0.999
+ ttt_chunk_size: 48
+ ttt_enabled: False
+ ttt_eval_batches:
+ ttt_eval_seq_len: 2048
+ ttt_grad_steps: 1
+ ttt_k_lora: True
+ ttt_lora_lr: 0.0001
+ ttt_lora_rank: 96
+ ttt_mlp_lora: True
+ ttt_o_lora: True
+ ttt_optimizer: adam
+ ttt_weight_decay: 1.0
+ val_batch_tokens: 524288
+ val_bytes_files: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin
+ val_doc_fraction: 1.0
+ val_files: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin
+ val_loss_every: 4000
+ vocab_size: 8192
+ warmdown_frac: 0.85
+ warmup_steps: 20
+ world_size: 8
+ xsa_last_n: 11
+train_shards: 97
+val_tokens: 9662464
+model_params:35945671
+gptq:reserving 0s, effective=599500ms
+warmup_cu_buckets:64,128,192,256 iters_each:3
+warmup_step: 1/20
+warmup_step: 2/20
+warmup_step: 3/20
+warmup_step: 4/20
+warmup_step: 5/20
+warmup_step: 6/20
+warmup_step: 10/20
+warmup_step: 20/20
+loop_warmup:enabled encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10]
+loop_warmup_step: 1/20
+loop_warmup_step: 2/20
+loop_warmup_step: 3/20
+loop_warmup_step: 4/20
+loop_warmup_step: 5/20
+loop_warmup_step: 6/20
+loop_warmup_step: 10/20
+loop_warmup_step: 20/20
+0/20000 val_loss: 8.9968 val_bpb: 4.1859
+1/20000 train_loss: 8.9992 train_time: 0.0m tok/s: 10819469
+2/20000 train_loss: 12.8805 train_time: 0.0m tok/s: 5757618
+3/20000 train_loss: 10.1836 train_time: 0.0m tok/s: 4229766
+4/20000 train_loss: 8.6476 train_time: 0.0m tok/s: 3696917
+5/20000 train_loss: 7.8203 train_time: 0.0m tok/s: 3423127
+500/20000 train_loss: 2.7081 train_time: 2.9m tok/s: 2270735
+layer_loop:enabled step:997 frac:0.650 encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10]
+1000/20000 train_loss: 2.8019 train_time: 6.5m tok/s: 2010036
+1333/20000 val_loss: 2.4798 val_bpb: 1.1538
+stopping_early: wallclock_cap train_time: 600264ms step: 1333/20000
+peak memory allocated: 38518 MiB reserved: 43834 MiB
+ema:applying EMA weights
+diagnostic pre-quantization post-ema val_loss:2.51218666 val_bpb:1.16884371 eval_time:6133ms
+Serialized model: 135414717 bytes
+Code size (uncompressed): 235811 bytes
+Code size (compressed): 41450 bytes
+GPTQ:collecting Hessians from calibration data...
+GPTQ:collected 67 Hessians + 67 act_stats in 4.7s
+Quantized weights:
+ gptq (int6): blocks.attn.c_k.weight, blocks.attn.c_q.weight, blocks.attn.c_v.weight, blocks.attn.proj.weight, blocks.mlp.fc.weight, blocks.mlp.proj.weight
+ gptq (int6)+lqer_asym: blocks.mlp.fc.weight, tok_emb.weight
+ passthrough (float16): blocks.attn.attn_gate_w, blocks.attn.q_gain, blocks.attn_scale, blocks.mlp_scale, blocks.resid_mix, parallel_post_lambdas, parallel_resid_lambdas, skip_gates, skip_weights, smear_gate.weight, smear_lambda
+Serialized model quantized+brotli: 15542310 bytes
+Total submission size quantized+brotli: 15583760 bytes
+W0430 18:58:23.642000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448356 closing signal SIGTERM
+W0430 18:58:23.643000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448358 closing signal SIGTERM
+W0430 18:58:23.643000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448359 closing signal SIGTERM
+W0430 18:58:23.644000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448360 closing signal SIGTERM
+W0430 18:58:23.644000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448361 closing signal SIGTERM
+W0430 18:58:23.645000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448362 closing signal SIGTERM
+W0430 18:58:23.645000 448288 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 448363 closing signal SIGTERM
+E0430 18:58:25.004000 448288 torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: -11) local_rank: 1 (pid: 448357) of binary: /usr/local/bin/python
+Traceback (most recent call last):
+ File "/usr/local/bin/torchrun", line 7, in
+ sys.exit(main())
+ ^^^^^^
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 357, in wrapper
+ return f(*args, **kwargs)
+ ^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 901, in main
+ run(args)
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 892, in run
+ elastic_launch(
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 143, in __call__
+ return launch_agent(self._config, self._entrypoint, list(args))
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 277, in launch_agent
+ raise ChildFailedError(
+torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
+========================================================
+train_gpt.py FAILED
+--------------------------------------------------------
+Failures:
+
+--------------------------------------------------------
+Root Cause (first observed failure):
+[0]:
+ time : 2026-04-30_18:58:23
+ host : d82d1968b256
+ rank : 1 (local_rank: 1)
+ exitcode : -11 (pid: 448357)
+ error_file:
+ traceback : Signal 11 (SIGSEGV) received by PID 448357
+========================================================
diff --git a/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_seed42.log b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_seed42.log
new file mode 100644
index 0000000000..c182d0fc49
--- /dev/null
+++ b/records/track_10min_16mb/2026-04-30_OptE_SlidingWindow_CondPPM/train_seed42.log
@@ -0,0 +1,262 @@
+W0430 18:14:02.323000 421368 torch/distributed/run.py:774]
+W0430 18:14:02.323000 421368 torch/distributed/run.py:774] *****************************************
+W0430 18:14:02.323000 421368 torch/distributed/run.py:774] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
+W0430 18:14:02.323000 421368 torch/distributed/run.py:774] *****************************************
+Hyperparameters:
+ adam_eps: 1e-08
+ adam_wd: 0.02
+ artifact_dir:
+ attn_clip_sigmas: 13.0
+ attn_out_gate_enabled: False
+ attn_out_gate_src: proj
+ awq_lite_bits: 8
+ awq_lite_enabled: False
+ awq_lite_group_size: 64
+ awq_lite_group_top_k: 1
+ beta1: 0.9
+ beta2: 0.95
+ caseops_enabled: True
+ compressor: brotli
+ datasets_dir: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved
+ distributed: True
+ ema_decay: 0.9965
+ embed_bits: 6
+ embed_clip_sigmas: 15.0
+ embed_lr: 0.6
+ embed_wd: 0.085
+ enable_looping_at: 0.65
+ eval_seq_len: 2048
+ eval_stride: 64
+ fused_ce_enabled: True
+ gate_window: 12
+ gated_attn_enabled: False
+ gated_attn_init_std: 0.01
+ gated_attn_quant_gate: False
+ global_ttt_batch_seqs: 32
+ global_ttt_chunk_tokens: 32768
+ global_ttt_epochs: 1
+ global_ttt_grad_clip: 1.0
+ global_ttt_lr: 0.001
+ global_ttt_momentum: 0.9
+ global_ttt_respect_doc_boundaries: True
+ global_ttt_warmup_chunks: 0
+ global_ttt_warmup_start_lr: 0.0
+ gptq_calibration_batches: 16
+ gptq_reserve_seconds: 0.5
+ grad_accum_steps: 1
+ grad_clip_norm: 0.3
+ is_main_process: True
+ iterations: 20000
+ jepa_aux_weight: 0.0
+ ln_scale: True
+ local_rank: 0
+ logfile: logs/podE_seed42.txt
+ logit_softcap: 30.0
+ loop_end: 5
+ loop_start: 3
+ lqer_asym_enabled: True
+ lqer_asym_group: 64
+ lqer_enabled: True
+ lqer_factor_bits: 4
+ lqer_rank: 4
+ lqer_top_k: 3
+ macaron_enabled: False
+ matrix_bits: 6
+ matrix_clip_sigmas: 11.5
+ matrix_lr: 0.026
+ max_wallclock_seconds: 600.0
+ min_lr: 0.1
+ mlp_clip_sigmas: 11.5
+ mlp_mult: 4.0
+ model_dim: 512
+ model_path: final_model.pt
+ mtp_heads: 0
+ mtp_weight: 0.3
+ multi_exit_aux_weight: 0.1
+ multi_exit_enabled: False
+ multi_exit_layers: 4,6,8
+ multi_exit_mix_lr: 0.05
+ multi_exit_mix_steps: 80
+ muon_backend_steps: 5
+ muon_momentum: 0.97
+ muon_momentum_warmup_start: 0.92
+ muon_momentum_warmup_steps: 1500
+ muon_row_normalize: True
+ muon_wd: 0.095
+ num_heads: 8
+ num_kv_heads: 4
+ num_layers: 11
+ num_loops: 2
+ parallel_final_lane: mean
+ parallel_start_layer: 5
+ phased_ttt_num_phases: 1
+ phased_ttt_prefix_docs: 2000
+ ppm_byte_conditional_alpha: 15.0
+ ppm_byte_conditional_beta: 0.8
+ ppm_byte_conditional_enabled: True
+ ppm_conf_threshold: 0.9
+ ppm_enabled: False
+ ppm_gate_mode: binary
+ ppm_lambda_hi: 0.9
+ ppm_lambda_lo: 0.05
+ ppm_mix_level: byte
+ ppm_order: 5
+ ppm_sigmoid_alpha: 15.0
+ ppm_sigmoid_beta: 0.8
+ ppm_subset_tokens: 5000000
+ ppm_token_conf_aggregate: mean
+ prequant_ttt_batch_seqs: 32
+ prequant_ttt_beta1: 0.9
+ prequant_ttt_beta2: 0.999
+ prequant_ttt_chunk_tokens: 32768
+ prequant_ttt_compile: True
+ prequant_ttt_enabled: False
+ prequant_ttt_epochs: 21
+ prequant_ttt_fedavg_weights: True
+ prequant_ttt_grad_clip: 1.0
+ prequant_ttt_lr: 0.0005
+ prequant_ttt_lr_final: 5e-05
+ prequant_ttt_optimizer: adamw
+ prequant_ttt_weight_decay: 0.0
+ qk_gain_init: 5.0
+ quantized_model_path: final_model.int6.ptz
+ rank: 0
+ rope_base: 10000.0
+ rope_dims: 16
+ rope_train_seq_len: 2048
+ rope_yarn: False
+ run_id: podE_seed42
+ scalar_lr: 0.02
+ seed: 42
+ skip_gates_enabled: True
+ sliding_window_batch_seqs: 8
+ sliding_window_enabled: True
+ smear_gate_enabled: True
+ sparse_attn_gate_enabled: True
+ sparse_attn_gate_init_std: 0.0
+ sparse_attn_gate_scale: 1.0
+ stoch_depth_max:
0.02 + stoch_depth_schedule: linear + temp_cal_enabled: False + temp_cal_lr: 0.1 + temp_cal_steps: 50 + tie_embeddings: True + tied_embed_init_std: 0.005 + tied_embed_lr: 0.03 + tokenizer_path: ./tokenizers/fineweb_8192_bpe_lossless_caps_caseops_v1_reserved.model + train_batch_tokens: 786432 + train_files: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_train_*.bin + train_log_every: 500 + train_seq_len: 2048 + ttt_batch_size: 64 + ttt_beta1: 0.0 + ttt_beta2: 0.999 + ttt_chunk_size: 48 + ttt_enabled: False + ttt_eval_batches: + ttt_eval_seq_len: 2048 + ttt_grad_steps: 1 + ttt_k_lora: True + ttt_lora_lr: 0.0001 + ttt_lora_rank: 96 + ttt_mlp_lora: True + ttt_o_lora: True + ttt_optimizer: adam + ttt_weight_decay: 1.0 + val_batch_tokens: 524288 + val_bytes_files: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_bytes_*.bin + val_doc_fraction: 1.0 + val_files: ./data/datasets/fineweb10B_sp8192_caseops/datasets/datasets/fineweb10B_sp8192_lossless_caps_caseops_v1_reserved/fineweb_val_*.bin + val_loss_every: 4000 + vocab_size: 8192 + warmdown_frac: 0.85 + warmup_steps: 20 + world_size: 8 + xsa_last_n: 11 +train_shards: 97 +val_tokens: 9662464 +model_params:35945671 +gptq:reserving 0s, effective=599500ms +warmup_cu_buckets:64,128,192,256 iters_each:3 +warmup_step: 1/20 +warmup_step: 2/20 +warmup_step: 3/20 +warmup_step: 4/20 +warmup_step: 5/20 +warmup_step: 6/20 +warmup_step: 10/20 +warmup_step: 20/20 +loop_warmup:enabled encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +loop_warmup_step: 1/20 +loop_warmup_step: 2/20 +loop_warmup_step: 3/20 +loop_warmup_step: 4/20 +loop_warmup_step: 5/20 +loop_warmup_step: 6/20 +loop_warmup_step: 10/20 +loop_warmup_step: 20/20 +0/20000 val_loss: 9.0066 val_bpb: 4.1905 +1/20000 train_loss: 9.0074 train_time: 0.0m tok/s: 10868692 +2/20000 train_loss: 12.8395 train_time: 0.0m 
tok/s: 5869519 +3/20000 train_loss: 10.1516 train_time: 0.0m tok/s: 4333069 +4/20000 train_loss: 8.6342 train_time: 0.0m tok/s: 3831348 +5/20000 train_loss: 7.8758 train_time: 0.0m tok/s: 3550682 +500/20000 train_loss: 2.7084 train_time: 2.8m tok/s: 2332791 +layer_loop:enabled step:991 frac:0.650 encoder:[0, 1, 2, 3, 4, 5, 3, 4] decoder:[5, 3, 4, 5, 6, 7, 8, 9, 10] +1000/20000 train_loss: 2.7239 train_time: 6.6m tok/s: 1992533 +1331/20000 val_loss: 2.4788 val_bpb: 1.1533 +stopping_early: wallclock_cap train_time: 600225ms step: 1331/20000 +peak memory allocated: 38518 MiB reserved: 43834 MiB +ema:applying EMA weights +diagnostic pre-quantization post-ema val_loss:2.51040383 val_bpb:1.16801422 eval_time:6666ms +Serialized model: 135414717 bytes +Code size (uncompressed): 235811 bytes +Code size (compressed): 41450 bytes +GPTQ:collecting Hessians from calibration data... +GPTQ:collected 67 Hessians + 67 act_stats in 4.7s +Quantized weights: + gptq (int6): blocks.attn.c_k.weight, blocks.attn.c_q.weight, blocks.attn.c_v.weight, blocks.attn.proj.weight, blocks.mlp.fc.weight, blocks.mlp.proj.weight + gptq (int6)+lqer_asym: blocks.mlp.fc.weight, tok_emb.weight + passthrough (float16): blocks.attn.attn_gate_w, blocks.attn.q_gain, blocks.attn_scale, blocks.mlp_scale, blocks.resid_mix, parallel_post_lambdas, parallel_resid_lambdas, skip_gates, skip_weights, smear_gate.weight, smear_lambda +Serialized model quantized+brotli: 15542968 bytes +Total submission size quantized+brotli: 15584418 bytes +W0430 18:28:47.988000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 421437 closing signal SIGTERM +W0430 18:28:47.990000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 421438 closing signal SIGTERM +W0430 18:28:47.993000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 421439 closing signal SIGTERM +W0430 18:28:47.994000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 
421440 closing signal SIGTERM +W0430 18:28:47.995000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 421441 closing signal SIGTERM +W0430 18:28:47.997000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 421442 closing signal SIGTERM +W0430 18:28:47.999000 421368 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 421443 closing signal SIGTERM +E0430 18:28:49.580000 421368 torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: -11) local_rank: 0 (pid: 421436) of binary: /usr/local/bin/python +Traceback (most recent call last): + File "/usr/local/bin/torchrun", line 7, in + sys.exit(main()) + ^^^^^^ + File "/usr/local/lib/python3.12/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 357, in wrapper + return f(*args, **kwargs) + ^^^^^^^^^^^^^^^^^^ + File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 901, in main + run(args) + File "/usr/local/lib/python3.12/dist-packages/torch/distributed/run.py", line 892, in run + elastic_launch( + File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 143, in __call__ + return launch_agent(self._config, self._entrypoint, list(args)) + ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + File "/usr/local/lib/python3.12/dist-packages/torch/distributed/launcher/api.py", line 277, in launch_agent + raise ChildFailedError( +torch.distributed.elastic.multiprocessing.errors.ChildFailedError: +======================================================== +train_gpt.py FAILED +-------------------------------------------------------- +Failures: + +-------------------------------------------------------- +Root Cause (first observed failure): +[0]: + time : 2026-04-30_18:28:47 + host : d82d1968b256 + rank : 0 (local_rank: 0) + exitcode : -11 (pid: 421436) + error_file: + traceback : Signal 11 (SIGSEGV) received by PID 421436 
+========================================================