
Tune hyperparameters for macOS MPS efficiency #6

Open
yamyr wants to merge 24 commits into miolini:master from yamyr:autoresearch/mar14


yamyr commented Mar 15, 2026

Summary

  • land the best 50-iteration autoresearch sweep on the macOS fork baseline
  • reduce validation bits per byte (val_bpb) from 1.644952 to 1.342794 through cumulative training hyperparameter improvements
  • keep experiment artifacts local-only by ignoring and untracking run.log and results.tsv
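For reference, the size of the val_bpb reduction quoted in the summary can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `relative_reduction` and the variable names are hypothetical and not taken from the repository; the two metric values are the ones stated above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction of a lower-is-better metric such as val_bpb."""
    return (before - after) / before


# Values quoted in the PR summary; the variable names are illustrative only.
baseline_bpb = 1.644952
tuned_bpb = 1.342794

absolute = baseline_bpb - tuned_bpb
relative = relative_reduction(baseline_bpb, tuned_bpb)

print(f"absolute: {absolute:.6f} bpb, relative: {relative:.1%}")
# prints: absolute: 0.302158 bpb, relative: 18.4%
```

Because bits per byte is a lower-is-better metric, the reduction is expressed relative to the baseline value.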

yamyr force-pushed the autoresearch/mar14 branch from 5494278 to 66ac869 on March 15, 2026 12:21
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
PaulRBerg added a commit to PaulRBerg/autoresearch-macos that referenced this pull request Mar 20, 2026
Incorporates yamyr's 50-experiment sweep that reduces val_bpb from
1.644952 to 1.342794 (18.4% improvement).

Key changes: halved batch size, increased LRs (embedding 0.6→1.2,
matrix 0.04→0.06, unembedding 0.004→0.006), softcap 15→20,
longer Muon momentum ramp, adjusted warmdown/final LR schedules,
and black-style reformatting.

Based-on: miolini#6
Co-authored-by: Wolodymyr <vie@ajentik.ai>
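
The value changes listed in the commit message above can be summarized as scale factors. The sketch below is illustrative only: the dictionary keys are hypothetical labels, not the identifiers used in the training script, and only the changes that the commit message gives with explicit before and after values are included (the batch size is described only as "halved", so it is left out).

```python
# (old, new) pairs as stated in the commit message.
# The key names are hypothetical labels, not the repository's identifiers.
CHANGES = {
    "embedding_lr": (0.6, 1.2),
    "matrix_lr": (0.04, 0.06),
    "unembedding_lr": (0.004, 0.006),
    "softcap": (15, 20),
}


def scale_factors(changes: dict) -> dict:
    """Return new/old for each entry, i.e. how much each value was scaled."""
    return {name: new / old for name, (old, new) in changes.items()}


factors = scale_factors(CHANGES)
for name, factor in factors.items():
    print(f"{name}: {CHANGES[name][0]} -> {CHANGES[name][1]} (x{factor:.2f})")
```

Read this way, the embedding learning rate doubles, while the matrix and unembedding learning rates both scale by 1.5.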