Autoresearch/10k mar26 #13

Open
harryschaefer93 wants to merge 3 commits into miolini:master from harryschaefer93:autoresearch/10k-mar26

@harryschaefer93

No description provided.

Trained an 11.5M-parameter GPT exclusively on SEC 10-K filings from
financial companies. It achieves 23.3% better compression than the same
model trained on general web text. This took ~20 experiments and ~2 hours
of total GPU time on a MacBook Air.

Added:
- prepare_10k.py: SEC EDGAR data pipeline (download, clean, shard)
- benchmark.py: perplexity, inference speed, and cost benchmarks
- monitor.py: live terminal dashboard with loss curves + thermal monitoring
- MODEL_CARD.md: HuggingFace model card
- Full README writeup of methodology, results, and learnings

Results:
- val_bpb: 1.645 (vs 2.146 general baseline)
- Inference: 75,000+ tok/sec on Apple Silicon
- Cost: $0 to process all 80K SEC filings (vs $15K+ via API)
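
The 23.3% compression figure appears to follow from the two val_bpb values listed above. A minimal sketch of that arithmetic, assuming the figure is the relative reduction in validation bits per byte (the variable and function names here are illustrative, not taken from the repository):

```python
# Values copied from the Results list; names are hypothetical.
val_bpb_10k = 1.645       # model trained on 10-K filings
val_bpb_baseline = 2.146  # same model trained on general web text


def relative_improvement(baseline: float, new: float) -> float:
    """Fractional reduction in bits per byte relative to the baseline."""
    return (baseline - new) / baseline


improvement = relative_improvement(val_bpb_baseline, val_bpb_10k)
print(f"{improvement:.1%}")  # prints 23.3%
```

Lower bits per byte means the model needs fewer bits to encode the same text, so a reduction in val_bpb reads directly as better compression.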

Model: https://huggingface.co/HarryS64/10k-financial-slm

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@harryschaefer93 force-pushed the autoresearch/10k-mar26 branch from b70715a to 60c36a9 on March 27, 2026 14:50
Harry Schaefer and others added 2 commits March 27, 2026 12:17
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Token estimate rebuilt from ground truth: 1,131 filings = 136.7M tokens
(our tokenizer), i.e. 120,910 tokens/filing; × 0.875 GPT conversion =
105,796 GPT tokens/filing; × 79,513 filings = 8.4B total tokens.
All previous cost estimates were ~40% too low.
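
The chain of multiplications in this commit message can be checked with a short sketch. All numbers come from the message itself; the variable names are illustrative, and the 0.875 factor is taken as given rather than derived:

```python
# Figures from the commit message; names are hypothetical.
sample_filings = 1_131
tokens_per_filing = 120_910  # measured with the project's own tokenizer
gpt_conversion = 0.875       # project tokens -> GPT tokens, as stated
total_filings = 79_513

# Sanity check: the sample should total roughly 136.7M tokens.
sample_tokens = sample_filings * tokens_per_filing

gpt_tokens_per_filing = tokens_per_filing * gpt_conversion
total_gpt_tokens = gpt_tokens_per_filing * total_filings

print(f"{sample_tokens / 1e6:.1f}M sample tokens")         # 136.7M
print(f"{gpt_tokens_per_filing:,.0f} GPT tokens/filing")   # 105,796
print(f"{total_gpt_tokens / 1e9:.1f}B total GPT tokens")   # 8.4B
```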

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>