Add a solution of 5,784,824 bytes #19

Merged
agavra merged 18 commits into agavra:main from XinyuZeng:xinyuzeng
Feb 3, 2026

@XinyuZeng

Thanks @agavra for the golf :)! I learned a lot about how to vibe code on a performance problem through this. In particular, after the first closed PR, I fixed some mediocre vibe settings in this new iteration and let the agents self-evolve with observable performance and a trackable optimization history.

XinyuZeng and others added 17 commits February 3, 2026 12:29
Fix MTF performance regression by limiting alphabet to top 4096 frequent
repo names. Infrequent repos use fallback encoding (raw indices).

- Runtime: 2+ minutes → ~12 seconds (10x faster)
- Size: 5,723,601 → 5,784,824 bytes (+61KB, +1.1%)

The full MTF with 261K unique repos was O(n*m) = billions of operations.
Limited alphabet keeps MTF benefits for frequent repos while avoiding
the quadratic blowup for the long tail.
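
The limited-alphabet idea described above can be sketched roughly as follows. This is not the code from the PR: the function names (`build_alphabet`, `mtf_encode`, `mtf_decode`) and the `("mtf", rank)` / `("raw", symbol)` token layout are assumptions made for illustration, and the real solution presumably packs these into bytes.

```python
from collections import Counter


def build_alphabet(symbols, k):
    """Return the k most frequent symbols, most frequent first."""
    return [s for s, _ in Counter(symbols).most_common(k)]


def mtf_encode(symbols, alphabet):
    """Move-to-front over a bounded alphabet; other symbols pass through raw."""
    table = list(alphabet)
    members = set(table)
    out = []
    for s in symbols:
        if s in members:
            rank = table.index(s)  # linear scan, bounded by len(alphabet)
            out.append(("mtf", rank))
            table.insert(0, table.pop(rank))  # move the symbol to the front
        else:
            out.append(("raw", s))  # fallback: emit the symbol index as-is
    return out


def mtf_decode(tokens, alphabet):
    """Invert mtf_encode, given the same starting alphabet."""
    table = list(alphabet)
    out = []
    for kind, value in tokens:
        if kind == "mtf":
            s = table.pop(value)
            table.insert(0, s)
            out.append(s)
        else:
            out.append(value)
    return out
```

With the table capped at k entries, each lookup costs at most k steps regardless of how many distinct repos appear, so the total work is O(n*k) rather than O(n*m).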

Training data had max delta 251 (fits in u8), but test data has
deltas up to 2689+, causing silent truncation and decode failures.

Switch to LEB128 varint encoding which handles arbitrary delta sizes.
Minimal size impact on training data (+3 bytes: 5,784,824 -> 5,784,827).

Tested on 3 random GitHub Archive hours - xinyuzeng ranks #1 on all.
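
For reference, a minimal sketch of unsigned LEB128 encoding, assuming the deltas are non-negative. The function names are hypothetical and this is not the code from the PR. Each byte carries 7 payload bits, and the high bit signals that another byte follows, so a delta of any size round-trips.

```python
def leb128_encode(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F  # low 7 bits of what remains
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def leb128_decode(buf, pos=0):
    """Decode one value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

By contrast, storing 2689 in a single byte keeps only `2689 & 0xFF`, which is the silent truncation described above. The trade-off is that values of 128 and above take two or more bytes.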

@agavra (Owner)

agavra commented Feb 3, 2026

Confirmed over CI/CD, very nicely done @XinyuZeng!

┌────────────────────────┬────────────────┬────────────┐
│ Codec                  │           Size │ vs Naive   │
├────────────────────────┼────────────────┼────────────┤
│ Naive                  │    210,727,389 │   baseline │
│ xinyuzeng              │      5,784,827 │     -97.3% │
└────────────────────────┴────────────────┴────────────┘

@agavra agavra merged commit b0f162f into agavra:main Feb 3, 2026