yimingqiao hits 97.6% compression (5,093,619 bytes) #22

Merged
agavra merged 5 commits into agavra:main from YimingQiao:main
Feb 6, 2026

@YimingQiao YimingQiao commented Feb 6, 2026

This is an incremental improvement on top of kjcao’s codec. The gains come from only three columns:

  • repo.name: split into owner/suffix streams, compress each with lpaq1, then reconstruct during decode
  • event_id: rANS tuning (best precision kept)
  • timestamps: structured delta/run + rANS encoding (kept in its best configuration)
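The stream split and the delta/run step can be illustrated with a minimal sketch. This is not the PR's code: the function names are hypothetical, the split assumes every name has the form `owner/suffix` and that the list is non-empty, `zlib` stands in for lpaq1 (which is not in the Python standard library), and the rANS stage is omitted entirely.

```python
import zlib

def split_repo_names(names):
    """Split 'owner/suffix' strings into two newline-delimited byte streams."""
    owners, suffixes = [], []
    for name in names:
        owner, _, suffix = name.partition("/")  # split at the first '/'
        owners.append(owner)
        suffixes.append(suffix)
    return "\n".join(owners).encode(), "\n".join(suffixes).encode()

def join_repo_names(owner_stream, suffix_stream):
    """Inverse of split_repo_names: zip the two streams back together."""
    owners = owner_stream.decode().split("\n")
    suffixes = suffix_stream.decode().split("\n")
    return [f"{o}/{s}" for o, s in zip(owners, suffixes)]

def delta_run_encode(timestamps):
    """Encode a non-empty int list as (first, [[delta, run_length], ...])."""
    first = timestamps[0]
    runs = []
    prev = first
    for t in timestamps[1:]:
        d = t - prev
        if runs and runs[-1][0] == d:
            runs[-1][1] += 1      # same delta as before: extend the run
        else:
            runs.append([d, 1])   # new delta: start a new run
        prev = t
    return first, runs

def delta_run_decode(first, runs):
    """Inverse of delta_run_encode."""
    out = [first]
    for d, n in runs:
        for _ in range(n):
            out.append(out[-1] + d)
    return out

names = ["torvalds/linux", "torvalds/subsurface", "rust-lang/rust"]
owner_stream, suffix_stream = split_repo_names(names)

# Each stream is compressed on its own (zlib here as a stand-in for lpaq1).
packed = (zlib.compress(owner_stream, 9), zlib.compress(suffix_stream, 9))
restored = join_repo_names(zlib.decompress(packed[0]), zlib.decompress(packed[1]))
assert restored == names

ts = [100, 100, 100, 101, 101, 103]
first, runs = delta_run_encode(ts)
assert delta_run_decode(first, runs) == ts
```

Keeping owners and suffixes in separate streams means a context-modeling compressor sees repeated owners next to each other, which is one plausible reason the split helps; the PR itself does not spell out the mechanism.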

Result

  • yimingqiao size: 5,093,619 bytes (~97.6% compression)

What we tried that didn’t help

  • repo_indices: tried rANS top-k + ESC on the index stream; it lost the local/context patterns that lpaq1 captures, so the size got worse.
  • repo.name: tried sparse rename tables (event-indexed renames) and explicit rename streams; renames are too rare, so metadata overhead dominated.
  • repo.name (owner): tried an owner-prefix dictionary + rANS prefix IDs; the dictionary plus the ID stream cost more than lpaq1 on the raw owner list.
  • repo.name (suffix): tried separating the suffix flags into a standalone rANS stream; it reduced context for the suffix payload and compressed worse overall.

agavra commented Feb 6, 2026

Confirmed in CI/CD! Very impressive. We're so close to breaking 5 MB; it would be very cool if someone did.

┌────────────────────────┬────────────────┬────────────┐
│ Codec                  │           Size │ vs Naive   │
├────────────────────────┼────────────────┼────────────┤
│ Naive                  │    210,727,389 │   baseline │
│ yimingqiao             │      5,093,619 │     -97.6% │
└────────────────────────┴────────────────┴────────────┘
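The reported figures can be checked with a few lines of arithmetic. This sketch assumes "5MB" means 5,000,000 bytes (decimal megabytes); the variable names are illustrative only.

```python
naive_bytes = 210_727_389   # size of the Naive baseline
codec_bytes = 5_093_619     # size of the yimingqiao codec

# Fraction of the baseline removed by the codec.
reduction = 1 - codec_bytes / naive_bytes
print(f"{reduction:.1%}")   # prints 97.6%

# Bytes still to shave off to get under 5,000,000 (assumed meaning of "5MB").
gap = codec_bytes - 5_000_000
print(f"{gap:,}")           # prints 93,619
```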

@agavra agavra merged commit 85429be into agavra:main Feb 6, 2026
1 check failed