
Add samsond codec: 7,564,554 bytes #7

Merged
agavra merged 1 commit into agavra:main from samsond:samsond-codec
Jan 30, 2026

Conversation

samsond (Contributor) commented Jan 30, 2026

Implements sequential delta encoding with first-occurrence dictionary ordering to exploit temporal locality in the event stream.

Approach:

  • Repos indexed by first appearance order for minimal delta jumps
  • Continuous delta encoding across entire stream (no block resets)
  • Lossless (name, id) tuple keys handle duplicate repo names
  • Prefix-compressed dictionaries with zstd(22) final pass
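
As a rough illustration of the prefix-compression bullet, here is a minimal Python sketch of front coding, where each name is stored as the length of the prefix it shares with the previous entry plus the remaining suffix. The function names `front_encode` and `front_decode` are hypothetical and not taken from the PR. Whether the codec's dictionaries are sorted before prefix compression is not stated here, and the zstd(22) final pass is left out.

```python
def front_encode(names: list[str]) -> list[tuple[int, str]]:
    """Encode each name as (length of prefix shared with previous name, suffix)."""
    out: list[tuple[int, str]] = []
    prev = ""
    for name in names:
        # Count leading characters shared with the previous entry.
        n = 0
        limit = min(len(prev), len(name))
        while n < limit and prev[n] == name[n]:
            n += 1
        out.append((n, name[n:]))
        prev = name
    return out


def front_decode(entries: list[tuple[int, str]]) -> list[str]:
    """Invert front_encode by rebuilding each name from the previous one."""
    names: list[str] = []
    prev = ""
    for n, suffix in entries:
        name = prev[:n] + suffix
        names.append(name)
        prev = name
    return names
```

Names that share an owner prefix (for example `rust-lang/cargo` followed by `rust-lang/rust`) only pay for the differing suffix, which should leave less redundancy for the final compression pass to handle.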

Optimized for unsorted (temporal order) input to preserve locality.
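
To make the indexing and delta steps concrete, here is a minimal Python sketch under stated assumptions: repos are keyed by a `(name, id)` tuple, indices are assigned in first-appearance order, and each event stores the difference from the previous event's index with no resets. The names `encode_repo_stream` and `decode_repo_stream` are hypothetical, and the actual byte layout used by the codec is not shown in this conversation.

```python
from typing import Iterable

Repo = tuple[str, int]  # assumed key shape: (repo name, repo id)


def encode_repo_stream(events: Iterable[Repo]) -> tuple[list[Repo], list[int]]:
    """Return (dictionary, deltas) for a stream of repo keys in temporal order."""
    index: dict[Repo, int] = {}      # key -> index in first-appearance order
    dictionary: list[Repo] = []
    deltas: list[int] = []
    prev = 0                         # carried across the whole stream, never reset
    for key in events:
        idx = index.get(key)
        if idx is None:
            # First time this (name, id) pair appears: give it the next index.
            idx = len(dictionary)
            index[key] = idx
            dictionary.append(key)
        deltas.append(idx - prev)
        prev = idx
    return dictionary, deltas


def decode_repo_stream(dictionary: list[Repo], deltas: list[int]) -> list[Repo]:
    """Rebuild the original key sequence by summing the deltas."""
    out: list[Repo] = []
    idx = 0
    for d in deltas:
        idx += d
        out.append(dictionary[idx])
    return out
```

Keying on the full `(name, id)` tuple means two repos that share a name but differ in id get separate dictionary entries, so decoding returns exactly the input sequence.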

agavra (Owner) commented Jan 30, 2026

Congrats @samsond this is great! I confirmed your score in CI/CD:

┌────────────────────────┬────────────────┬────────────┐
│ Codec                  │           Size │ vs Naive   │
├────────────────────────┼────────────────┼────────────┤
│ Naive                  │    210,727,389 │   baseline │
│ samsond                │      7,564,554 │     -96.4% │
└────────────────────────┴────────────────┴────────────┘

CI/CD failed because of format but I can just fix that when I add you to the leaderboard.

@agavra agavra merged commit 8c6dae2 into agavra:main Jan 30, 2026
1 check failed