feat(sea): add per-file compression to SEA archive (closes #250) #251
Merged
robertsLando merged 8 commits into main on Apr 18, 2026
…/GZip/Zstd)

Extends the existing --compress flag to enhanced SEA mode, matching what Standard mode has had for years. Each file in the SEA archive is compressed independently with gzip / brotli / zstd and decompressed lazily at first fs.readFileSync() / require(), so the cold-start cost is proportional to the files actually read, not the full archive.

Measured on claude-code@1.0.100 (node22-linux-x64): 194 MB → 152 MB with --compress Zstd (41 MB saved, no measurable startup regression), and 194 MB → 147 MB with --compress Brotli (~3 min build). Closes most of the size gap between SEA-mode binaries and competitors like Bun.

- lib/compress_type.ts: add Zstd = 3
- lib/index.ts: accept "Zstd"/"zs" at --compress; refuse --compress for simple SEA mode (no walker → nothing to compress)
- lib/producer.ts: wire Zstd compressor into Standard-mode producer too, so the flag is consistent across modes
- lib/sea-assets.ts: compress each entry during archive write; record manifest.compression = numeric CompressType; keep stats[key].size as the uncompressed length so fs.statSync() reports the real file size
- lib/sea.ts, lib/types.ts: thread doCompress through seaEnhanced()
- prelude/bootstrap.js: add Zstd branch to payloadFile/payloadFileSync
- prelude/sea-vfs-setup.js: pick a decompressor once at SEAProvider construction; decompress on first read, cache the result in _fileCache
- test/test-93-sea-compress: build the same fixture with None/GZip/Brotli/Zstd (Zstd gated on zlib.zstdCompressSync availability) and assert every packaged binary prints identical output
- docs: update compression.md, sea-mode.md, sea-vs-standard.md, ARCHITECTURE.md, and vs-bun-deno.md with the new feature and the re-measured claude-code numbers

Closes #250
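The per-file scheme described in this commit can be illustrated with a minimal sketch. It is not the code in lib/sea-assets.ts or prelude/sea-vfs-setup.js: the `offsets` manifest field, the `buildArchive` and `LazyArchive` names, and the `decodeCount` counter are hypothetical, and gzip stands in for whichever codec is selected. Only `stats[key].size` and `_fileCache` are names taken from the commit message.

```javascript
const zlib = require('node:zlib');

// Build side: compress each file on its own and record where its
// compressed stripe lives, plus the uncompressed size for stat calls.
// `offsets` is a hypothetical field name used only in this sketch.
function buildArchive(files) {
  const chunks = [];
  const manifest = { offsets: {}, stats: {} };
  let offset = 0;
  for (const [key, content] of Object.entries(files)) {
    const raw = Buffer.from(content);
    const packed = zlib.gzipSync(raw);
    manifest.offsets[key] = { offset, length: packed.length };
    manifest.stats[key] = { size: raw.length }; // uncompressed length
    chunks.push(packed);
    offset += packed.length;
  }
  return { blob: Buffer.concat(chunks), manifest };
}

// Runtime side: nothing is decompressed until a file is first read.
class LazyArchive {
  constructor(blob, manifest) {
    this._blob = blob;
    this._manifest = manifest;
    this._fileCache = new Map();
    this.decodeCount = 0; // instrumentation for this example only
  }

  readFileSync(key) {
    const cached = this._fileCache.get(key);
    if (cached !== undefined) return cached;
    const entry = this._manifest.offsets[key];
    if (!entry) {
      throw new Error(`ENOENT: no such file in archive, open '${key}'`);
    }
    const stripe = this._blob.subarray(entry.offset, entry.offset + entry.length);
    const data = zlib.gunzipSync(stripe);
    this.decodeCount += 1;
    this._fileCache.set(key, data);
    return data;
  }

  // Reports the real file size without touching the compressed bytes.
  statSize(key) {
    return this._manifest.stats[key].size;
  }
}

const { blob, manifest } = buildArchive({
  '/snapshot/app/index.js': 'console.log("hello");\n',
  '/snapshot/app/unused.js': 'x'.repeat(10000),
});
const archive = new LazyArchive(blob, manifest);
const source = archive.readFileSync('/snapshot/app/index.js').toString();
archive.readFileSync('/snapshot/app/index.js'); // second read hits the cache
console.log(source.trim(), archive.decodeCount);
```

In this sketch `unused.js` is never decompressed, which is the property the commit relies on: startup pays only for the stripes that are actually read.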
…k SEA binaries

The binary-size gap to Bun isn't all archive: ~30 MB of the remaining delta is full-ICU in the stock Node binary that pkg-fetch ships. Spell out that ./configure --without-intl --without-inspector --without-npm --without-corepack --fully-static (a pkg-fetch concern, not a pkg one) would close most of what's left.
…ents

Re-ran all four pkg --sea variants on the same host with consistent methodology (first-run, cold ~/.cache/pkg, /usr/bin/time -f %e for ./binary --version). Bun/Deno rows are unchanged from the morning run.

- None: 979 → 610 ms
- GZip: 590 ms (new)
- Zstd: 560 ms (new)
- Brotli: 590 ms (new)

Compression adds no startup time (≤0 ms) relative to the uncompressed build on this workload because claude-code's --version path only touches a handful of files, so the sync zlib/zstd decode cost is dwarfed by the startup savings from the smaller archive being memory-mapped.
Ran pkg --sea (4 codecs), bun --compile, bun --compile --bytecode, and deno compile on the same host with matching methodology (fresh fixture, cold ~/.cache/pkg, /usr/bin/time -f %e for ./bin --version first run):

| Variant | Startup | Size |
| --- | --- | --- |
| Bun | 510 ms | 108 MB |
| Bun --bytecode | 530 ms | 190 MB |
| pkg --sea | 560 ms | 194 MB |
| pkg --sea --zstd | 570 ms | 152 MB |
| pkg --sea --gzip | 580 ms | 154 MB |
| pkg --sea --brotli | 590 ms | 147 MB |
| Deno | 740 ms | 183 MB |

The previous numbers (797 Bun / 1256 Deno / 979 pkg) were measured on a different run/method, not apples-to-apples. These seven are. Bun is still fastest and smallest; pkg SEA with compression is within ~60 ms of Bun while shipping stock Node.js; Deno is the slowest starter on this workload. Narrative paragraphs updated to match.
…streaming
Security / correctness:
- prelude/sea-vfs-setup.js: cap per-file decompression via maxOutputLength and
assert decompressed length matches manifest stats.size; use Number.isInteger
for offset/length/size bounds (rejects NaN and non-integer floats that the
prior typeof-number guard let through).
- lib/sea-assets.ts: synthesize a stats entry for records that had STORE_CONTENT
but no STORE_STAT, so every compressed stripe has an authoritative size for
the runtime to cross-check against. Make resolveCompressor exhaustive — a new
CompressType without a matching case now fails the build instead of shipping
an archive that claims compression but contains raw bytes.
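A minimal sketch of what an exhaustive resolveCompressor could look like, written in plain JavaScript. The real function lives in TypeScript, where a `never` check in the default branch can turn a missing case into a compile error; this sketch can only fail at run time. `Zstd = 3` comes from this PR; the other numeric values and the error wording are assumptions.

```javascript
const zlib = require('node:zlib');

// Zstd = 3 comes from the PR; the other numeric values are assumptions
// made for this example.
const CompressType = Object.freeze({ None: 0, GZip: 1, Brotli: 2, Zstd: 3 });

// Returns a function that compresses one stripe. Every CompressType must
// have a case; anything else throws instead of silently storing raw bytes
// in an archive that is labelled as compressed.
function resolveCompressor(type) {
  switch (type) {
    case CompressType.None:
      return (buf) => buf;
    case CompressType.GZip:
      return (buf) => zlib.gzipSync(buf);
    case CompressType.Brotli:
      return (buf) => zlib.brotliCompressSync(buf);
    case CompressType.Zstd:
      // zlib.zstdCompressSync only exists on newer Node.js releases.
      if (typeof zlib.zstdCompressSync !== 'function') {
        throw new Error(
          `Zstd compression is not available in Node.js ${process.version}`
        );
      }
      return (buf) => zlib.zstdCompressSync(buf);
    default:
      throw new Error(`Unhandled CompressType: ${String(type)}`);
  }
}

const compressGzip = resolveCompressor(CompressType.GZip);
const packed = compressGzip(Buffer.from('a'.repeat(4096)));
console.log(packed.length < 4096);
```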
Performance:
- lib/sea-assets.ts: restore createReadStream path for unmodified disk-resident
files; the prior always-readFileAsync forced peak RSS to grow with total
asset size even when compression was disabled.
- Resolve the decompressor/compressor exactly once per path: at module load in
prelude/bootstrap.js, at SEAProvider construction in sea-vfs-setup.js, before
the stripe loop in sea-assets.ts, and before Multistream in producer.ts. Fails
fast when the runtime is missing a Zstd API instead of mid-stripe. Skips
_fileCache entirely for uncompressed archives so archive subarrays aren't
pinned unnecessarily.
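The "resolve once, fail fast, skip the cache when uncompressed" behaviour can be sketched as follows. This is an illustration under assumptions, not the contents of prelude/bootstrap-shared.js or sea-vfs-setup.js: the `ProviderSketch` class, the `offsets` manifest field, the error wording, and the numeric values for GZip and Brotli are hypothetical.

```javascript
const zlib = require('node:zlib');

const COMPRESS_NONE = 0;
const COMPRESS_GZIP = 1;   // assumed value
const COMPRESS_BROTLI = 2; // assumed value
const COMPRESS_ZSTD = 3;   // from the PR: Zstd = 3

// Resolved once per provider. Returns null for uncompressed archives and
// throws immediately when the runtime cannot decode the archive's codec.
function pickDecompressorSync(compression) {
  switch (compression) {
    case COMPRESS_NONE:
      return null;
    case COMPRESS_GZIP:
      return (buf) => zlib.gunzipSync(buf);
    case COMPRESS_BROTLI:
      return (buf) => zlib.brotliDecompressSync(buf);
    case COMPRESS_ZSTD:
      if (typeof zlib.zstdDecompressSync !== 'function') {
        throw new Error(
          `This binary needs zlib.zstdDecompressSync, missing in ${process.version}`
        );
      }
      return (buf) => zlib.zstdDecompressSync(buf);
    default:
      throw new Error(`Unknown compression type: ${String(compression)}`);
  }
}

class ProviderSketch {
  constructor(blob, manifest) {
    this._blob = blob;
    this._manifest = manifest;
    // An absent compression field means an uncompressed archive.
    const compression =
      manifest.compression === undefined ? COMPRESS_NONE : manifest.compression;
    this._decompress = pickDecompressorSync(compression);
    // No cache for uncompressed archives: reads return archive subarrays.
    this._fileCache = this._decompress ? new Map() : null;
  }

  read(key) {
    if (this._fileCache !== null) {
      const cached = this._fileCache.get(key);
      if (cached !== undefined) return cached;
    }
    const { offset, length } = this._manifest.offsets[key];
    const stripe = this._blob.subarray(offset, offset + length);
    if (this._decompress === null) return stripe;
    const data = this._decompress(stripe);
    this._fileCache.set(key, data);
    return data;
  }
}

const rawBlob = Buffer.from('plain');
const plain = new ProviderSketch(rawBlob, {
  offsets: { '/a': { offset: 0, length: rawBlob.length } },
});
const gzBlob = zlib.gzipSync('packed');
const gz = new ProviderSketch(gzBlob, {
  compression: COMPRESS_GZIP,
  offsets: { '/a': { offset: 0, length: gzBlob.length } },
});
console.log(plain.read('/a').toString(), gz.read('/a').toString());
```

Resolving the codec in the constructor means an unsupported archive is rejected before any file is served, rather than on whichever read happens to hit a compressed stripe first.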
DRY / surface:
- prelude/bootstrap-shared.js: single source of truth for COMPRESS_* constants,
pickDecompressorSync/Async, and a context-aware zstdMissingError (build-host
vs end-user remediation). Classical bootstrap and SEA VFS both consume it;
the local zlib require in bootstrap.js is gone.
- lib/compress_type.ts: getZstdCompressSync / getZstdCompressStream replace the
duplicated 'zlib as unknown as { ... }' casts in producer.ts and sea-assets.ts
and emit a single build-error string (now also includes process.version).
- lib/help.ts: add Zstd to the --compress description and examples.
- lib/index.ts: the 'invalid compression algorithm' error now lists the real
accepted tokens (None/none, Brotli/br, GZip/gz/gzip, Zstd/zs/zstd); the
compression banner goes through log.info instead of console.log.
Tests:
- test/test-93-sea-compress: assert each compressed binary is at least 50 KB
smaller than the None build so a silent fallback to uncompressed fails the
test (the prior byte-equality check couldn't detect that regression).
- test/test-80-compression: cover --compress Zstd in the classical pipeline
(lib/producer.ts and prelude/bootstrap.js zstd branches) when zlib.createZstdCompress
is available on the build host.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
robertsLando commented Apr 18, 2026
Dead code:
- Drop SeaAssetsResult.entryIsESM — seaEnhanced destructures only
{ assets, manifestPath } and the value is read via manifest.entryIsESM at
runtime, so the return-shape field was carrying a stale copy.
- Drop the 'syscall' parameter from SEAProvider._resolveSymlink: all five
callers pass only the path, and ELOOP is rare enough that hardcoding
err.syscall = 'stat' is fine.
- Drop the 'context' parameter from pickDecompressorSync/Async and merge
zstdMissingError into a single runtime-wording string: only 'runtime' was
ever passed (build-side Zstd errors go through lib/compress_type.ts's own
zstdBuildError).
- Drop unused COMPRESS_GZIP/BROTLI/ZSTD exports from bootstrap-shared —
callers now go through pickDecompressor and only COMPRESS_NONE is read
directly by sea-vfs-setup.
- Remove the redundant process.argv[1] = entrypoint assignment in
sea-bootstrap.js; sea-bootstrap-core.js already sets it to the same value.
- Inline the single-use ZSTD_MISSING_BUILD_REMEDIATION constant.
Hot paths (~30K lookups per startup on large projects):
- toManifestKey: skip the backslash→slash regex on POSIX hosts where paths
already match the manifest shape; keep the replace on win32 where it's
mandatory.
- _resolveSymlink: short-circuit before entering the MAX_SYMLINK_DEPTH loop
when the path isn't a symlink key (the common case).
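Both hot-path changes can be sketched in a few lines. The function shapes here are hypothetical: the sketch takes the platform as a parameter so both branches can be exercised, models symlinks as a simple `Map` of whole paths, and uses an assumed value for MAX_SYMLINK_DEPTH. The real `_resolveSymlink` may resolve paths differently.

```javascript
// Choose the key normalizer once. On POSIX the path already matches the
// manifest shape, so the regex replace is skipped entirely.
function makeToManifestKey(isWin32) {
  return isWin32 ? (p) => p.replace(/\\/g, '/') : (p) => p;
}

const MAX_SYMLINK_DEPTH = 40; // assumed value for this sketch

// `symlinks` maps a link path to its target path (hypothetical layout).
function resolveSymlink(symlinks, p) {
  // Common case: the path is not a symlink key, so skip the loop.
  if (!symlinks.has(p)) return p;
  let current = p;
  for (let depth = 0; depth < MAX_SYMLINK_DEPTH; depth++) {
    const target = symlinks.get(current);
    if (target === undefined) return current;
    current = target;
  }
  const err = new Error(
    `ELOOP: too many symbolic links encountered, stat '${p}'`
  );
  err.code = 'ELOOP';
  err.syscall = 'stat'; // hardcoded, as described in the commit
  throw err;
}

const toWinKey = makeToManifestKey(true);
const toPosixKey = makeToManifestKey(false);
const links = new Map([
  ['/snapshot/link', '/snapshot/mid'],
  ['/snapshot/mid', '/snapshot/real.js'],
  ['/snapshot/loop-a', '/snapshot/loop-b'],
  ['/snapshot/loop-b', '/snapshot/loop-a'],
]);
console.log(toWinKey('C:\\snapshot\\a.js'), resolveSymlink(links, '/snapshot/link'));
```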
Comments:
- sea-assets.ts: rename the Zstd-resolution rationale to point at
zstdBuildError, which is where the wording now lives.
- bootstrap-shared.js: tighten the COMPRESS_NONE comment now that only it
is exported.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Root cause: payload.txt starts with 0x0a; on a Windows checkout git's autocrlf converted it to 0x0d 0x0a, so PAYLOAD.slice(0, 32) contained a leading \r\n that survived in `expected` but got stripped from `actual` via the existing replace(/\r\n/g, '\n'), causing the equality assertion to fail across every Windows job.

Fix:
- Add .gitattributes so payload.txt is checked out LF on every platform; the SEA archive bytes are now deterministic cross-platform, which also keeps the compressed-size assertion stable.
- Normalize CRLF in `expected` as defense-in-depth so an existing Windows clone (cloned before .gitattributes landed) still passes the test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
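The mismatch and the defense-in-depth fix can be reproduced with a small sketch. The payload text and variable names are invented for illustration; only the `replace(/\r\n/g, '\n')` normalization and the `slice(0, 32)` come from the commit message.

```javascript
const normalizeEol = (text) => text.replace(/\r\n/g, '\n');

// What a Windows checkout with autocrlf would hand to the test
// (payload content is made up for this example).
const PAYLOAD = '\r\nline one\r\nline two\r\n';

// The packaged binary prints the first 32 characters of the payload,
// and the test already normalizes its captured output.
const stdout = PAYLOAD.slice(0, 32);
const actual = normalizeEol(stdout);

// Before the fix: expected keeps the raw \r\n, so the comparison fails.
const expectedBefore = PAYLOAD.slice(0, 32);
// After the fix: expected is normalized the same way as actual.
const expectedAfter = normalizeEol(PAYLOAD.slice(0, 32));

console.log(actual === expectedBefore, actual === expectedAfter);
```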
Per review feedback on PR #251: an attacker who can rewrite the SEA blob can also rewrite `manifest.stats[p].size` to match the payload they ship, so the post-decompression `buf.length === expected` check does not survive a consistent tamper. It only fires on accidental corruption, which is a narrow and unlikely case.

Keep `maxOutputLength`: it bounds the zlib allocation up front so a blob with a plausible-but-inflated manifest can't request unbounded memory before we discover the size mismatch. That bound is cheap and standard Node zlib hygiene.

Also keep the `stats.size` validation: `maxOutputLength` requires a finite integer, so NaN / negative / missing values must still be rejected before reaching zlib.

Tightened the comment to reflect the actual threat model (bounded allocation vs. tamper detection) instead of the earlier bomb-defense framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
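A minimal sketch of the retained guard, assuming a manifest shaped like `stats[key].size` and using gzip as the codec. The helper names `validatedSize` and `decompressBounded` are hypothetical. The `Math.max(size, 1)` clamp is an assumption of this sketch, added because Node's zlib is expected to reject a `maxOutputLength` of 0, which an empty file would otherwise produce.

```javascript
const zlib = require('node:zlib');

// Reject anything that is not a non-negative integer before it reaches
// zlib: NaN, floats, negatives, and missing entries all fail here.
function validatedSize(stats, key) {
  const size = stats[key] ? stats[key].size : undefined;
  if (!Number.isInteger(size) || size < 0) {
    throw new Error(`Invalid manifest size for '${key}': ${String(size)}`);
  }
  return size;
}

// Bound the output allocation by the size the manifest declares. This
// limits memory use; it is not tamper detection, since an attacker who
// controls the blob also controls the manifest.
function decompressBounded(stripe, stats, key) {
  const size = validatedSize(stats, key);
  return zlib.gunzipSync(stripe, { maxOutputLength: Math.max(size, 1) });
}

const payload = Buffer.from('x'.repeat(1000));
const stripe = zlib.gzipSync(payload);
const ok = decompressBounded(stripe, { '/a': { size: 1000 } }, '/a');
console.log(ok.length);
```

A stripe that inflates past the declared size makes `gunzipSync` throw instead of growing the output buffer without limit.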
Summary
Ports per-stripe compression to enhanced SEA mode: the SEA archive now supports --compress Brotli, --compress GZip, and --compress Zstd alongside Standard mode. Files are compressed independently at build time and decompressed lazily at first fs.readFileSync() / require(), so the cold-start cost scales with files actually read, not archive size.

Closes #250.
Measured impact (claude-code@1.0.100, node22-linux-x64)
Variants measured:
- pkg --sea
- pkg --sea --compress GZip
- pkg --sea --compress Zstd
- pkg --sea --compress Brotli

Zstd is the recommended default: near-Brotli size at GZip-class build speed. Brotli goes further on size but takes ~3 min to compress the archive for this workload. All three have cold-start overhead within measurement noise (±10 ms best-of-5).
Implementation
- lib/compress_type.ts: extend CompressType with Zstd = 3.
- lib/index.ts: accept "zstd"/"zs" at --compress; refuse it for simple SEA mode (no walker, nothing to compress).
- lib/producer.ts: wire createZstdCompress() into Standard-mode producer too, so the flag is consistent across modes.
- lib/sea-assets.ts: compress each entry when writing the archive; store manifest.compression (numeric CompressType) at the manifest root; keep stats[key].size as the uncompressed length so fs.statSync() reports real file sizes. Absent compression field = uncompressed archive (backward compat with pre-#250 SEA binaries).
- lib/sea.ts, lib/types.ts: thread doCompress through seaEnhanced().
- prelude/bootstrap.js: Zstd branch in payloadFile/payloadFileSync for Standard mode.
- prelude/sea-vfs-setup.js: pick a sync decompressor once at SEAProvider construction time; decompress on first read, cache the result in the existing _fileCache. Clear error if a Zstd-packaged binary runs on a Node without zlib.zstdDecompressSync.

Test plan
- test/test-93-sea-compress builds the same fixture with None/GZip/Brotli/Zstd (skipping Zstd when the test runtime lacks zstdCompressSync) and asserts every packaged binary prints identical output.
- test-85-sea-enhanced and test-86-sea-assets still pass on Node 22.22.1.
- yarn build and yarn lint both clean.
- @anthropic-ai/claude-code@1.0.100: built and ran one binary per codec.

Docs
- docs-site/guide/compression.md: remove "not supported in SEA mode" warning, add Zstd + SEA section.
- docs-site/guide/sea-mode.md: stop claiming SEA "skips compression."
- docs-site/guide/sea-vs-standard.md: compression row now ✅/✅.
- docs-site/guide/vs-bun-deno.md: add --compress rows to the claude-code case study with the numbers above.
- docs/ARCHITECTURE.md: update the performance-comparison row for SEA.