
RFC: Pre-seed BuildKit layer store to reduce build times #86

Closed
hiroTamada wants to merge 1 commit into main from rfc/pre-seed-builder-layers

Conversation

@hiroTamada
Contributor

Summary

  • Proposes pre-seeding the builder VM rootfs with extracted base image layers to eliminate the ~10.6s decompression bottleneck on every code-change deployment
  • Evaluates three approaches: pre-seed at build time, persistent volumes per tenant, shared read-only cache
  • Recommends pre-seeding at build time as the simplest option with ~37% build time reduction

Problem

Every code-change deployment takes ~27s of build time. The single largest cost (~10.6s) is decompressing and extracting base image layers inside the ephemeral builder VM. This happens because:

  1. Builder VMs start with an empty BuildKit content store
  2. Any cache miss (code changed) requires BuildKit to reconstruct the base image filesystem
  3. The registry stores compressed layers, but BuildKit needs uncompressed filesystem trees

Expected Impact

| Scenario | Current | Proposed |
|----------|---------|----------|
| Code change deploy | ~27s build | ~17s build (-37%) |
| Total deploy time | ~50s | ~40s |

Open Questions

  1. Should we pre-seed multiple base images or just nodejs?
  2. Acceptable builder image size increase? (~100-150MB per base image)
  3. Should warm-up be in CI/CD or manual when base images change?

🤖 Generated with Claude Code

Proposes pre-seeding the builder VM rootfs with extracted base image
layers to eliminate the ~10.6s decompression bottleneck on every
code-change deployment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| Step | Time | Notes |
|------|------|-------|
| Base image extract | ~10.6s | Decompress + unpack 16 gzipped tar layers (~100MB) |
Collaborator

9-10 MB/s decompression seems like something might be wrong here 🤔

Normal single-core gzip decompression is 80-150 MB/s, and LZ4 fast decompression can hit 2,000-4,000 MB/s.

So off the bat I'm suspicious that this finding points to something misconfigured, rather than just an expectedly slow extract.
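One way to ground this suspicion: a quick local benchmark of single-core gzip decode. This is a minimal sketch with synthetic all-zero data (highly compressible, so treat the result as an upper bound); file names are illustrative.

```shell
# Benchmark single-core gzip decompression on ~100 MB of synthetic data.
# All-zero input is highly compressible, so this is an upper bound on speed.
head -c 100000000 /dev/zero > sample   # ~100 MB uncompressed
gzip -f sample                         # compress; replaces sample with sample.gz
time gzip -dc sample.gz > /dev/null    # at the observed ~10 MB/s this would take ~10 s
```

If a laptop decodes at 80-150 MB/s while the builder VM sees ~10 MB/s on similar data, the gap points at the VM (I/O, CPU throttling) rather than gzip itself.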


### Why the registry cache doesn't help

The registry stores **compressed tar archives** (gzipped blobs). BuildKit needs **unpacked filesystem trees** (actual directories and files on disk) to execute build steps. The registry cache tells BuildKit *what* the result of each step is (layer digests), but when a step needs to actually execute, BuildKit must reconstruct the filesystem from the compressed layers. The ~10s is the decompression + extraction cost.
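The distinction can be made concrete with a tiny sketch (hypothetical paths, not real BuildKit state): the registry keeps only the gzipped tar blob, while the builder must materialize an on-disk tree before any step can run against it.

```shell
# A registry-style layer blob vs. the unpacked tree a build step needs.
mkdir -p rootfs/usr/bin
printf '#!/bin/sh\n' > rootfs/usr/bin/app
tar -czf layer.tar.gz -C rootfs .   # what the registry stores: a gzipped tar blob
mkdir -p snapshot
tar -xzf layer.tar.gz -C snapshot   # what BuildKit must materialize to run COPY/RUN
ls snapshot/usr/bin/app             # the file now exists as a real path on disk
```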
Collaborator

I'm not sure, but I'm curious whether alternate compression algorithms are supported in BuildKit and/or whether that would work in the current architecture. Like, can we save as zstd instead of gzip? Can we support parallel extraction instead of being single-core bound?
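A local way to gauge the zstd half of this question (a sketch with synthetic data; it assumes `zstd` may or may not be installed, so that path is guarded and the gzip path always runs):

```shell
# Compare gzip vs. zstd decode speed on the same ~50 MB of synthetic data.
head -c 50000000 /dev/zero > blob
gzip -kf blob                            # blob.gz (gzip, as registries commonly store)
time gzip -dc blob.gz > /dev/null        # single-core gzip decode
if command -v zstd >/dev/null 2>&1; then
  zstd -qf blob -o blob.zst
  time zstd -qdc blob.zst > /dev/null    # zstd decode is typically several times faster
fi
```

For the BuildKit side, recent versions do accept a `compression=zstd` option on the image exporter (e.g. `--output type=image,compression=zstd` with buildx), but whether the local registry and runtime in this architecture handle zstd layers would need checking.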

Collaborator

It's probably better to first figure out why it's slow than to try swapping algorithms. Then swap algorithms once it's already going as fast as normal gzip should be.

→ COPY step: cache miss (source files changed)
→ BuildKit needs filesystem to execute COPY
→ Downloads 16 compressed layers from local registry (~0.4s)
→ Decompresses + extracts each layer sequentially (~10.6s) ← BOTTLENECK
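The sequential cost in the trace above can be mimicked locally (illustrative file names; four tiny fake layers standing in for the 16 real ones). Layers must be *applied* in order, but the CPU-bound gunzip step could in principle overlap:

```shell
# Build four small gzipped "layers", then extract sequentially vs.
# decompress in parallel. Layer application order still matters, but
# the gunzip step itself is parallelizable.
mkdir -p layers out-seq
for i in 1 2 3 4; do
  head -c 5000000 /dev/zero > "f$i"      # ~5 MB payload per fake layer
  tar -czf "layers/layer$i.tar.gz" "f$i"
done

# Sequential apply, as BuildKit does (each layer stacks on the previous):
for l in layers/layer1.tar.gz layers/layer2.tar.gz layers/layer3.tar.gz layers/layer4.tar.gz; do
  tar -xzf "$l" -C out-seq
done

# Parallel decompress-only pass, to see how much of the wall time is gunzip:
ls layers/*.tar.gz | xargs -P 4 -I{} sh -c 'gzip -dc {} > /dev/null'
```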
Collaborator

I kind of feel like the pre-seed approach is a workaround for a bottleneck that shouldn't exist. I'm suspicious that this extraction speed of ~10 MB/s is over 10x slower than it ought to be, even for single-core gzip extraction. It may be advisable to understand why it's slow before jumping into a pre-caching / warm-pooling pattern to work around it.

A ~100 MB change should take about 1-2 s to extract, so this just doesn't add up and implies some configuration or VM performance issue.

@hiroTamada hiroTamada closed this Feb 11, 2026
