
RFC: Pre-seed BuildKit layer store to reduce build times #86

Closed
hiroTamada wants to merge 1 commit into main from rfc/pre-seed-builder-layers

Conversation

@hiroTamada
Contributor

Summary

  • Proposes pre-seeding the builder VM rootfs with extracted base image layers to eliminate the ~10.6s decompression bottleneck on every code-change deployment
  • Evaluates three approaches: pre-seed at build time, persistent volumes per tenant, shared read-only cache
  • Recommends pre-seeding at build time as the simplest option with ~37% build time reduction

Problem

Every code-change deployment takes ~27s of build time. The single largest cost (~10.6s) is decompressing and extracting base image layers inside the ephemeral builder VM. This happens because:

  1. Builder VMs start with an empty BuildKit content store
  2. Any cache miss (code changed) requires BuildKit to reconstruct the base image filesystem
  3. The registry stores compressed layers, but BuildKit needs uncompressed filesystem trees

Expected Impact

| Scenario | Current | Proposed |
|----------|---------|----------|
| Code change deploy | ~27s build | ~17s build (-37%) |
| Total deploy time | ~50s | ~40s |

Open Questions

  1. Should we pre-seed multiple base images or just nodejs?
  2. Acceptable builder image size increase? (~100-150MB per base image)
  3. Should warm-up be in CI/CD or manual when base images change?

🤖 Generated with Claude Code

Proposes pre-seeding the builder VM rootfs with extracted base image
layers to eliminate the ~10.6s decompression bottleneck on every
code-change deployment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

| Step | Time | Notes |
|------|------|-------|
| Base image extract | ~10.6s | Decompress + unpack 16 gzipped tar layers (~100MB) |
Collaborator

9-10 MB/s decompression seems like something might be wrong here 🤔

Normal single-core gzip decompression is 80-150 MB/s, and LZ4 fast decompression can hit 2,000-4,000 MB/s.

So off the bat I'm suspicious that this finding points to something misconfigured, rather than just an expectedly slow extract.
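One way to ground this suspicion: a quick local benchmark of single-core gzip decode. This is a minimal sketch with synthetic all-zero data (highly compressible, so treat the result as an upper bound); file names are illustrative.

```shell
# Benchmark single-core gzip decompression on ~100 MB of synthetic data.
# All-zero input is highly compressible, so this is an upper bound on speed.
head -c 100000000 /dev/zero > sample   # ~100 MB uncompressed
gzip -f sample                         # compress; replaces sample with sample.gz
time gzip -dc sample.gz > /dev/null    # at the observed ~10 MB/s this would take ~10 s
```

If a laptop decodes at 80-150 MB/s while the builder VM sees ~10 MB/s on similar data, the gap points at the VM (I/O, CPU throttling) rather than gzip itself.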


### Why the registry cache doesn't help

The registry stores **compressed tar archives** (gzipped blobs). BuildKit needs **unpacked filesystem trees** (actual directories and files on disk) to execute build steps. The registry cache tells BuildKit *what* the result of each step is (layer digests), but when a step needs to actually execute, BuildKit must reconstruct the filesystem from the compressed layers. The ~10s is the decompression + extraction cost.
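The distinction can be made concrete with a tiny sketch (hypothetical paths, not real BuildKit state): the registry keeps only the gzipped tar blob, while the builder must materialize an on-disk tree before any step can run against it.

```shell
# A registry-style layer blob vs. the unpacked tree a build step needs.
mkdir -p rootfs/usr/bin
printf '#!/bin/sh\n' > rootfs/usr/bin/app
tar -czf layer.tar.gz -C rootfs .   # what the registry stores: a gzipped tar blob
mkdir -p snapshot
tar -xzf layer.tar.gz -C snapshot   # what BuildKit must materialize to run COPY/RUN
ls snapshot/usr/bin/app             # the file now exists as a real path on disk
```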
Collaborator

I'm not sure, but I'm curious whether alternate compression algorithms are supported in BuildKit and/or whether that would work in the current architecture. Like, can we save as zstd instead of gzip? Can we support parallel extraction instead of being single-core bound?
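A local way to gauge the zstd half of this question (a sketch with synthetic data; it assumes `zstd` may or may not be installed, so that path is guarded and the gzip path always runs):

```shell
# Compare gzip vs. zstd decode speed on the same ~50 MB of synthetic data.
head -c 50000000 /dev/zero > blob
gzip -kf blob                            # blob.gz (gzip, as registries commonly store)
time gzip -dc blob.gz > /dev/null        # single-core gzip decode
if command -v zstd >/dev/null 2>&1; then
  zstd -qf blob -o blob.zst
  time zstd -qdc blob.zst > /dev/null    # zstd decode is typically several times faster
fi
```

For the BuildKit side, recent versions do accept a `compression=zstd` option on the image exporter (e.g. `--output type=image,compression=zstd` with buildx), but whether the local registry and runtime in this architecture handle zstd layers would need checking.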

Collaborator

It's probably better to first figure out why it's slow than to try swapping algorithms. Then swap algorithms once it's already going as fast as normal gzip should be.

→ COPY step: cache miss (source files changed)
→ BuildKit needs filesystem to execute COPY
→ Downloads 16 compressed layers from local registry (~0.4s)
→ Decompresses + extracts each layer sequentially (~10.6s) ← BOTTLENECK
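The sequential cost in the trace above can be mimicked locally (illustrative file names; four tiny fake layers standing in for the 16 real ones). Layers must be *applied* in order, but the CPU-bound gunzip step could in principle overlap:

```shell
# Build four small gzipped "layers", then extract sequentially vs.
# decompress in parallel. Layer application order still matters, but
# the gunzip step itself is parallelizable.
mkdir -p layers out-seq
for i in 1 2 3 4; do
  head -c 5000000 /dev/zero > "f$i"      # ~5 MB payload per fake layer
  tar -czf "layers/layer$i.tar.gz" "f$i"
done

# Sequential apply, as BuildKit does (each layer stacks on the previous):
for l in layers/layer1.tar.gz layers/layer2.tar.gz layers/layer3.tar.gz layers/layer4.tar.gz; do
  tar -xzf "$l" -C out-seq
done

# Parallel decompress-only pass, to see how much of the wall time is gunzip:
ls layers/*.tar.gz | xargs -P 4 -I{} sh -c 'gzip -dc {} > /dev/null'
```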
Collaborator

I kind of feel like the pre-seed approach is a workaround for a bottleneck that shouldn't exist. I'm suspicious that this extraction speed of ~10 MB/s is over 10x slower than it ought to be, even for single-core gzip extraction. It may be advisable to understand why it's slow before jumping into a pre-caching / warm-pooling pattern to work around it.

A ~100 MB change should take about 1-2 s to extract, so this just doesn't add up and implies some configuration or VM performance issue.

@hiroTamada hiroTamada closed this Feb 11, 2026
