@fmeum
Collaborator

@fmeum fmeum commented Aug 30, 2025

Repositories are cached in a regular remote cache as action cache (AC) entries for a synthetic command, with the predeclared input hash as the salt. Each entry represents the repository as an output file for the marker file and an output directory for the contents.
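
As a rough illustration of the keying scheme described above, here is a minimal Python sketch. It is not Bazel's implementation: the JSON encoding, the synthetic command string `repo-contents-cache`, and the names `RepoCacheEntry`, `synthetic_action_key`, and `InMemoryActionCache` are all assumptions made up for this example. The real cache uses the remote execution protocol's own message encoding.

```python
import hashlib
import json
from typing import Dict, NamedTuple, Optional


class RepoCacheEntry(NamedTuple):
    """Hypothetical stand-in for the cached result of one repo rule."""
    marker_file: bytes          # corresponds to the single output file
    contents: Dict[str, bytes]  # corresponds to the output directory (path -> bytes)


def synthetic_action_key(predeclared_input_hash: str) -> str:
    # Assumed encoding: a fixed synthetic command plus the predeclared
    # input hash as the salt, serialized deterministically and hashed.
    action = {"command": ["repo-contents-cache"], "salt": predeclared_input_hash}
    encoded = json.dumps(action, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class InMemoryActionCache:
    """Toy action cache keyed by the synthetic action digest."""

    def __init__(self) -> None:
        self._entries: Dict[str, RepoCacheEntry] = {}

    def put(self, input_hash: str, entry: RepoCacheEntry) -> None:
        self._entries[synthetic_action_key(input_hash)] = entry

    def get(self, input_hash: str) -> Optional[RepoCacheEntry]:
        # Returns None on a cache miss.
        return self._entries.get(synthetic_action_key(input_hash))
```

Because the command is fixed, only the salt distinguishes entries, so two repositories with the same predeclared inputs map to the same key in this sketch.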

Upon a cache hit, the metadata of the files comprising the repository is downloaded and injected into an in-memory file system that is overlaid on the external directory on the native file system. File contents are only downloaded when Bazel needs to read a file (e.g., a BUILD or .bzl file) or when a file is an input to an action executed locally. This can save both the time taken to execute repo rules and compute file digests and the disk space required to store the contents of external repositories.
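
The lazy behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual in-memory file system from this PR: `FileMetadata`, `LazyRepoOverlay`, and the `fetch_blob` callback are hypothetical names, and a plain dictionary stands in for the remote blob store.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class FileMetadata:
    digest: str  # content hash used to look up the blob remotely
    size: int    # size in bytes, known without downloading


class LazyRepoOverlay:
    """Toy overlay: metadata is injected eagerly, contents are fetched on first read."""

    def __init__(self, metadata: Dict[str, FileMetadata],
                 fetch_blob: Callable[[str], bytes]) -> None:
        self._metadata = dict(metadata)
        self._fetch_blob = fetch_blob
        self._materialized: Dict[str, bytes] = {}
        self.downloads = 0  # number of blobs actually fetched

    def stat(self, path: str) -> FileMetadata:
        # Answered from injected metadata; never triggers a download.
        return self._metadata[path]

    def read(self, path: str) -> bytes:
        # Download the contents only the first time the file is read.
        if path not in self._materialized:
            self._materialized[path] = self._fetch_blob(self._metadata[path].digest)
            self.downloads += 1
        return self._materialized[path]


# A dictionary standing in for the remote cache's blob store (assumption).
remote_blobs = {"d1": b"cc_library(name = 'lib')", "d2": b"\x00" * 1024}
overlay = LazyRepoOverlay(
    {"BUILD": FileMetadata("d1", 24), "big.bin": FileMetadata("d2", 1024)},
    remote_blobs.__getitem__,
)
```

In this model, calling `overlay.stat("big.bin")` costs nothing, while `overlay.read("BUILD")` fetches exactly one blob; files that are never read are never downloaded.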

Heap usage and the output of du -h $(bazel info output_base) after bazel build //src:bazel-dev with a fully up-to-date remote cache:

  • without the flag:
$ bazel info peak-heap-size used-heap-size-after-gc
peak-heap-size: 271MB
used-heap-size-after-gc: 123MB
$ du -h $(bazel info output_base) | tail -1
1.3G	/private/var/tmp/_bazel_fmeum/507738cfc7e6cde00e4a0230e9aa0722
  • with the flag:
$ bazel info peak-heap-size used-heap-size-after-gc
peak-heap-size: 266MB
used-heap-size-after-gc: 120MB
$ du -h $(bazel info output_base) | tail -1
380M	/private/var/tmp/_bazel_fmeum/507738cfc7e6cde00e4a0230e9aa0722
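
To put the measurements above in relative terms, the following Python sketch computes the reductions. It assumes that du -h reports sizes in 1024-based units and works from the rounded figures shown, so the results are approximate; the helper names `to_mib` and `reduction` are made up for this example.

```python
def to_mib(value: str) -> float:
    """Convert a du -h style size such as '1.3G' or '380M' to MiB (1024-based, assumed)."""
    units = {"K": 1 / 1024, "M": 1.0, "G": 1024.0, "T": 1024.0 * 1024.0}
    return float(value[:-1]) * units[value[-1]]


def reduction(before: float, after: float) -> float:
    """Fraction saved when going from `before` to `after`."""
    return 1.0 - after / before


# Figures copied from the measurements above.
disk_saving = reduction(to_mib("1.3G"), to_mib("380M"))
peak_heap_saving = reduction(271, 266)
retained_heap_saving = reduction(123, 120)

print(f"disk: {disk_saving:.0%}")
print(f"peak heap: {peak_heap_saving:.0%}")
print(f"heap after GC: {retained_heap_saving:.0%}")
```

With these inputs the output base shrinks by about 71%, while both heap figures change by only about 2%.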

Some repos are still materialized eagerly, which may not be necessary. Patched http_archive repos also can't be cached yet, so these numbers are likely to improve with further work on this feature.

TODO:

  • Should the debug events be removed?
  • Remove repo cache entries without a matching recorded inputs file during GC

Fixes #6359
Fixes #22366

RELNOTES[NEW]: The results of reproducible repository rules without dependencies added at runtime (e.g., via repository_ctx.watch or .getenv) can now be cached in a regular HTTP or gRPC remote cache if the new --experimental_remote_repo_contents_cache startup option is provided.

@fmeum fmeum force-pushed the 6359-remote-repo-contents-cache branch 2 times, most recently from bb85d69 to 17ed3a6 Compare September 1, 2025 13:53
@fmeum fmeum force-pushed the 6359-remote-repo-contents-cache branch 3 times, most recently from b79c111 to 17af177 Compare September 2, 2025 17:45
@fmeum fmeum changed the title from "Add an experimental remote repo contents cache" to "Add a remote repo contents cache" Sep 3, 2025
@fmeum fmeum force-pushed the 6359-remote-repo-contents-cache branch 6 times, most recently from 5b3c3d0 to 1ba11a0 Compare September 4, 2025 14:42
@fmeum fmeum marked this pull request as ready for review September 6, 2025 19:30
@fmeum fmeum requested review from a team, Wyverald and meteorcloudy as code owners September 6, 2025 19:30
@fmeum fmeum requested review from gregestren and removed request for a team September 6, 2025 19:30
@github-actions github-actions bot added the team-Performance, team-Configurability, team-ExternalDeps, team-Rules-CPP, team-Remote-Exec, and awaiting-review labels Sep 6, 2025
@fmeum fmeum force-pushed the 6359-remote-repo-contents-cache branch 3 times, most recently from 3a0b2e0 to 9cfb892 Compare September 12, 2025 14:14
Member

@Wyverald Wyverald left a comment

some initial comments. will review deeper later, probably after the PR split (see below)

@fmeum fmeum force-pushed the 6359-remote-repo-contents-cache branch 3 times, most recently from a3e7e98 to 2057ed6 Compare October 24, 2025 12:37
@fmeum fmeum requested a review from Wyverald October 24, 2025 12:40
@fmeum
Collaborator Author

fmeum commented Oct 24, 2025

Looks like I broke a test. I'll investigate that and also improve the recovery in case of a failed materialization.

@fmeum
Collaborator Author

fmeum commented Oct 24, 2025

@Wyverald Should be fixed. Failing materialization should not have written out the marker file, so this should be fine as well.

@Wyverald
Member

Nice. I'll start the import myself.

@iancha1992
Member

iancha1992 commented Oct 28, 2025

@fmeum Bazel 9.0.0 rc1 cut is scheduled for Thursday. Please merge your fix by then to include it in rc1, or it will be cherry-picked into a later RC. If the fix won't make the Bazel 9 release deadline, then please remove this issue from the milestone. Thanks!

cc: @sluongng @meisterT @Wyverald

@pzembrod pzembrod removed the team-Rules-CPP label Oct 30, 2025
@fmeum fmeum force-pushed the 6359-remote-repo-contents-cache branch from cf6a283 to 1ad9f54 Compare October 30, 2025 19:21
@fmeum fmeum requested a review from tjgq October 30, 2025 19:22
@iancha1992
Member

Please note: The rc1 cut date has been rescheduled to Monday, November 3rd.

@copybara-service copybara-service bot closed this in b8589c3 Nov 3, 2025
@github-actions github-actions bot removed the awaiting-review label Nov 3, 2025
@iancha1992 iancha1992 removed this from the 9.0.0 release blockers milestone Nov 17, 2025
@iancha1992
Member

@bazel-io fork 9.0.0

@fmeum fmeum deleted the 6359-remote-repo-contents-cache branch November 18, 2025 19:47
modularbot pushed a commit to modular/modular that referenced this pull request Dec 13, 2025
This cache supports some results of repo rules being cached remotely

Details: bazelbuild/bazel#26860
MODULAR_ORIG_COMMIT_REV_ID: c1ce81166e00b5a888a0922aa4e19f6581fabf15
Labels

  • team-Configurability: platforms, toolchains, cquery, select(), config transitions
  • team-ExternalDeps: External dependency handling, remote repositories, WORKSPACE file.
  • team-Performance: Issues for Performance teams
  • team-Remote-Exec: Issues and PRs for the Execution (Remote) team

Development

Successfully merging this pull request may close these issues.

  • Make build without bytes take into account http_file
  • Allow using remote cache for repository cache

8 participants