Register non-recursive watchers for Linux #12176

Merged
kevinyang372 merged 6 commits into master from kevin/register-non-recursive-watcher-for-linux
Jun 5, 2026

Conversation

@kevinyang372
Member

@kevinyang372 kevinyang372 commented Jun 3, 2026

Description

On Linux, navigating a remote (or local) session into a large non-git directory such as /workspaces registered a recursive filesystem watch over the entire subtree. Because a non-git parent has no root .gitignore, the gitignore-based prune (PR #12122) has nothing to prune, so every nested repo's target/, node_modules/, etc. got watched — 20,000+ inotify watches and the ~11 GB remote-server daemon memory blowup a customer reported.

This PR makes the watch for lazy-loaded (non-git) standalone roots as lazy as the directory tree itself, on Linux only: instead of one recursive watch on the root, we watch only the directories that are actually loaded, each NonRecursive, and add a watch for each subdirectory as the user expands it. Memory now scales with what the user expands rather than the whole subtree.
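
A minimal sketch of that idea, not the code in this PR: `FakeWatcher` and `LazyRoot` are hypothetical stand-ins (the real watcher type and model live in `crates/repo_metadata`). It only shows that the number of registered watches follows what has been expanded.

```rust
use std::collections::BTreeSet;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the real filesystem watcher; it only records calls.
#[derive(Default)]
struct FakeWatcher {
    non_recursive: BTreeSet<PathBuf>,
}

impl FakeWatcher {
    fn watch_non_recursive(&mut self, dir: &Path) {
        self.non_recursive.insert(dir.to_path_buf());
    }
}

/// Hypothetical per-root state: the root plus every subdirectory expanded so far.
struct LazyRoot {
    root: PathBuf,
    watched_dirs: BTreeSet<PathBuf>,
}

impl LazyRoot {
    /// Registering the root adds exactly one non-recursive watch.
    fn new(root: &Path, watcher: &mut FakeWatcher) -> Self {
        watcher.watch_non_recursive(root);
        let mut watched_dirs = BTreeSet::new();
        watched_dirs.insert(root.to_path_buf());
        Self { root: root.to_path_buf(), watched_dirs }
    }

    /// Called when the user expands `dir`. Returns true if a new watch was added,
    /// false if `dir` is outside the root or already watched.
    fn expand(&mut self, dir: &Path, watcher: &mut FakeWatcher) -> bool {
        if !dir.starts_with(&self.root) || !self.watched_dirs.insert(dir.to_path_buf()) {
            return false;
        }
        watcher.watch_non_recursive(dir);
        true
    }
}
```

Expanding the same folder twice is a no-op in this sketch, so the watch count is bounded by the number of distinct expanded directories.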

Platform scoping

The per-directory inotify cost is Linux-specific. macOS (FSEvents) and Windows (ReadDirectoryChangesW) watch a tree recursively with a single OS handle, so recursive watching there is cheap and unaffected. Git repos remain recursively watched on all platforms (they rely on gitignore pruning for correctness).
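
The scoping rule can be written as a small decision function. This is an illustrative sketch only; `RootKind`, `WatchDepth`, and both function names are made up for the example and are not identifiers from the PR.

```rust
/// Hypothetical classification of a watched root.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RootKind {
    GitRepo,
    LazyNonGit,
}

/// Hypothetical mirror of a watcher's recursive/non-recursive setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WatchDepth {
    Recursive,
    NonRecursive,
}

/// Only lazy non-git roots on Linux get per-directory watches;
/// everything else keeps a single recursive watch on the root.
fn watch_depth(kind: RootKind, is_linux: bool) -> WatchDepth {
    match (kind, is_linux) {
        (RootKind::LazyNonGit, true) => WatchDepth::NonRecursive,
        _ => WatchDepth::Recursive,
    }
}

/// Same rule, resolved for the platform this binary was compiled for.
fn watch_depth_for_host(kind: RootKind) -> WatchDepth {
    watch_depth(kind, cfg!(target_os = "linux"))
}
```

Taking `is_linux` as a parameter keeps the rule testable on any platform; only the thin wrapper depends on the compile target.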

Key changes (crates/repo_metadata/src/local_model.rs)

  • Introduced a RepoWatchMode enum that is the stored per-repo watch state: RecursiveOnRoot, or LazyNonRecursive { watched_dirs } which carries the set of directories currently watched under a lazy root (root + expanded subdirs).
  • add_repository_internal takes the mode, maps it to the watcher's RecursiveMode, records it, and (when replacing a prior lazy registration, e.g. a lazy root upgraded to a git repo) unregisters stale per-directory watches.
  • index_lazy_loaded_path registers the root NonRecursive on Linux (Recursive elsewhere).
  • load_directory and the watcher-event handler register a NonRecursive watch on each newly loaded subdirectory and record it.
  • remove_repository matches on the recorded mode to unregister every tracked per-directory watch (lazy) or just the root (recursive).
  • Event processing is unchanged: a single BulkFilesystemWatcher/debouncer funnels all events, and handle_watcher_event routes each path to its owning root by longest-prefix match, so non-recursive watches need no new plumbing.
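
A rough sketch of how the pieces above could fit together, assuming a plain `HashMap` keyed by root path. The enum and variant names follow the description; `paths_to_unwatch` and `owning_root` are hypothetical helpers written for this example, and the real code uses `StandardizedPath` rather than `PathBuf`.

```rust
use std::collections::{BTreeSet, HashMap};
use std::path::{Path, PathBuf};

/// Per-repo watch state, as named in the description above.
enum RepoWatchMode {
    RecursiveOnRoot,
    LazyNonRecursive { watched_dirs: BTreeSet<PathBuf> },
}

/// Hypothetical helper: which paths must be unregistered when `root` is removed.
fn paths_to_unwatch(root: &Path, mode: &RepoWatchMode) -> Vec<PathBuf> {
    match mode {
        // One recursive watch covers the subtree, so only the root is unregistered.
        RepoWatchMode::RecursiveOnRoot => vec![root.to_path_buf()],
        // Every tracked per-directory watch has to be torn down individually.
        RepoWatchMode::LazyNonRecursive { watched_dirs } => {
            watched_dirs.iter().cloned().collect()
        }
    }
}

/// Hypothetical helper: route an event path to its owning root by longest-prefix match.
/// `Path::starts_with` compares whole components, so `/a/bc` is not under `/a/b`.
fn owning_root<'a>(
    roots: &'a HashMap<PathBuf, RepoWatchMode>,
    event_path: &Path,
) -> Option<&'a Path> {
    roots
        .keys()
        .filter(|root| event_path.starts_with(root))
        .max_by_key(|root| root.components().count())
        .map(PathBuf::as_path)
}
```

Because routing depends only on the event path, it does not matter whether the event came from a recursive or a non-recursive watch.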

Linked Issue

  • The linked issue is labeled ready-to-spec or ready-to-implement.
  • Where appropriate, screenshots or a short video of the implementation are included below (especially for user-visible or UI changes).

This is a follow-up to #12122 (gitignore-aware pruning for git-repo recursive watches); this PR bounds non-git lazy roots, which that prune cannot help.

Testing

  • Added unit tests in crates/repo_metadata/src/local_model_tests.rs:

    • index_lazy_loaded_path_tracks_only_root — lazy root records LazyNonRecursive with only the root on Linux, RecursiveOnRoot elsewhere.
    • load_directory_tracks_expanded_subdir_for_lazy_root — expanding a subdir adds its watch on Linux.
    • recursive_repo_uses_recursive_watch_mode — git repos record RecursiveOnRoot and aren't tracked as lazy.
    • remove_lazy_loaded_path_clears_tracked_watches — teardown clears all tracked watches and repo state.
  • cargo nextest run -p repo_metadata --features local_fs (new + adjacent tests pass), cargo check -p repo_metadata (default, non-local_fs), ./script/format, and cargo clippy -p repo_metadata --features local_fs --all-targets -- -D warnings all clean.

  • Manual: built and deployed the remote-server daemon to a Linux devbox, ran cd /workspaces, and confirmed via [watch_inode_debug] logging (added in #12122, "Stop watching gitignored directories in the repo file watcher") and /proc/<pid>/fdinfo that the watched-directory count drops from ~20,000 to a small number that grows only as folders are expanded; git repos opened at their root are unaffected.

  • I have manually tested my changes locally
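
For anyone repeating the /proc/<pid>/fdinfo check, here is a small sketch of one way to count watches. It relies on the Linux procfs convention that each watch on an inotify file descriptor appears as a line starting with `inotify wd:`; the function names are made up for this example and it is not the tooling used for the manual test above.

```rust
use std::fs;
use std::io;

/// Count the `inotify wd:` lines in the text of one fdinfo file.
fn count_inotify_watches_in(fdinfo: &str) -> usize {
    fdinfo
        .lines()
        .filter(|line| line.starts_with("inotify wd:"))
        .count()
}

/// Sum inotify watches across every file descriptor of `pid` (Linux only).
fn count_inotify_watches(pid: u32) -> io::Result<usize> {
    let mut total = 0;
    for entry in fs::read_dir(format!("/proc/{pid}/fdinfo"))? {
        // A descriptor can close between listing and reading; skip those.
        if let Ok(text) = fs::read_to_string(entry?.path()) {
            total += count_inotify_watches_in(&text);
        }
    }
    Ok(total)
}
```

Running `count_inotify_watches` against the daemon's pid before and after expanding a few folders should show the count growing by roughly one per expanded directory.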

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

CHANGELOG-BUG-FIX: Fixed excessive memory usage (and inotify watch exhaustion) on Linux when a Warp session navigated into a large non-git directory; lazy directory roots are now watched incrementally as folders are expanded instead of recursively up front.

@oz-for-oss
Contributor

oz-for-oss Bot commented Jun 3, 2026

@kevinyang372

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Member Author

kevinyang372 commented Jun 3, 2026

@kevinyang372 kevinyang372 requested a review from alokedesai June 3, 2026 22:29
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Overview

This PR changes repo metadata watching so lazy non-git roots on Linux use non-recursive filesystem watches, with model-level tests for watch-mode bookkeeping.

Concerns

  • ⚠️ Lazy non-recursive watch bookkeeping only adds watched directories. It does not unregister/remove deleted or moved expanded directories, leaving stale paths that block re-registration after recreation.
  • ⚠️ Live-created directory subtrees can be inserted as loaded, but only the subtree root gets a non-recursive watch, so loaded descendants do not receive live updates.
  • 💡 The PR changes user-visible Linux file-tree refresh behavior, but the description does not include manual testing evidence; please attach a screenshot or short screen recording demonstrating lazy folder expansion and live updates end to end.

Verdict

Found: 0 critical, 2 important, 1 suggestion

Request changes

Comment thread crates/repo_metadata/src/local_model.rs
.iter()
.filter_map(|mutation| {
    let path = match mutation {
        FileTreeMutation::AddDirectorySubtree {
Contributor

⚠️ [IMPORTANT] AddDirectorySubtree inserts the whole built subtree, but this collection only registers the subtree root. Loaded descendant directories under a live-created folder remain unwatched with non-recursive mode, so changes inside them are missed; register every loaded directory in the inserted subtree or keep descendants unloaded.

Member Author

This is a bug in the current lazy loading file tree behavior -- we shouldn't be eagerly expanding the subtree on file updates when the tree itself is lazy loaded

Member

ohh so the code from the other PR removes this and fixes it at its source? if so, makes sense
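
For reference, the first option raised at the top of this thread (register every loaded directory in an inserted subtree) could look roughly like the sketch below. `DirNode` is a hypothetical simplification of the file-tree entry type, and this is not what was merged; per the reply above, the eager expansion itself is considered the bug.

```rust
use std::path::PathBuf;

/// Hypothetical, simplified directory node of a lazily loaded file tree.
struct DirNode {
    path: PathBuf,
    loaded: bool,
    children: Vec<DirNode>,
}

/// Collect every loaded directory in the subtree, parents before children.
/// Unloaded directories are skipped along with everything beneath them.
fn loaded_dirs(node: &DirNode) -> Vec<PathBuf> {
    fn walk(node: &DirNode, out: &mut Vec<PathBuf>) {
        if !node.loaded {
            return;
        }
        out.push(node.path.clone());
        for child in &node.children {
            walk(child, out);
        }
    }
    let mut out = Vec::new();
    walk(node, &mut out);
    out
}
```

Each returned path would then need its own non-recursive watch, which is the extra bookkeeping the alternative (keeping descendants unloaded) avoids.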

@kevinyang372 kevinyang372 force-pushed the kevin/register-non-recursive-watcher-for-linux branch from 5c9a4e8 to ce79cff Compare June 4, 2026 04:06
Base automatically changed from kevin/watcher-gitignore-prune to master June 4, 2026 04:26
@kevinyang372 kevinyang372 force-pushed the kevin/register-non-recursive-watcher-for-linux branch 2 times, most recently from 347c5d8 to f4f3bd7 Compare June 4, 2026 16:06
@kevinyang372 kevinyang372 force-pushed the kevin/register-non-recursive-watcher-for-linux branch from 909261e to cbd84ec Compare June 4, 2026 20:18
Comment thread crates/repo_metadata/src/entry.rs Outdated
enum RootWatchMode {
    /// A single recursive watch on the root covers the whole subtree. Used for
    /// git repos (which rely on gitignore pruning in the watch descend filter)
    /// and, on macOS/Windows, for lazy non-git roots where a recursive OS watch
Member

Is there any reason we want to do this on Mac / windows for a lazy repo? It still means that we need to do a bunch of work on the cpu when a file changes, right?

Member Author

This avoids additional tree scans and keeping more state for non-recursive watchers.

For Mac / Windows, there is little extra cost for registering a recursive watcher on the root node. I think it's worth keeping this behavior untouched

/// including any on-demand per-directory watches recorded for teardown. See
/// [`RepoWatch`].
#[cfg(feature = "local_fs")]
repo_watches: HashMap<StandardizedPath, RepoWatch>,
Member

I feel like this is really just a sumtree of watched directories (that also includes the explicitly watched directories). Not blocking in any way, just the RepoWatch abstraction with extra_dirs is a little clunky IMO

Member Author

Yeah agreed

@alokedesai
Member

approved but did not review this very closely. Biggest thought is that the extra_dir logic is extremely complicated and makes the logic very hard to follow. I think

@kevinyang372 kevinyang372 force-pushed the kevin/register-non-recursive-watcher-for-linux branch from 49fb6f8 to b614b4e Compare June 5, 2026 03:22
@kevinyang372 kevinyang372 enabled auto-merge (squash) June 5, 2026 03:22
@kevinyang372
Member Author

@alokedesai I tried iterating on a few different approaches to clean it up and it's not straightforward. I do think the best path forward is to follow the sumtree based approach

@kevinyang372 kevinyang372 merged commit 4264275 into master Jun 5, 2026
28 checks passed
@kevinyang372 kevinyang372 deleted the kevin/register-non-recursive-watcher-for-linux branch June 5, 2026 03:39