Register non-recursive watchers for Linux#12176
Conversation
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
This stack of pull requests is managed by Graphite. Learn more about stacking. |
There was a problem hiding this comment.
Overview
This PR changes repo metadata watching so lazy non-git roots on Linux use non-recursive filesystem watches, with model-level tests for watch-mode bookkeeping.
Concerns
⚠️ Lazy non-recursive watch bookkeeping only adds watched directories. It does not unregister/remove deleted or moved expanded directories, leaving stale paths that block re-registration after recreation.⚠️ Live-created directory subtrees can be inserted as loaded, but only the subtree root gets a non-recursive watch, so loaded descendants do not receive live updates.- 💡 The PR changes user-visible Linux file-tree refresh behavior, but the description does not include manual testing evidence; please attach a screenshot or short screen recording demonstrating lazy folder expansion and live updates end to end.
Verdict
Found: 0 critical, 2 important, 1 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| .iter() | ||
| .filter_map(|mutation| { | ||
| let path = match mutation { | ||
| FileTreeMutation::AddDirectorySubtree { |
There was a problem hiding this comment.
AddDirectorySubtree inserts the whole built subtree, but this collection only registers the subtree root. Loaded descendant directories under a live-created folder remain unwatched with non-recursive mode, so changes inside them are missed; register every loaded directory in the inserted subtree or keep descendants unloaded.
There was a problem hiding this comment.
This is a bug in the current lazy loading file tree behavior -- we shouldn't be eagerly expanding the subtree on file updates when the tree itself is lazy loaded
There was a problem hiding this comment.
ohh so the code from the other PR removes this and fixes it at its source? if so, makes sense
5c9a4e8 to
ce79cff
Compare
347c5d8 to
f4f3bd7
Compare
909261e to
cbd84ec
Compare
| enum RootWatchMode { | ||
| /// A single recursive watch on the root covers the whole subtree. Used for | ||
| /// git repos (which rely on gitignore pruning in the watch descend filter) | ||
| /// and, on macOS/Windows, for lazy non-git roots where a recursive OS watch |
There was a problem hiding this comment.
Is there any reason we want to do this on Mac / windows for a lazy repo? It still means that we need to do a bunch of work on the cpu when a file changes, right?
There was a problem hiding this comment.
This avoids additional tree scans and keeping more states for non-recursive watchers
For Mac / Windows, there is little extra cost for registering a recursive watcher on the root node. I think it's worth keeping this behavior untouched
| /// including any on-demand per-directory watches recorded for teardown. See | ||
| /// [`RepoWatch`]. | ||
| #[cfg(feature = "local_fs")] | ||
| repo_watches: HashMap<StandardizedPath, RepoWatch>, |
There was a problem hiding this comment.
I feel like this is really just a sumtree of watched directories (that also includes the explicitly watched directories). Not blocking in any way, just the RepoWatch abstraction with extra_dirs is a little clunky IMO
| .iter() | ||
| .filter_map(|mutation| { | ||
| let path = match mutation { | ||
| FileTreeMutation::AddDirectorySubtree { |
There was a problem hiding this comment.
ohh so the code from the other PR removes this and fixes it at its source? if so, makes sense
|
approved but did not review this very closely. Biggest thought is that the extra_dir logic is extremely complicated and makes the logic very hard to follow. I think |
49fb6f8 to
b614b4e
Compare
|
@alokedesai I tried iterating on a few different approaches to clean it up and it's not straightforward. I do think the best path forward is to follow the sumtree based approach |

Description
On Linux, navigating a remote (or local) session into a large non-git directory such as
/workspacesregistered a recursive filesystem watch over the entire subtree. Because a non-git parent has no root.gitignore, the gitignore-based prune (PR #12122) has nothing to prune, so every nested repo'starget/,node_modules/, etc. got watched — 20,000+ inotify watches and the ~11 GB remote-server daemon memory blowup a customer reported.This PR makes the watch for lazy-loaded (non-git) standalone roots as lazy as the directory tree itself, on Linux only: instead of one recursive watch on the root, we watch only the directories that are actually loaded, each
NonRecursive, and add a watch for each subdirectory as the user expands it. Memory now scales with what the user expands rather than the whole subtree.Platform scoping
The per-directory inotify cost is Linux-specific. macOS (FSEvents) and Windows (ReadDirectoryChangesW) watch a tree recursively with a single OS handle, so recursive watching there is cheap and unaffected. Git repos remain recursively watched on all platforms (they rely on gitignore pruning for correctness).
Key changes (
crates/repo_metadata/src/local_model.rs)RepoWatchModeenum that is the stored per-repo watch state:RecursiveOnRoot, orLazyNonRecursive { watched_dirs }which carries the set of directories currently watched under a lazy root (root + expanded subdirs).add_repository_internaltakes the mode, maps it to the watcher'sRecursiveMode, records it, and (when replacing a prior lazy registration, e.g. a lazy root upgraded to a git repo) unregisters stale per-directory watches.index_lazy_loaded_pathregisters the rootNonRecursiveon Linux (Recursiveelsewhere).load_directoryand the watcher-event handler register aNonRecursivewatch on each newly loaded subdirectory and record it.remove_repositorymatches on the recorded mode to unregister every tracked per-directory watch (lazy) or just the root (recursive).BulkFilesystemWatcher/debouncer funnels all events, andhandle_watcher_eventroutes each path to its owning root by longest-prefix match, so non-recursive watches need no new plumbing.Linked Issue
ready-to-specorready-to-implement.This is a follow-up to #12122 (gitignore-aware pruning for git-repo recursive watches); this PR bounds non-git lazy roots, which that prune cannot help.
Testing
Added unit tests in
crates/repo_metadata/src/local_model_tests.rs:index_lazy_loaded_path_tracks_only_root— lazy root recordsLazyNonRecursivewith only the root on Linux,RecursiveOnRootelsewhere.load_directory_tracks_expanded_subdir_for_lazy_root— expanding a subdir adds its watch on Linux.recursive_repo_uses_recursive_watch_mode— git repos recordRecursiveOnRootand aren't tracked as lazy.remove_lazy_loaded_path_clears_tracked_watches— teardown clears all tracked watches and repo state.cargo nextest run -p repo_metadata --features local_fs(new + adjacent tests pass),cargo check -p repo_metadata(default, non-local_fs),./script/format, andcargo clippy -p repo_metadata --features local_fs --all-targets -- -D warningsall clean.Manual: built/deployed the remote-server daemon to a Linux devbox,
cd /workspaces, and confirmed via[watch_inode_debug]logging (added in Stop watching gitignored directories in the repo file watcher #12122) and/proc/<pid>/fdinfothat the watched-directory count drops from ~20,000 to a small number that grows only as folders are expanded; git repos opened at their root are unaffected.I have manually tested my changes locally
Agent Mode
CHANGELOG-BUG-FIX: Fixed excessive memory usage (and inotify watch exhaustion) on Linux when a Warp session navigated into a large non-git directory; lazy directory roots are now watched incrementally as folders are expanded instead of recursively up front.