Stop watching gitignored directories in the repo file watcher #12122
Overview
This PR updates the repository file watcher to prune gitignored directories during recursive watch registration while preserving explicitly registered ignored-path interests such as skill directories. The implementation wires the new gitignore-aware descend filter into both repository watcher paths and adds focused unit coverage for ignored directories, interests, nested interests, re-include negations, and .git allowlist behavior.
Concerns
- No blocking correctness, security, or spec-alignment concerns found in the annotated diff.
Verdict
Found: 0 critical, 0 important, 0 suggestions
Approve
Powered by Oz
ca30a85 to ec4a5b5
alokedesai left a comment
My main question is whether this still allows watching a gitignored directory if it's lazy-loaded. That's obviously important for us to continue to support.
Make the repository file watcher's descend predicate gitignore-aware so we no longer register inotify watches on gitignored directories (node_modules, build output, vendored deps), while still watching registered ignored-path interests (e.g. skill provider dirs). Fixes excessive memory usage observed on large monorepos. Co-Authored-By: Warp <agent@warp.dev>
ec4a5b5 to 10c4d2e
@alokedesai There shouldn't be a regression in most cases: macOS and Windows don't go through the tree-descend code path to register inode watchers. One case that won't work after this is expanding a gitignored folder in the Linux file tree. This will be fixed in a follow-up to this PR (we will register a non-recursive watcher as the user expands the folder).
## Description

On Linux, navigating a remote (or local) session into a large non-git directory such as `/workspaces` registered a **recursive** filesystem watch over the entire subtree. Because a non-git parent has no root `.gitignore`, the gitignore-based prune (PR #12122) has nothing to prune, so every nested repo's `target/`, `node_modules/`, etc. got watched: 20,000+ inotify watches and the ~11 GB remote-server daemon memory blowup a customer reported.

This PR makes the watch for lazy-loaded (non-git) standalone roots as lazy as the directory tree itself, **on Linux only**: instead of one recursive watch on the root, we watch only the directories that are actually loaded, each `NonRecursive`, and add a watch for each subdirectory as the user expands it. Memory now scales with what the user expands rather than the whole subtree.

### Platform scoping

The per-directory inotify cost is Linux-specific. macOS (FSEvents) and Windows (ReadDirectoryChangesW) watch a tree recursively with a single OS handle, so recursive watching there is cheap and unaffected. Git repos remain recursively watched on all platforms (they rely on gitignore pruning for correctness).

### Key changes (`crates/repo_metadata/src/local_model.rs`)

- Introduced a `RepoWatchMode` enum that is the stored per-repo watch state: `RecursiveOnRoot`, or `LazyNonRecursive { watched_dirs }` which carries the set of directories currently watched under a lazy root (root + expanded subdirs).
- `add_repository_internal` takes the mode, maps it to the watcher's `RecursiveMode`, records it, and (when replacing a prior lazy registration, e.g. a lazy root upgraded to a git repo) unregisters stale per-directory watches.
- `index_lazy_loaded_path` registers the root `NonRecursive` on Linux (`Recursive` elsewhere).
- `load_directory` and the watcher-event handler register a `NonRecursive` watch on each newly loaded subdirectory and record it.
- `remove_repository` matches on the recorded mode to unregister every tracked per-directory watch (lazy) or just the root (recursive).
- Event processing is unchanged: a single `BulkFilesystemWatcher`/debouncer funnels all events, and `handle_watcher_event` routes each path to its owning root by longest-prefix match, so non-recursive watches need no new plumbing.

## Linked Issue

<!-- Link the GitHub issue this PR addresses. Before opening this PR, please confirm: -->

- [ ] The linked issue is labeled `ready-to-spec` or `ready-to-implement`.
- [ ] Where appropriate, screenshots or a short video of the implementation are included below (especially for user-visible or UI changes).

This is a follow-up to #12122 (gitignore-aware pruning for git-repo recursive watches); this PR bounds non-git lazy roots, which that prune cannot help.

## Testing

- Added unit tests in `crates/repo_metadata/src/local_model_tests.rs`:
  - `index_lazy_loaded_path_tracks_only_root`: lazy root records `LazyNonRecursive` with only the root on Linux, `RecursiveOnRoot` elsewhere.
  - `load_directory_tracks_expanded_subdir_for_lazy_root`: expanding a subdir adds its watch on Linux.
  - `recursive_repo_uses_recursive_watch_mode`: git repos record `RecursiveOnRoot` and aren't tracked as lazy.
  - `remove_lazy_loaded_path_clears_tracked_watches`: teardown clears all tracked watches and repo state.
- `cargo nextest run -p repo_metadata --features local_fs` (new + adjacent tests pass), `cargo check -p repo_metadata` (default, non-`local_fs`), `./script/format`, and `cargo clippy -p repo_metadata --features local_fs --all-targets -- -D warnings` all clean.
- Manual: built/deployed the remote-server daemon to a Linux devbox, `cd /workspaces`, and confirmed via `[watch_inode_debug]` logging (added in #12122) and `/proc/<pid>/fdinfo` that the watched-directory count drops from ~20,000 to a small number that grows only as folders are expanded; git repos opened at their root are unaffected.
- [x] I have manually tested my changes locally

## Agent Mode

- [x] Warp Agent Mode - This PR was created via Warp's AI Agent Mode

CHANGELOG-BUG-FIX: Fixed excessive memory usage (and inotify watch exhaustion) on Linux when a Warp session navigated into a large non-git directory; lazy directory roots are now watched incrementally as folders are expanded instead of recursively up front.
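The watch-state bookkeeping and longest-prefix routing described under "Key changes" can be illustrated with a small, standalone Rust sketch. Only the `RepoWatchMode` enum and its two variants come from the description above; the `WatchRegistry` type, its method names, and the `is_linux` flag are hypothetical stand-ins, and no real filesystem watcher is involved. This is a rough model under those assumptions, not the code in `local_model.rs`.

```rust
use std::collections::{BTreeSet, HashMap};
use std::path::{Path, PathBuf};

/// Per-repo watch state, named after the enum in the PR description.
#[derive(Debug, Clone, PartialEq)]
enum RepoWatchMode {
    RecursiveOnRoot,
    LazyNonRecursive { watched_dirs: BTreeSet<PathBuf> },
}

/// Hypothetical registry standing in for the real model (assumption).
#[derive(Default)]
struct WatchRegistry {
    repos: HashMap<PathBuf, RepoWatchMode>,
}

impl WatchRegistry {
    /// Record a lazy root: per-directory tracking on Linux, one recursive watch elsewhere.
    /// `is_linux` is an illustrative parameter, not the real platform check.
    fn index_lazy_root(&mut self, root: &Path, is_linux: bool) {
        let mode = if is_linux {
            RepoWatchMode::LazyNonRecursive {
                watched_dirs: BTreeSet::from([root.to_path_buf()]),
            }
        } else {
            RepoWatchMode::RecursiveOnRoot
        };
        self.repos.insert(root.to_path_buf(), mode);
    }

    /// Track a newly expanded directory; returns true when a new watch would be needed.
    fn track_expanded_dir(&mut self, root: &Path, dir: &Path) -> bool {
        match self.repos.get_mut(root) {
            Some(RepoWatchMode::LazyNonRecursive { watched_dirs }) => {
                watched_dirs.insert(dir.to_path_buf())
            }
            // Recursive roots already cover every subdirectory; unknown roots are ignored.
            _ => false,
        }
    }

    /// Remove a root and return every path whose watch would have to be unregistered.
    fn remove_root(&mut self, root: &Path) -> Vec<PathBuf> {
        match self.repos.remove(root) {
            Some(RepoWatchMode::LazyNonRecursive { watched_dirs }) => {
                watched_dirs.into_iter().collect()
            }
            Some(RepoWatchMode::RecursiveOnRoot) => vec![root.to_path_buf()],
            None => Vec::new(),
        }
    }

    /// Route an event path to its owning root by longest-prefix match.
    fn owning_root(&self, path: &Path) -> Option<&Path> {
        self.repos
            .keys()
            .filter(|root| path.starts_with(root))
            .max_by_key(|root| root.components().count())
            .map(PathBuf::as_path)
    }
}
```

In this model, expanding the same folder twice only reports one new watch, and removing a lazy root hands back the root plus every expanded subdirectory, which is the set a caller would need to unregister.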

Description
What: Make Warp's repository file watcher stop registering filesystem watches on gitignored directories (`node_modules`, build output, vendored deps, etc.), while still watching directories that consumers explicitly care about via registered "ignored-path interests".

Why: On large monorepos the watcher recursively registered an inotify watch for every non-`.git` directory, including gitignored ones. On a customer's remote-server daemon this ballooned to ~11 GB RSS, with ~98% of the live heap inside `notify::inotify::EventLoop::add_watch` (the per-directory watch table + cloned path buffers). The descend filter only pruned `.git/` internals, never gitignored content.

How: The watcher's descend predicate (`repo_watch_filter`, used by both `DirectoryWatcher` and `LocalRepoMetadataModel`) is now gitignore-aware:

- `should_watch_repo_directory(path, gitignores, interests)`:
  - `.git/` internals keep the existing allowlist (`should_watch_directory_in_git_path`).
  - Directories that match a registered interest stay watched (`matches_ignored_path_interest`, which also matches the ancestor prefixes leading to an interest).
  - All other directories are checked against the gitignores (`matched_path_or_any_parents`), which guarantees the watcher's monotonicity invariant: a child of an ignored directory is also reported ignored, so we never accept a descendant after rejecting its parent. Directory-only re-include negations (e.g. `parentdir/*` + `!parentdir/*/`) keep working because the re-included directory matches as not-ignored on its own path (last-match-wins).
- `repo_watch_filter(gitignores, interests)` is now parameterized; both watch-registration sites pass the repo's root + global gitignores (matching the existing `is_ignored` tagging) and their interest list.
- `DirectoryWatcher` gains an `ignored_path_interests` list + `register_ignored_path_interests`; `app/src/lib.rs` registers the skill-provider paths on it too (they were already registered on `RepoMetadataModel`), so gitignored skill directories (e.g. `.agents/skills`) stay watched on both the LSP/MCP path and the file-tree/skills path.

Consumer impact: Skills are covered on both watcher paths via interests. Project rules already skip gitignored files on live updates; MCP config files are normally tracked. The emit predicate is unchanged (gitignored files inside a watched, non-ignored directory are still emitted and tagged `is_ignored`).

Known follow-ups (not in this PR):

- LSP `didChangeWatchedFiles` globs pinned to a gitignored base directory: plumb those base dirs in as interests. Default behavior otherwise matches typical editors (no events for gitignored paths). LSP already filters to server-registered globs.
- Nested `.gitignore` files are not consulted by the descend filter (same limitation as the existing `is_ignored` tagging); this can only over-watch, never miss events.

Linked Issue

- [ ] The linked issue is labeled `ready-to-spec` or `ready-to-implement`.

No linked GitHub issue. This is an internal reliability fix surfaced by heap profiling of a customer's remote-server daemon.
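The three-step descend decision described under "How" can be sketched in standalone Rust. The function name `should_watch_repo_directory` is taken from the description, but everything else here is an assumption made for illustration: `IgnoredDirs` is a plain list of ignored directories standing in for real gitignore matching (no globs, no negations), `git_internals_allowed` uses a made-up allowlist (`.git` itself and `.git/refs`), and `matches_interest` is a simplified guess at what the interest check does. The real allowlist and matcher are not shown in this PR text.

```rust
use std::path::{Path, PathBuf};

/// Stand-in for real gitignore matching (assumption): a plain list of
/// ignored directories, relative to the repo root.
struct IgnoredDirs(Vec<PathBuf>);

impl IgnoredDirs {
    /// True if `path` or any of its ancestors is in the ignored list.
    /// Checking ancestors keeps the result monotonic: a child of an
    /// ignored directory is always reported ignored as well.
    fn is_ignored_with_parents(&self, path: &Path) -> bool {
        path.ancestors()
            .any(|ancestor| self.0.iter().any(|ig| ig.as_path() == ancestor))
    }
}

/// Hypothetical `.git` allowlist: `.git` itself and anything under `.git/refs`.
fn git_internals_allowed(path: &Path) -> bool {
    let mut after_git = path
        .components()
        .skip_while(|c| c.as_os_str() != ".git")
        .skip(1);
    match after_git.next() {
        None => true, // the `.git` directory itself
        Some(first) => first.as_os_str() == "refs",
    }
}

/// True if `path` lies inside an interest, or is an ancestor prefix that
/// must be descended to reach one.
fn matches_interest(path: &Path, interests: &[PathBuf]) -> bool {
    interests
        .iter()
        .any(|interest| path.starts_with(interest) || interest.starts_with(path))
}

/// Decide whether the watcher should descend into `path`.
fn should_watch_repo_directory(
    path: &Path,
    ignored: &IgnoredDirs,
    interests: &[PathBuf],
) -> bool {
    // 1. Paths under `.git` follow the allowlist only.
    if path.components().any(|c| c.as_os_str() == ".git") {
        return git_internals_allowed(path);
    }
    // 2. Registered interests (and the prefixes leading to them) are always kept.
    if matches_interest(path, interests) {
        return true;
    }
    // 3. Everything else is pruned when it, or any parent, is ignored.
    !ignored.is_ignored_with_parents(path)
}
```

With `a/b` ignored and `a/b/c` registered as an interest, this sketch descends `a`, `a/b`, and `a/b/c` while pruning a sibling such as `a/b/other`. It does not model re-include negations, which depend on real gitignore semantics.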
Testing
Added unit tests in `crates/repo_metadata/src/entry_tests.rs` for `should_watch_repo_directory`:

- `a/b` ignored but `a/b/c` is an interest → full prefix descended, ignored sibling pruned;
- a directory-only re-include negation (`parentdir/*` + `!parentdir/*/`) descends subdirectories while the loose ignored file stays filtered;
- `.git` allowlist behavior preserved.

Validation run locally:

- `cargo test -p repo_metadata --features local_fs`: 72 passed (2 pre-existing flaky ignored), including the 5 new tests.
- `cargo clippy -p repo_metadata --features local_fs --all-targets`: clean.
- `./script/format` / `cargo fmt -- --check`: clean.
- `cargo check --workspace --all-targets`: compiles (incl. `lsp`, `warp` app crate).
- `cargo-clippy-diff origin/master HEAD`: no new lint violations.

Full `cargo clippy --all-targets --all-features -D warnings` and `cargo nextest run --workspace` will run in CI.

`./script/run`

Agent Mode
CHANGELOG-BUG-FIX: Reduced memory usage of the file watcher on large projects by no longer watching gitignored directories (e.g. node_modules, build output).
Co-Authored-By: Warp <agent@warp.dev>