fix: prune heavy directories (node_modules, .venv, etc.) from inotify recursive watch to reduce memory #12117
Closed
warp-dev-github-integration[bot] wants to merge 1 commit into
…emory

The repo_watch_filter() descend predicate only pruned .git/ subtrees, allowing the notify crate's inotify backend to recursively register watches on every subdirectory, including node_modules, .venv, and other massive dependency trees that are almost always gitignored.

On Linux, each inotify watch involves cloning path data (Vec<u8>) and inserting into the watch tracking HashMap. For repos with large node_modules trees (50K+ subdirectories), this consumed 11+ GB of memory (71.7% from Vec<u8>::clone, 26.7% from HashMap rehashing in notify::inotify::EventLoop::add_single_watch).

This commit:
- Adds a HEAVY_DIRECTORY_NAMES blocklist (node_modules, .venv, __pycache__, .mypy_cache, .pytest_cache, .tox, .gradle, .cache)
- Introduces should_watch_directory() that checks both the blocklist and the existing .git/ subtree pruning logic
- Updates repo_watch_filter() to use the new combined predicate
- Preserves backward-compatible should_watch_directory_in_git_path() for existing callers (AI crate)
- Adds unit tests for the new heavy directory pruning behavior

Sentry: https://sentry.io/organizations/warpdotdev/issues/7259255054/

Co-Authored-By: Oz <oz-agent@warp.dev>
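The combined predicate described in the commit message could look roughly like the sketch below. It is not the code from this PR: the blocklist entries and the names HEAVY_DIRECTORY_NAMES, should_watch_directory, and should_watch_directory_in_git_path come from the text above, while the has_component helper and the function bodies are assumptions made for illustration. In particular, the .git/ check here is a simplified stand-in that rejects .git and everything beneath it, which may differ from the real logic.

```rust
use std::path::{Component, Path};

/// Directory names skipped wherever they appear in the tree.
/// The entries mirror the list named in the commit message.
const HEAVY_DIRECTORY_NAMES: &[&str] = &[
    "node_modules",
    ".venv",
    "__pycache__",
    ".mypy_cache",
    ".pytest_cache",
    ".tox",
    ".gradle",
    ".cache",
];

/// Hypothetical helper: true if any component of `path` equals `name`.
fn has_component(path: &Path, name: &str) -> bool {
    path.components()
        .any(|c| matches!(c, Component::Normal(os) if os.to_str() == Some(name)))
}

/// Assumed stand-in for the existing `.git/` pruning: refuse `.git`
/// and anything below it. The real function may behave differently.
fn should_watch_directory_in_git_path(path: &Path) -> bool {
    !has_component(path, ".git")
}

/// Combined descend predicate: skip blocklisted directories by their
/// final path component, then defer to the `.git/` check.
fn should_watch_directory(path: &Path) -> bool {
    let is_heavy = path
        .file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| HEAVY_DIRECTORY_NAMES.contains(&name));
    !is_heavy && should_watch_directory_in_git_path(path)
}
```

Matching only the final component is sufficient when the function is used as a descend predicate: once the walker is told not to enter node_modules, it never asks about anything below it. Exact name matching also means a directory such as node_modules_backup is still watched.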
Contributor
Being fixed by KY
Description
Fix excessive memory usage (11+ GB) caused by the inotify recursive file watcher registering watches on every subdirectory in a repository, including massive dependency/build trees like node_modules, .venv, __pycache__, etc.

Root cause: The repo_watch_filter() descend predicate (should_watch_directory_in_git_path) only pruned .git/ subtrees. On Linux, the notify crate uses inotify, which requires an individual watch per directory. For repos with large node_modules (50K+ subdirectories), each watch involves cloning path data and inserting into the tracking HashMap, consuming GBs of memory.

Heap profile evidence (Sentry issue 7259255054, event 415dc1bd):
- Vec<u8>::clone in notify::inotify::EventLoop::add_single_watch

Fix: Adds a HEAVY_DIRECTORY_NAMES blocklist to the descend predicate so the recursive watcher skips well-known heavy directories: node_modules, .venv, __pycache__, .mypy_cache, .pytest_cache, .tox, .gradle, .cache

Linked Issue
Testing
- Unit tests for should_watch_directory verifying heavy directories are pruned
- should_watch_directory_in_git_path tests continue to pass (backward-compatible alias preserved)
- cargo check, cargo clippy, cargo fmt all pass

Agent Mode
Conversation: https://staging.warp.dev/conversation/51303386-123f-44b6-a346-b4eb68fcacf8
Run: https://oz.staging.warp.dev/runs/019e8b1e-49a4-7033-84ed-95feb644eec2
This PR was generated with Oz.
Linear Issue
APP-4657