
fix: prune heavy directories (node_modules, .venv, etc.) from inotify recursive watch to reduce memory #12117

Closed
warp-dev-github-integration[bot] wants to merge 1 commit into master from fix/inotify-heavy-directory-pruning



warp-dev-github-integration[bot] commented Jun 3, 2026

Description

Fix excessive memory usage (11+ GB) caused by the inotify recursive file watcher registering watches on every subdirectory in a repository — including massive dependency/build trees like node_modules, .venv, __pycache__, etc.

Root cause: The repo_watch_filter() descend predicate (should_watch_directory_in_git_path) only pruned .git/ subtrees. On Linux, the notify crate uses inotify, which requires an individual watch per directory. For repos with large node_modules trees (50K+ subdirectories), each watch involves cloning path data and inserting it into the watch tracking HashMap, which adds up to gigabytes of memory.

Heap profile evidence (Sentry issue 7259255054, event 415dc1bd):

  • 71.7% (8.2 GB): Vec<u8>::clone in notify::inotify::EventLoop::add_single_watch
  • 26.7% (3.1 GB): HashMap rehashing for the watches tracking map
  • Total sampled: 11.22 GB, 190 events, affecting both Linux and macOS stable
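The root cause above can be illustrated with a small directory walk. This is only a sketch under stated assumptions: `count_watched_dirs` is a hypothetical helper that is not part of the notify crate or of this PR, and it merely counts the directories that would each need their own watch, given some descend predicate.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not from the notify crate or this PR): counts the
/// directories a recursive watcher would register, assuming one watch per
/// directory and descending only where `descend` returns true.
fn count_watched_dirs(root: &Path, descend: &dyn Fn(&Path) -> bool) -> io::Result<usize> {
    // The root itself always gets a watch.
    let mut count = 1;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        let path = entry.path();
        // `DirEntry::file_type` does not follow symlinks, so symlinked
        // directories are not traversed in this sketch.
        if entry.file_type()?.is_dir() && descend(&path) {
            count += count_watched_dirs(&path, descend)?;
        }
    }
    Ok(count)
}
```

With a predicate that always returns true, the count grows with every nested dependency directory. With a predicate that rejects a directory, its whole subtree is skipped, which is the effect the fix relies on.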

Fix: Adds a HEAVY_DIRECTORY_NAMES blocklist to the descend predicate so the recursive watcher skips well-known heavy directories:

  • node_modules, .venv, __pycache__, .mypy_cache, .pytest_cache, .tox, .gradle, .cache
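A minimal sketch of the blocklist check, assuming the predicate matches on the final path component only. The directory names are the ones listed in this PR; `is_heavy_directory` and `should_descend` are hypothetical names used for illustration, and the real should_watch_directory also applies the existing .git/ pruning, which is not reproduced here.

```rust
use std::path::Path;

/// Directory names the recursive watcher skips (list taken from this PR).
const HEAVY_DIRECTORY_NAMES: &[&str] = &[
    "node_modules",
    ".venv",
    "__pycache__",
    ".mypy_cache",
    ".pytest_cache",
    ".tox",
    ".gradle",
    ".cache",
];

/// Hypothetical helper: true when the final component of `path` is on the
/// blocklist. Non-UTF-8 names are treated as not heavy.
fn is_heavy_directory(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .is_some_and(|name| HEAVY_DIRECTORY_NAMES.iter().any(|heavy| *heavy == name))
}

/// Hypothetical descend predicate covering only the blocklist part of the
/// fix; the `.git/` subtree pruning is assumed to be handled elsewhere.
fn should_descend(path: &Path) -> bool {
    !is_heavy_directory(path)
}
```

Checking only the final component is enough in this sketch because a descend predicate is consulted once per directory during recursion, so children of a pruned directory are never visited.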

Linked Issue

Testing

  • Unit tests added for should_watch_directory verifying heavy directories are pruned
  • Existing should_watch_directory_in_git_path tests continue to pass (backward-compatible alias preserved)
  • cargo check, cargo clippy, cargo fmt all pass

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Conversation: https://staging.warp.dev/conversation/51303386-123f-44b6-a346-b4eb68fcacf8
Run: https://oz.staging.warp.dev/runs/019e8b1e-49a4-7033-84ed-95feb644eec2
This PR was generated with Oz.

Linear Issue

APP-4657

…emory

The repo_watch_filter() descend predicate only pruned .git/ subtrees,
allowing the notify crate's inotify backend to recursively register
watches on every subdirectory — including node_modules, .venv, and
other massive dependency trees that are almost always gitignored.

On Linux, each inotify watch involves cloning path data (Vec<u8>) and
inserting into the watch tracking HashMap. For repos with large
node_modules trees (50K+ subdirectories), this consumed 11+ GB of
memory (71.7% from Vec<u8>::clone, 26.7% from HashMap rehashing in
notify::inotify::EventLoop::add_single_watch).

This commit:
- Adds a HEAVY_DIRECTORY_NAMES blocklist (node_modules, .venv,
  __pycache__, .mypy_cache, .pytest_cache, .tox, .gradle, .cache)
- Introduces should_watch_directory() that checks both the blocklist
  and the existing .git/ subtree pruning logic
- Updates repo_watch_filter() to use the new combined predicate
- Preserves backward-compatible should_watch_directory_in_git_path()
  for existing callers (AI crate)
- Adds unit tests for the new heavy directory pruning behavior

Sentry: https://sentry.io/organizations/warpdotdev/issues/7259255054/

Co-Authored-By: Oz <oz-agent@warp.dev>
@kevinchevalier (Contributor) commented:

Being fixed by KY
