I have a large repo that I updated from the latest remote, which changed ~276 files. I ran colgrep "hello" right afterwards and it appeared to hang; the command ended up taking over 9 minutes.
I believe this particular diff is problematic because it deleted a lot of large files (.rbi files in this case). (Happy to provide any more info you may need!)
FWIW Codex says:
Your colgrep hello is not stuck in the search. It is holding the updater lock and burning CPU inside:
search -> run_indexing -> incremental_update -> delete_file_from_index -> next_plaid::filtering::delete
The bad part is that colgrep found ~276 changed indexed files on master. For each changed file, it calls delete_file_from_index separately at /tmp/next-plaid-colgrep-debug/colgrep/src/index/mod.rs:2585. That calls delete_from_index and filtering::delete at /tmp/next-plaid-colgrep-debug/colgrep/src/index/mod.rs:2594.
Both of those are expensive full-index rewrites:
next-plaid/src/delete.rs:91 loops through every chunk and rewrites chunk files.
next-plaid/src/delete.rs:187 rebuilds the IVF from all remaining codes.
next-plaid/src/filtering.rs:1139 creates a temp table for the whole metadata table.
next-plaid/src/filtering.rs:1148 deletes all metadata rows.
next-plaid/src/filtering.rs:1152 inserts everything back with renumbered subset.
So with 276 changed files, it is effectively doing hundreds of full-ish rewrites of a ~1.2GB index. That looks like “hangs forever” because it is O(changed_files * index_size).
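To make the cost difference concrete, here is a minimal toy model of per-file versus batched deletion. It is only a sketch: `ToyIndex`, `delete_files`, `delete_one_by_one`, `delete_batched`, and `build_index` are hypothetical names invented for illustration, not colgrep or next-plaid APIs. It also assumes that every delete call pays for one full pass over the index, which is what the analysis above describes.

```rust
use std::collections::HashSet;

/// Toy stand-in for an on-disk index: one entry per indexed document.
/// Hypothetical type for illustration, not next-plaid's real data structure.
pub struct ToyIndex {
    /// Source file path of each indexed document.
    pub doc_files: Vec<String>,
    /// Total number of document rows touched by all delete passes so far.
    pub rows_rewritten: usize,
}

impl ToyIndex {
    pub fn new(doc_files: Vec<String>) -> Self {
        ToyIndex { doc_files, rows_rewritten: 0 }
    }

    /// Remove every document that belongs to any file in `files`.
    /// Assumption: like the deletes described above, one call rewrites the
    /// whole index no matter how few documents are actually removed.
    pub fn delete_files(&mut self, files: &HashSet<&str>) {
        // Charge one full pass over the index as it stands before the delete.
        self.rows_rewritten += self.doc_files.len();
        self.doc_files.retain(|f| !files.contains(f.as_str()));
    }
}

/// Per-file strategy: one full rewrite for each changed file.
pub fn delete_one_by_one(index: &mut ToyIndex, changed: &[&str]) {
    for file in changed {
        let single: HashSet<&str> = HashSet::from([*file]);
        index.delete_files(&single);
    }
}

/// Batched strategy: gather all changed files first, then rewrite once.
pub fn delete_batched(index: &mut ToyIndex, changed: &[&str]) {
    let all: HashSet<&str> = changed.iter().copied().collect();
    index.delete_files(&all);
}

/// Build a toy index with `docs_per_file` documents for each of `num_files` files.
pub fn build_index(num_files: usize, docs_per_file: usize) -> ToyIndex {
    let mut doc_files = Vec::new();
    for f in 0..num_files {
        for _ in 0..docs_per_file {
            doc_files.push(format!("file_{}.rb", f));
        }
    }
    ToyIndex::new(doc_files)
}
```

With 10 files of 5 documents each and 4 changed files, the per-file strategy touches 50 + 45 + 40 + 35 = 170 rows, while the batched strategy touches 50 and leaves the same 30 documents behind. The gap grows linearly with the number of changed files, which would be consistent with the O(changed_files * index_size) behaviour described above.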