-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Improve index cleanup #5709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Improve index cleanup #5709
Conversation
5b5ca2b to
b1751b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice cleanup of cleanup code.
| ok. | ||
|
|
||
| cleanup(DbName, ActiveSigs) -> | ||
| cleanup(DbName, #{} = SigMap) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a _search_cleanup during a cluster upgrade will crash then? I think that's fine as they are rare but might be worth noting somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add an upgrade clause as well. It's easy enough in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an upgrade clause
e79027c to
1ccf4c6
Compare
Fix these issues:
* Index cleanup triggered by smoosh only cleaned view indexes and purge
checkpoints, make sure to also clean nouveau and search indexes and
checkpoints.
* `/_search_cleanup` endpoint only cleaned indexes on the coordinator node,
despite being a clustered fabric endpoint. Fix it so it cleans indexes on
other nodes as well as most users would expect.
* Nouveau cleanup didn't clean purge checkpoints, so make it do so. Use the
mrview purge checkpoint strategy for all indexes. This improves dreyfus logic
to avoid traversing internal disk paths from clouseau.
* Make `_view_cleanup` clean all the index types. This is a bit dirty, but it
may be better than adding another cleanup `_index_cleanup` API. Left
`_search_cleanup` and `_nouveau_cleanup` APIs as is for now.
Some optimizations:
* For each index clean request fetch ddocs once. Calculate signatures and
then call the remote node cleanup logic with them. This avoids fetching
design documents multiple times or sending all of them to the worker nodes.
This is what Nouveau is doing so stick with that nice pattern.
* Use erpc for remote calls. Our Erlang version is high enough (25+) to use
the multple requests pattern from erpc. This is more compact than rexi. The
absolute timeout pattern makes it simpler to have a global timeout for the
whole request.
Cleanups:
* Make purge checkpoint fetching and cleanup more uniform. Use common utility
logic in `couch_index_util` for all indexes.
* Make index cleanup similar between all three indexes. Use the same erpc
pattern for all of them.
* Move index cleanup functions from fabric.erl to its own module --
fabric_index_cleanup.erl
* Rename fabric function names more uniform and clearly indicate if it cleans
indexes on all nodes or just the current node.
* Add the db name to dreyfus `#index{}` record when it initializes so it
matches Nouveau.
1ccf4c6 to
912b00c
Compare
Fix these issues:
Index cleanup triggered by smoosh only cleaned view indexes and purge checkpoints, make sure to also clean nouveau and search indexes and checkpoints.
/_search_cleanupendpoint only cleaned indexes on the coordinator node, despite being a clustered fabric endpoint. Fix it so it cleans indexes on other nodes as well as most users would expect.Nouveau cleanup didn't clean purge checkpoints, so make it do so. Use the mrview purge checkpoint strategy for all indexes. This improves dreyfus logic to avoid traversing internal disk paths from clouseau.
Make
_view_cleanupclean all the index types. This is a bit dirty, but it may be better than adding another cleanup_index_cleanupAPI. Left_search_cleanupand_nouveau_cleanupAPIs as is for now.Some optimizations:
For each index clean request fetch ddocs once. Calculate signatures and then call the remote node cleanup logic with them. This avoids fetching design documents multiple times or sending all of them to the worker nodes. This is what Nouveau is doing so stick with that nice pattern.
Use erpc for remote calls. Our Erlang version is high enough (25+) to use the multple requests pattern from erpc. This is more compact than rexi. The absolute timeout pattern makes it simpler to have a global timeout for the whole request.
Cleanups:
Make purge checkpoint fetching and cleanup more uniform. Use common utility logic in
couch_index_utilfor all indexes.Make index cleanup similar between all three indexes. Use the same erpc pattern for all of them.
Move index cleanup functions from fabric.erl to its own module -- fabric_index_cleanup.erl
Rename fabric function names more uniform and clearly indicate if it cleans indexes on all nodes or just the current node.
Add the db name to dreyfus
#index{}record when it initializes so it matches Nouveau.