-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29376 ReplicationLogCleaner.preClean/getDeletableFiles should return early when asyncClusterConnection closes during HMaster stopping #7071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements early return checks in ReplicationLogCleaner methods when the async cluster connection is closed, ensuring that no file deletion occurs during HMaster shutdown. Key changes include:
- Adding early return logic in preClean() and getDeletableFiles() based on asyncClusterConnection status.
- Updating the ReplicationLogCleaner initialization to track MasterServices via a new masterService field.
- Introducing new tests in TestReplicationLogCleaner.java to verify the early return behavior when the async connection is closed.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java | Added early return checks for async connection closed and stored MasterServices instance via masterService. |
hbase-server/src/test/java/org/apache/hadoop/hbase/replication/master/TestReplicationLogCleaner.java | Added tests to validate that preClean() and getDeletableFiles() return early when the async connection is closed. |
@@ -77,6 +78,12 @@ public void preClean() { | |||
if (this.getConf() == null) { | |||
return; | |||
} | |||
|
|||
if (masterService.getAsyncClusterConnection().isClosed()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The asyncClusterConnection closure check is duplicated in both preClean() and getDeletableFiles(). Consider extracting this logic into a separate private helper method to reduce code duplication and improve maintainability.
Copilot uses AI. Check for mistakes.
*/ | ||
private boolean isAsyncClusterConnectionClosed() { | ||
AsyncClusterConnection asyncClusterConnection = masterService.getAsyncClusterConnection(); | ||
return asyncClusterConnection != null && asyncClusterConnection.isClosed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A null check for asyncClusterConnection here, as HMaster performs the same check.
// HMaster.shutdown()
if (this.asyncClusterConnection != null) {
this.asyncClusterConnection.close();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If asyncClusterConnection is null, we should also skip the deletion check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit busy now, I'll recheck the code this weekend.
Thanks for your review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fault, the check should be skipped when asyncClusterConnection == null
Thanks.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.547 s - in org.apache.hadoop.hbase.master.cleaner.TestCleanerChore
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO]
[INFO] --- surefire:3.1.0:test (secondPartTestsExecution) @ hbase-server ---
[INFO] Tests are skipped.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache HBase 4.0.0-alpha-1-SNAPSHOT:
|
f9cd7da
to
090de30
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
090de30
to
8c0e8d6
Compare
Trigger a rebuild |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…eturn early when asyncClusterConnection closes during HMaster stopping
8c0e8d6
to
fc7a7b6
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Details see: HBASE-29376
After applying this patch
Setting
hbase.master.cleaner.interval
to10000ms
, and when stopping the HMaster service, we can observe the following logs.