-
Notifications
You must be signed in to change notification settings - Fork 3.7k
CASSANDRA-20664: Fix endless loop on reading commitlogs when replay e… #4207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
CASSANDRA-20664: Fix endless loop on reading commitlogs when replay e… #4207
Conversation
Hi, this PR is ready for review. Please let me know if any changes or improvements are needed. Thanks! |
@hsahu-ksolves123 thank you for the PR.
|
// 🔴 Fix: If unsuccessful, skip to next without retrying | ||
if (!success) { | ||
logger.info("File {} was not processed successfully. Skipping to next.", file.name()); | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the idea of this continue logic at the end of the loop logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the question. The idea behind the continue
here is to ensure that if a particular commit log file cannot be processed (e.g., due to corruption or unreadable format), we skip that file and move to the next one instead of retrying or entering an infinite loop.
Without this continue
, the system might retry the same unreadable file endlessly, which was the original issue described in CASSANDRA-20664. The goal is to fail fast for that file and proceed with the rest of the logs safely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, I did not get: continue statement is a kind of goto to the next iteration of the for loop ("for (File file : filteredLogs)" in our case) and we have "continue" here as a last statement of the loop code, so in any case we go to the next iteration of the loop after it, it does not matter if we have "continue" or not. Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right — since the continue
is the last statement in the loop block, it’s redundant and doesn’t affect the flow. It can be safely removed. Thanks for pointing that out — I’ll update the code accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @netudima,
I've pushed the latest changes:
1.Removed the redundant continue as discussed
2.Added a unit test to reproduce and validate the fix
3.Verified build and test locally
Please review and let me know if any further updates are needed. Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, but I do not see the changes in the PR diff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The recent code changes are now correctly reflected in this pull request. You can now see the updates in both CommitLogReader.java and CommitLogReaderTest.java. Kindly review the latest commits and let me know if any further changes or improvements are needed.
Thanks
7c7ef9d
to
3c0ef8a
Compare
3c0ef8a
to
8760443
Compare
df3eb40
to
54e39a9
Compare
List<File> filteredLogs = filterCommitLogFiles(files); | ||
int i = 0; | ||
for (File file: filteredLogs) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please return brace on a new line as it was before
i++; | ||
readCommitLogSegment(handler, file, minPosition, ALL_MUTATIONS, i == filteredLogs.size()); | ||
boolean success = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you need success
for? If you went without having success
, then you would do logger.info
call in catch
block.
Also the way it is coded right now is that handler.handleError
in catch block might throw RuntimeException you are not catching nor acting upon anywhere.
This PR addresses CASSANDRA-20664, which caused an endless loop during commit log replay when an error was encountered. The issue was due to improper error handling in CommitLogReadHandler.
Fix Summary:
1.Fixed loop logic in CommitLogReadHandler.java by properly handling replay errors to avoid infinite loops on unreadable entries.
2.Updated CommitLogReader.java to support graceful skipping of problematic logs and integrate the fix.
3.Added a unit test to reproduce and verify the bug scenario and ensure it’s resolved.
4.Verified manually by building from source and running both new and existing CommitLog-related tests.
5.Pushed changes to a feature branch (CASSANDRA-20664-fix) with clean commits, ready for review and backporting after merge.