KAFKA-18660: Transactions Version 2 doesn't handle epoch overflow correctly #18730
base: trunk
@@ -408,13 +408,13 @@ class TransactionCoordinator(txnConfig: TransactionConfig,

       // generate the new transaction metadata with added partitions
       txnMetadata.inLock {
-        if (txnMetadata.producerId != producerId) {
+        if (txnMetadata.pendingTransitionInProgress) {
+          // return a retriable exception to let the client backoff and retry
Can we add a comment here that explains the significance of ordering this check prior to others?
Sounds good. 👍
LGTM
Fixed a typo that returned the wrong producer ID and epoch, so that epoch overflow is now handled correctly.
We also had to reorder the concurrent transaction handling so that the pending-transition check runs first and we don't self-fence when we start the new transaction with the new producer ID.
I also tested this with a modified version of the code in which epoch overflow happens on the first epoch bump, so every request gets a new producer ID.
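
For reviewers who want to see why the check order matters, here is a minimal toy model of the reordering described above. It is only a sketch and not the coordinator's actual code: `TxnState`, `validate`, `bump`, the `Retriable`/`Rejected`/`Accepted` results, and the overflow threshold are all hypothetical names and assumptions chosen for illustration.

```scala
// Toy model only: none of these names are Kafka's real classes or error codes.
object EpochOverflowSketch {
  sealed trait Result
  case object Retriable extends Result // client should back off and retry
  case object Rejected extends Result  // non-retriable, i.e. the producer is fenced
  case object Accepted extends Result

  final case class TxnState(
    producerId: Long,
    producerEpoch: Short,
    pendingTransition: Boolean // a state transition has not been persisted yet
  )

  // The pending-transition check runs first. While a transition is pending,
  // the stored producer ID may still be the pre-overflow one, so comparing
  // IDs first would reject a client that already holds the new ID.
  def validate(state: TxnState, requestProducerId: Long, requestEpoch: Short): Result =
    if (state.pendingTransition) Retriable
    else if (state.producerId != requestProducerId) Rejected
    else if (state.producerEpoch != requestEpoch) Rejected
    else Accepted

  // Assumed overflow rule for this sketch: once the epoch cannot be bumped
  // safely, rotate to a fresh producer ID and restart the epoch at 0.
  def bump(state: TxnState, nextProducerId: Long): (Long, Short) =
    if (state.producerEpoch >= Short.MaxValue - 1) (nextProducerId, 0.toShort)
    else (state.producerId, (state.producerEpoch + 1).toShort)
}
```

In this model, a request carrying the new producer ID while the transition is still pending gets a retriable result instead of being rejected, which is the behavior the reordering is meant to provide.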