-
Notifications
You must be signed in to change notification settings - Fork 386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CELEBORN-1912] ReadClientHandler should support reconnection for ChannelInboundHandler#channelInactive #3154
base: main
Are you sure you want to change the base?
Conversation
c67475f
to
12867fe
Compare
d0416a1
to
0f1d24c
Compare
0f1d24c
to
6c36975
Compare
Ping @reswqa, @codenohup. |
Although you can reconnect the connection, there might be something wrong with your data because rebuilding the connection may cause wrong data order. |
Forget about this, I found that you have rebuilt the connection used to read shuffle from Celeborn which shall be OK. |
common/src/main/java/org/apache/celeborn/common/network/client/ReconnectHandler.java
Outdated
Show resolved
Hide resolved
d097372
to
0aa17eb
Compare
0ce14a2
to
3c9336e
Compare
…nnelInboundHandler#channelInactive
3c9336e
to
3ae7ed8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
What changes were proposed in this pull request?
ReadClientHandler
should support reconnection forChannelInboundHandler#channelInactive
.Why are the changes needed?
ReadClientHandler
should support reconnection forChannelInboundHandler#channelInactive
to avoid frequent failover because of the below exception:Does this PR introduce any user-facing change?
Introduce
celeborn.<module>.reconnect.maxRetries
andceleborn.<module>.reconnect.retryWait
to support reconnection of client.How was this patch tested?
TransportClientFactorySuiteJ#testChannelInactiveReconnect