Commit 8925408

fix(pd): add timeout and null-safety to getLeaderGrpcAddress()
The bolt RPC call in getLeaderGrpcAddress() returns null in Docker bridge network mode, causing an NPE when a follower PD node attempts to discover the leader's gRPC address. This breaks store registration and partition distribution when any node other than pd0 wins the raft leader election.

Add a bounded timeout using the configured rpc-timeout, null-check the RPC response, and fall back to deriving the address from the raft endpoint IP when the RPC fails.

Closes #2959
1 parent 0505810 commit 8925408

1 file changed (+16, -2)
hugegraph-pd/hg-pd-core/src/main/java/org/apache/hugegraph/pd/raft/RaftEngine.java

Lines changed: 16 additions & 2 deletions
@@ -26,6 +26,8 @@
 import java.util.concurrent.CompletableFuture;
 import java.util.concurrent.CountDownLatch;
 import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
 import java.util.concurrent.atomic.AtomicReference;
 import java.util.stream.Collectors;
 
@@ -239,8 +241,20 @@ public String getLeaderGrpcAddress() throws ExecutionException, InterruptedExcep
             waitingForLeader(10000);
         }
 
-        return raftRpcClient.getGrpcAddress(raftNode.getLeaderId().getEndpoint().toString()).get()
-                            .getGrpcAddress();
+        try {
+            RaftRpcProcessor.GetMemberResponse response = raftRpcClient
+                    .getGrpcAddress(raftNode.getLeaderId().getEndpoint().toString())
+                    .get(config.getRpcTimeout(), TimeUnit.MILLISECONDS);
+            if (response != null && response.getGrpcAddress() != null) {
+                return response.getGrpcAddress();
+            }
+        } catch (TimeoutException | ExecutionException e) {
+            log.warn("Failed to get leader gRPC address via RPC, falling back to endpoint derivation", e);
+        }
+
+        // Fallback: derive from raft endpoint IP + local gRPC port (best effort)
+        String leaderIp = raftNode.getLeaderId().getEndpoint().getIp();
+        return leaderIp + ":" + config.getGrpcPort();
     }
 
     /**
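
The pattern this change applies (bounded wait, null check, then a best-effort fallback) can be tried in isolation with only the JDK. The sketch below is not HugeGraph code: the class `LeaderAddressLookup`, the method `resolve`, and the sample addresses and port are hypothetical, and a plain `CompletableFuture<String>` stands in for the RPC response type. As in the commit, the fallback assumes the leader listens on the same gRPC port as the local node, which may not hold in every deployment.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class LeaderAddressLookup {

    /**
     * Hypothetical helper (not part of HugeGraph): waits at most timeoutMs for
     * the RPC result and falls back to leaderIp:localGrpcPort when the result
     * is null, late, or failed.
     */
    static String resolve(CompletableFuture<String> rpcResult, long timeoutMs,
                          String leaderIp, int localGrpcPort) throws InterruptedException {
        try {
            // Bounded wait: never block longer than the configured timeout.
            String address = rpcResult.get(timeoutMs, TimeUnit.MILLISECONDS);
            if (address != null) {
                return address;
            }
        } catch (TimeoutException | ExecutionException e) {
            // RPC timed out or failed; fall through to the derived address.
        }
        // Best effort: assumes the leader uses the same gRPC port as this node.
        return leaderIp + ":" + localGrpcPort;
    }

    public static void main(String[] args) throws InterruptedException {
        // RPC succeeds: the reported address is used as-is.
        CompletableFuture<String> ok = CompletableFuture.completedFuture("10.0.0.2:8686");
        System.out.println(resolve(ok, 100, "10.0.0.2", 8686));

        // RPC completes with null (the case described in the commit message).
        CompletableFuture<String> empty = CompletableFuture.<String>completedFuture(null);
        System.out.println(resolve(empty, 100, "172.20.0.3", 8686));

        // RPC never completes: the call returns after roughly 50 ms.
        CompletableFuture<String> hung = new CompletableFuture<String>();
        System.out.println(resolve(hung, 50, "172.20.0.3", 8686));

        // RPC fails with an exception.
        CompletableFuture<String> failed = new CompletableFuture<String>();
        failed.completeExceptionally(new IllegalStateException("rpc failed"));
        System.out.println(resolve(failed, 100, "172.20.0.3", 8686));
    }
}
```

Only the first call returns the address reported by the RPC; the other three return the derived `172.20.0.3:8686`. `InterruptedException` is deliberately left to propagate, matching the method signature in the diff, so that a caller's interruption is not swallowed by the fallback.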
