This project is a high-performance implementation of the Raft Consensus Algorithm in Java. It is designed to maintain a consistent, replicated log across a cluster of nodes, ensuring system-level fault tolerance and linearizability even in the presence of network partitions or node failures.
Developed as part of the Master’s program at UvA & VU Amsterdam, this project focuses on the core challenges of distributed systems: leader election, log replication, and safety.
Leader Election: Robust heartbeat mechanism with randomized election timeouts to minimize split-vote scenarios (see the sketch after this list).
Log Replication: Efficient synchronization of state machine commands across the cluster.
Safety & Consistency: Guarantees that only nodes with up-to-date logs can become leaders.
Cluster Membership Changes: Support for adding and removing nodes without downtime.
Fault Tolerance: Automatic recovery and state synchronization after node reboots.
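To make the randomized-timeout idea concrete, here is a minimal sketch; the class and constant names (ElectionTimer, BASE_TIMEOUT_MS, JITTER_MS) are illustrative and not taken from this codebase, and the 150–300 ms window is the range suggested in the Raft paper:

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of randomized election timeouts (names are hypothetical).
public final class ElectionTimer {
    private static final long BASE_TIMEOUT_MS = 150; // lower bound of the window
    private static final long JITTER_MS = 150;       // size of the random window

    /**
     * Returns a fresh timeout in [150, 300) ms. Because each follower draws
     * its own value, timers rarely fire simultaneously, which reduces the
     * chance of split votes during leader election.
     */
    public static long nextTimeoutMillis() {
        return BASE_TIMEOUT_MS + ThreadLocalRandom.current().nextLong(JITTER_MS);
    }
}
```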
The system consists of several decoupled components interacting via asynchronous RPCs (a simplified skeleton follows the list):
Consensus Module: The core logic handling Raft states (Follower, Candidate, Leader).
Log Manager: Persistent storage for log entries with efficient indexing.
State Machine: The application layer that executes committed commands.
RPC Layer: Custom asynchronous communication interface between nodes.
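The sketch below shows, under assumed names (ConsensusModule, Role, onElectionTimeout, onHigherTermObserved), how these pieces might fit together: the consensus module tracks the node's role and term, and enforces two standard Raft rules: a timed-out non-leader becomes a candidate, and any message carrying a higher term forces a step-down to follower. It also includes the log comparison behind the safety guarantee listed above.

```java
// Hypothetical skeleton; the actual class and method names in this
// repository may differ.
public class ConsensusModule {
    enum Role { FOLLOWER, CANDIDATE, LEADER }

    private volatile Role role = Role.FOLLOWER;
    private long currentTerm = 0;

    /** A non-leader whose election timer fires starts a new election. */
    synchronized void onElectionTimeout() {
        if (role != Role.LEADER) {
            role = Role.CANDIDATE;
            currentTerm++;
            // requestVotesFromPeers(currentTerm); // sent via the RPC layer
        }
    }

    /** Any RPC carrying a higher term forces a step-down to follower. */
    synchronized void onHigherTermObserved(long term) {
        if (term > currentTerm) {
            currentTerm = term;
            role = Role.FOLLOWER;
        }
    }

    /**
     * Raft's election safety rule: grant a vote only if the candidate's log
     * is at least as up-to-date as ours, i.e. its last entry has a higher
     * term, or the same term and at least as high an index.
     */
    static boolean candidateLogUpToDate(long candLastTerm, long candLastIndex,
                                        long ownLastTerm, long ownLastIndex) {
        return candLastTerm > ownLastTerm
                || (candLastTerm == ownLastTerm && candLastIndex >= ownLastIndex);
    }
}
```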
Performance: Optimized heartbeat intervals and batching strategies, resulting in a 15% reduction in consensus latency in high-churn simulated environments (a sketch of the batching idea follows below).
Reliability: Successfully handled simultaneous failures of up to a minority of nodes (⌊(n−1)/2⌋ in an n-node cluster), the maximum Raft can tolerate.
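As a rough illustration of the batching idea (not the exact implementation; all names here are hypothetical), a leader can piggyback queued client commands on its periodic heartbeat instead of sending one AppendEntries message per command:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: heartbeat-aligned batching of log entries.
public class HeartbeatBatcher {
    private final ConcurrentLinkedQueue<byte[]> pending = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Flushes on every heartbeat tick; an empty batch is a plain heartbeat. */
    void start(long heartbeatIntervalMs) {
        scheduler.scheduleAtFixedRate(this::flush, 0,
                heartbeatIntervalMs, TimeUnit.MILLISECONDS);
    }

    /** Client commands are queued rather than replicated one by one. */
    void submit(byte[] command) {
        pending.add(command);
    }

    private void flush() {
        List<byte[]> batch = new ArrayList<>();
        byte[] cmd;
        while ((cmd = pending.poll()) != null) {
            batch.add(cmd);
        }
        // sendAppendEntries(batch); // one RPC per heartbeat instead of per command
    }
}
```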