Skip to content

[automatic failover] Implement failback with given grace period #4204

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: feature/automatic-failover
Choose a base branch
from

Conversation

atakavci
Copy link
Contributor

This PR is based on changes in previous #4200.

Commits essential to this one are;

atakavci and others added 13 commits June 27, 2025 19:13
- Healtstatus manager with initial listener and registration logic
- pluggable health checker strategy  introduced,  these are draft NoOpStrategy, EchoStrategy, LagAwareStrategy,
- fix failing tests impacted from weighted clusters
- add echo ot CommandObjects and UnifiedJEdis
- improve StrategySupplier by accepting jedisclientconfig
- adapt EchoStrategy to StrategySupplier. Now it handles the creation of connection by accepting endpoint and JedisClientConfig
- make healthchecks disabled by default
- drop noOpStrategy
-  add unit&integration tests for health check
- clear redundant catch
- replace failover options and drop failoveroptions class
- remove forced_unhealthy from healthstatus
- fix failback check
- add disabled flag to cluster
- update/fix related tests
- replace failback enabled with failbacksupported in client
- fix formatting
- set defaults
- fix failing tests
- fix failing tests
- introduce graceperiod
- fix issue when CB is forced_open and gracePeriod is completed
@atakavci atakavci requested review from uglide, ggivo and Copilot July 17, 2025 15:28
@atakavci atakavci self-assigned this Jul 17, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements automatic failback functionality with a configurable grace period for the multi-cluster Redis client. The implementation replaces index-based cluster management with endpoint-based management and adds periodic failback checks that can automatically switch back to higher-weight clusters after they recover from failures.

  • Replaces index-based cluster selection with endpoint-based management using weights
  • Implements periodic failback mechanism with configurable grace periods for unhealthy clusters
  • Adds comprehensive health check system with multiple strategy implementations

Reviewed Changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 3 comments.

File Description
MultiClusterPooledConnectionProvider.java Core refactoring from index-based to endpoint-based cluster management with failback support
MultiClusterClientConfig.java Added configuration for failback intervals, grace periods, and health check strategies
HealthCheck*.java New health check framework with status management and strategy pattern implementation
Test files Updated tests to use new endpoint-based API and added comprehensive failback mechanism tests
Comments suppressed due to low confidence (5)

src/test/java/redis/clients/jedis/providers/MultiClusterPooledConnectionProviderTest.java:123

  • Test method name 'testSetActiveMultiClusterIndexEqualsZero' is inconsistent with the new endpoint-based API. Should be renamed to 'testSetActiveClusterWithNull' to reflect the actual functionality being tested.
    public void testSetActiveMultiClusterIndexEqualsZero() {

src/test/java/redis/clients/jedis/providers/MultiClusterPooledConnectionProviderTest.java:129

  • Test method name 'testSetActiveMultiClusterIndexLessThanZero' is inconsistent with the new endpoint-based API. Should be renamed to 'testSetActiveClusterWithNullDuplicate' or similar to reflect the actual functionality being tested.
    public void testSetActiveMultiClusterIndexLessThanZero() {

src/test/java/redis/clients/jedis/providers/MultiClusterPooledConnectionProviderTest.java:135

  • Test method name 'testSetActiveMultiClusterIndexOutOfRange' is inconsistent with the new endpoint-based API. Should be renamed to 'testSetActiveClusterWithNonExistentEndpoint' to reflect the actual functionality being tested.
    public void testSetActiveMultiClusterIndexOutOfRange() {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant