Skip to content

replication: add replication state to GetVolumeReplicationInfo response #77

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Nikhil-Ladha
Copy link

@Nikhil-Ladha Nikhil-Ladha commented Apr 21, 2025

add replication_state to GetVolumeReplicationInfo response.
Fixes: #78

add replication_state to GetVolumeReplicationInfo response

Signed-off-by: Nikhil-Ladha <[email protected]>
@mergify mergify bot added the design Adds or updates an operation or service label Apr 21, 2025
@Nikhil-Ladha
Copy link
Author

/cc @Madhu-1 @nixpanic @Rakshith-R

@@ -254,6 +254,9 @@ message GetVolumeReplicationInfoResponse {
// This field is OPTIONAL.
// The value of this field MUST NOT be negative.
int64 last_sync_bytes = 3;
// Holds the latest replication state.
// This field is REQUIRED.
string replication_state = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, as we will have predictable states, can this not be an enum instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the spec to be storage agnostic.
We can't have values based on ceph's mirror status values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the spec to be storage agnostic. We can't have values based on ceph's mirror status values.

How are you going to use this value in csi-addons then?
its marked as required too so it may break compatibility ?
I think this needs more discussion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you going to use this value in csi-addons then?

We will fetch the state and parse it in ceph-csi and pass the value accordingly to the api.
CSI-Addons doesn't need to validate anything then, and just update the Status.State value based on the api's response.

its marked as required too so it may break compatibility ?

Hmm, yeah we would have to mark it optional for backward compatibility.

black-dragon74
black-dragon74 previously approved these changes Apr 21, 2025
Copy link
Member

@black-dragon74 black-dragon74 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@Rakshith-R Rakshith-R left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More discussion is required before deciding on this particular approach.

#78 (comment)

@black-dragon74 black-dragon74 dismissed their stale review April 22, 2025 06:59

Needs further discussions..

@@ -390,6 +390,9 @@ message GetVolumeReplicationInfoResponse {
// This field is OPTIONAL.
// The value of this field MUST NOT be negative.
int64 last_sync_bytes = 3;
// Holds the latest replication state.
// This field is REQUIRED.
string replication_state = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably should be optional instead of required. If it is a message, maybe call it status_message instead, and explain that it is user-facing information.

If something is expected to inspect the contents and act on it, like a state, it definitely should be an enum that has values which fit for different storage providers.

Copy link
Author

@Nikhil-Ladha Nikhil-Ladha Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is a message, maybe call it status_message instead, and explain that it is user-facing information.

It would be the mirroring status state like stopped , syncing, error replaying and so on. I don't think we would want it be a message instead.

If something is expected to inspect the contents and act on it, like a state, it definitely should be an enum that has values which fit for different storage providers.

Based on the above mentioned value, I am not sure which values to use that make it fit for different storage providers?
Making it generic like primary, secondary, unkown and error could be one way

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, then the storage provider is responsible for mapping the respective states for them to the csi-addons spec's state?
Like, for us (ceph's pov) stopped to primary, replaying to secondary, error to error and any other state to unkown

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds reasonable, each state should have a description, something along these lines:

  1. unknown status was not provided
  2. primary this volume is in a primary state and can be used for writing to
  3. secondary this volume is actively tracking changes from a primary volume
  4. error something went wrong, and replication has stopped

The error state is probably final, and no automated recovery is possible. Maybe a mis-configuration or environment failure.

In case there is an error state that can be recovered automatically (or retried), such a state should be mentioned separately, maybe something like split-brain or out-of-space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nixpanic IMO we should keep it as string because it will be hard to map this from storage to the csi-addons and also state is more of enum where we have pre defined states . can we do something like below

// Represents the current replication state as reported by the backend storage system.
//
// This field is optional.
//
// This is a free-form string that conveys the current replication state,
// such as "synchronized", "initializing", "error", "paused", etc.
//
// Storage providers SHOULD use values that make sense within their own ecosystem.
// This field is intended to be displayed to users or for diagnostic purposes,
// and is NOT intended to be used for strict programmatic logic across vendors.
string replication_status = 4;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that is basically what I meant with

... If it is a message, maybe call it status_message instead, and explain that it is user-facing information.

There is no need to add replication_ in the name, as the call is GetVolumeReplicationInfoResponse.
To me, status still sounds like something that can be parsed by non-humans. I therefor prefer status_message so that storage providers can phrase something helpful to guide the humans reading it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for status_message as we need this to be representation only for users for information not the programmatic

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can have status and a message for the very clear status and the more details message as well

enum Status {
    UNKNOWN = 0;
    HEALTHY = 1;
    DEGRADED = 2;
    ERROR = 3;
  }

 Status status = 1;
 string status_message = 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design Adds or updates an operation or service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add state field to the GetVolumeReplicationInfo response
5 participants