[Design] Pod management design #14
We agree that, regardless of whether a Deployment or StatefulSet is used, it will have no more than 1 replica, so that each workload is a singleton. This provides the most flexibility in terms of pod placement and orchestration.
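A minimal sketch of that singleton shape, where all names and the image tag are illustrative assumptions rather than the project's actual manifests:

```yaml
# Sketch only: a singleton Valkey workload as a single-replica StatefulSet.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: valkey-node-0        # illustrative name
spec:
  replicas: 1                # never scaled above 1; each node is its own workload
  serviceName: valkey-node-0
  selector:
    matchLabels:
      app: valkey
      node: valkey-node-0
  template:
    metadata:
      labels:
        app: valkey
        node: valkey-node-0
    spec:
      containers:
        - name: valkey
          image: valkey/valkey:8   # illustrative tag
          ports:
            - containerPort: 6379
```

Scaling out then means creating more single-replica workloads rather than raising `replicas`, which keeps each node's placement and lifecycle independent.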
A suggestion was made in the meeting on 2025-10-31 that we could abstract away the technical implementation of Deployment vs StatefulSet with a CR. I like this idea, but I wonder whether `Node` conflicts too much with the built-in Kubernetes Node. Given that Valkey uses the node terminology, I think we should be consistent.
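For illustration only, such a CR could look like the sketch below; the API group, kind name, and every field are assumptions (the kind name in particular is the open naming question):

```yaml
# Hypothetical sketch of a CR that hides the Deployment-vs-StatefulSet choice.
# The apiVersion group, kind, and all fields here are assumptions.
apiVersion: valkey.example.com/v1alpha1
kind: ValkeyNode             # or "Node" — the naming conflict under discussion
metadata:
  name: shard-0-primary
spec:
  workloadKind: StatefulSet  # implementation detail the CR would abstract away
  storage:
    size: 10Gi
```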
Summary
Discussion for how the underlying Valkey workloads (via Pods) will be managed.
There are several ways of doing this:
- StatefulSets
- Deployments
- Direct Pod management
The following sections will outline the pros and cons of each of them.
Motivation
Agreeing on the design to use will allow us to provide a unified approach (and abstraction) for the management of Valkey workloads across the available deployment methods (Cluster, Sentinel, etc.).
Detailed Design
Approaches
StatefulSets
Pros
- Built-in per-replica PVC management via `volumeClaimTemplates`
Cons
- `volumeClaimTemplates` is an immutable field, commonly making PVC resizes challenging

Deployment
Pros
Cons
Direct Pod Management
This is what TimeScale does:
Pros
Cons
Ultimately we won't go down the route of direct pod management, for this reason: we do not want an unhealthy Operator preventing Pods from being scheduled when that responsibility can be delegated to a built-in Kubernetes controller (StatefulSet/Deployment). This model may work for other databases that have a support model tied to them; the Valkey Operator is a community-backed effort with no support guarantees.
Discussion points
Resizing PVCs
There are a couple of ways for us to tackle this:
Operator managed PVC
This decouples the PVC from the StatefulSet. By the same logic, we could instead attach the PVC via the pod spec in a Deployment's template.
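A sketch of that decoupled shape, with illustrative names and sizes; the Operator would own the PVC object itself rather than a `volumeClaimTemplates` entry:

```yaml
# Sketch only: an Operator-managed PVC referenced directly from a Deployment's
# pod template. Names, sizes, and the image tag are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: valkey-node-0-data   # created and resized by the Operator, not the workload
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: valkey-node-0
spec:
  replicas: 1
  selector:
    matchLabels:
      node: valkey-node-0
  template:
    metadata:
      labels:
        node: valkey-node-0
    spec:
      containers:
        - name: valkey
          image: valkey/valkey:8   # illustrative tag
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: valkey-node-0-data   # the decoupled PVC above
```

Because the PVC is a standalone object, resizing it is just a spec update, provided the StorageClass allows volume expansion.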
An operator for Postgres has a sidecar container that (amongst other things) monitors the disk usage of the PVC - and when it reaches a threshold, will communicate to the Operator that it needs to be resized. The Operator then updates the size of the PVC resource.
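A rough sketch of what such a sidecar could look like; the image, mount path, and threshold are assumptions, and a real implementation would report back to the Operator rather than just logging:

```yaml
# Hypothetical sidecar sketch: polls disk usage of the data volume and flags
# when it crosses a threshold, mirroring the Postgres-operator pattern above.
- name: disk-watcher
  image: busybox:1.36        # illustrative
  volumeMounts:
    - name: data
      mountPath: /data
  command: ["sh", "-c"]
  args:
    - |
      while true; do
        used=$(df -P /data | awk 'NR==2 { gsub("%",""); print $5 }')
        # A real sidecar would signal the Operator (e.g. via a metrics
        # endpoint or status update) instead of logging.
        [ "$used" -gt 80 ] && echo "PVC resize needed: ${used}% used"
        sleep 60
      done
```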
Custom controller for the Prometheus Operator
The Prometheus Operator creates Prometheus resources via StatefulSets and manages PVCs via the `volumeClaimTemplate`. In the weekly meeting on 2025-10-31, a contributor mentioned that they have a custom controller that watches the PVCs and resizes them if necessary.

Services
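Whichever component drives it, the resize itself is just an update to the PVC's storage request (sketch below, illustrative names); it requires a StorageClass with `allowVolumeExpansion: true`, and shrinking is not supported:

```yaml
# Sketch: growing an existing PVC by raising its storage request.
# Only works if the backing StorageClass has allowVolumeExpansion: true.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: valkey-node-0-data    # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi           # raised from 10Gi; shrinking is not supported
```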
Valkey Sentinel requires that nodes in an HA pair (including the Sentinel workloads themselves) all announce themselves at a consistent address (whether IP or hostname); otherwise, stale replicas can affect failovers and quorum. Pods in Kubernetes have ephemeral IP addresses and, in the case of Deployments, ephemeral pod names.
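One way to get a stable endpoint despite Pod churn is a Service per pod; a sketch with illustrative names (the `statefulset.kubernetes.io/pod-name` label is set automatically on StatefulSet pods, so this selector only works for StatefulSet-managed Pods):

```yaml
# Sketch: one Service per Valkey pod, giving it a stable ClusterIP and DNS name.
apiVersion: v1
kind: Service
metadata:
  name: valkey-node-0        # illustrative name
spec:
  selector:
    statefulset.kubernetes.io/pod-name: valkey-node-0
  ports:
    - port: 6379
      targetPort: 6379
```

The pod would then announce the Service's DNS name or ClusterIP via `replica-announce-ip`, rather than its own ephemeral pod IP.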
A common practice is to have a Service mapped to each pod to provide a static ClusterIP and hostname. By having Valkey pods discover and announce their availability (via `replica-announce-ip`) at these static endpoints, rolled Pods can reuse the same Service names and IP addresses for consistency across rolls.

Open Questions
Implementation