[Design] Pod management design #14
We agree that, regardless of whether a Deployment or StatefulSet is used, it will have no more than 1 replica, so that each workload is a singleton. This provides the most flexibility in terms of pod placement and orchestration.
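A minimal sketch of that singleton shape, where all names and the image tag are illustrative assumptions rather than the project's actual manifests:

```yaml
# Sketch only: a singleton Valkey workload as a single-replica StatefulSet.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: valkey-node-0        # illustrative name
spec:
  replicas: 1                # never scaled above 1; each node is its own workload
  serviceName: valkey-node-0
  selector:
    matchLabels:
      app: valkey
      node: valkey-node-0
  template:
    metadata:
      labels:
        app: valkey
        node: valkey-node-0
    spec:
      containers:
        - name: valkey
          image: valkey/valkey:8   # illustrative tag
          ports:
            - containerPort: 6379
```

Scaling out then means creating more single-replica workloads rather than raising `replicas`, which keeps each node's placement and lifecycle independent.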
A suggestion was made in the meeting on 2025-10-31 that we could abstract away the technical implementation of Deployment vs StatefulSet with a CR. I like this idea, but I wonder whether `Node` conflicts too much with the built-in Kubernetes Node. Given that Valkey uses the node terminology, I think we should be consistent.
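For illustration only, such a CR could look like the sketch below; the API group, kind name, and every field are assumptions (the kind name in particular is the open naming question):

```yaml
# Hypothetical sketch of a CR that hides the Deployment-vs-StatefulSet choice.
# The apiVersion group, kind, and all fields here are assumptions.
apiVersion: valkey.example.com/v1alpha1
kind: ValkeyNode             # or "Node" — the naming conflict under discussion
metadata:
  name: shard-0-primary
spec:
  workloadKind: StatefulSet  # implementation detail the CR would abstract away
  storage:
    size: 10Gi
```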
Summary
Discussion for how the underlying Valkey workloads (via Pods) will be managed.
There are several ways of doing this:
- StatefulSets
- Deployments
- Direct Pod management
The following sections will outline the pros and cons of each of them.
Motivation
Agreeing on the design to use will allow us to provide a unified approach (and abstraction) for the management of Valkey workloads across the available deployment methods (Cluster, Sentinel, etc.).
Detailed Design
Approaches
StatefulSets
Pros
- Built-in per-replica PVC management via `volumeClaimTemplates`
Cons
- `volumeClaimTemplates` is an immutable field, commonly making PVC resizes challenging

Deployment
Pros
Cons
Direct Pod Management
This is what TimeScale does:
Pros
Cons
Ultimately we won't go down the route of direct pod management, for this reason: we do not want an unhealthy Operator preventing Pods from being scheduled when that responsibility can be delegated to a built-in Kubernetes controller (StatefulSet/Deployment). This model may work for other databases that have a support model tied to them; the Valkey Operator is a community-backed effort with no support guarantees.
Discussion points
Resizing PVCs
There are a couple of ways for us to tackle this:
Operator managed PVC
This decouples the PVC from the StatefulSet. By the same logic, we could instead attach the PVC via the pod spec in a Deployment's template.
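A sketch of that decoupled shape, with illustrative names and sizes; the Operator would own the PVC object itself rather than a `volumeClaimTemplates` entry:

```yaml
# Sketch only: an Operator-managed PVC referenced directly from a Deployment's
# pod template. Names, sizes, and the image tag are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: valkey-node-0-data   # created and resized by the Operator, not the workload
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: valkey-node-0
spec:
  replicas: 1
  selector:
    matchLabels:
      node: valkey-node-0
  template:
    metadata:
      labels:
        node: valkey-node-0
    spec:
      containers:
        - name: valkey
          image: valkey/valkey:8   # illustrative tag
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: valkey-node-0-data   # the decoupled PVC above
```

Because the PVC is a standalone object, resizing it is just a spec update, provided the StorageClass allows volume expansion.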
An operator for Postgres has a sidecar container that (amongst other things) monitors the disk usage of the PVC - and when it reaches a threshold, will communicate to the Operator that it needs to be resized. The Operator then updates the size of the PVC resource.
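A rough sketch of what such a sidecar could look like; the image, mount path, and threshold are assumptions, and a real implementation would report back to the Operator rather than just logging:

```yaml
# Hypothetical sidecar sketch: polls disk usage of the data volume and flags
# when it crosses a threshold, mirroring the Postgres-operator pattern above.
- name: disk-watcher
  image: busybox:1.36        # illustrative
  volumeMounts:
    - name: data
      mountPath: /data
  command: ["sh", "-c"]
  args:
    - |
      while true; do
        used=$(df -P /data | awk 'NR==2 { gsub("%",""); print $5 }')
        # A real sidecar would signal the Operator (e.g. via a metrics
        # endpoint or status update) instead of logging.
        [ "$used" -gt 80 ] && echo "PVC resize needed: ${used}% used"
        sleep 60
      done
```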
Custom controller for the Prometheus Operator
The Prometheus Operator creates Prometheus resources via StatefulSets and manages PVCs via the `volumeClaimTemplate`. In the weekly meeting on 2025-10-31, a contributor mentioned that they have a custom controller that watches the PVCs and resizes them if necessary.

Services
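Whichever component drives it, the resize itself is just an update to the PVC's storage request (sketch below, illustrative names); it requires a StorageClass with `allowVolumeExpansion: true`, and shrinking is not supported:

```yaml
# Sketch: growing an existing PVC by raising its storage request.
# Only works if the backing StorageClass has allowVolumeExpansion: true.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: valkey-node-0-data    # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi           # raised from 10Gi; shrinking is not supported
```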
Valkey Sentinel requires that nodes in an HA pair (including the Sentinel workloads themselves) all announce themselves at a consistent address (whether IP or hostname); otherwise, stale replicas can affect failovers and quorum. Pods in Kubernetes have ephemeral IP addresses and, in the case of Deployments, ephemeral pod names.
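One way to get a stable endpoint despite Pod churn is a Service per pod; a sketch with illustrative names (the `statefulset.kubernetes.io/pod-name` label is set automatically on StatefulSet pods, so this selector only works for StatefulSet-managed Pods):

```yaml
# Sketch: one Service per Valkey pod, giving it a stable ClusterIP and DNS name.
apiVersion: v1
kind: Service
metadata:
  name: valkey-node-0        # illustrative name
spec:
  selector:
    statefulset.kubernetes.io/pod-name: valkey-node-0
  ports:
    - port: 6379
      targetPort: 6379
```

The pod would then announce the Service's DNS name or ClusterIP via `replica-announce-ip`, rather than its own ephemeral pod IP.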
A common practice is to have a Service mapped to each pod to provide a static ClusterIP and hostname. By having Valkey pods discover and announce their availability (via `replica-announce-ip`) at these static endpoints, rolled Pods can reuse the same Service names and IP addresses for consistency across rolls.

Open Questions
Implementation