Commit 8c3147e

feat: extend concurrency limit to all reconcilers
Extend the concurrency control from Stack-only to ALL reconcilers (Stacks, Modules, and Resources) to truly prevent "big bang" deployments.

Key changes:
- Move concurrency config to core package (internal/core/concurrency.go)
- Apply MaxConcurrentReconciles to all Module reconcilers:
  * Analytics, Auth, Gateway, Ledger, Orchestration
  * Payments, Reconciliation, Search, Stargate
  * Wallets, Webhooks
- Rename env var: STACK_MAX_CONCURRENT → MAX_CONCURRENT_RECONCILES
- Rename Helm param: stackMaxConcurrent → maxConcurrentReconciles
- Update documentation to reflect global scope

This now controls:
- Stack reconciliations (namespace creation, config updates)
- Module reconciliations (Ledger, Payments, etc. deployments)
- Resource reconciliations (Database, Broker management)

Default remains 5 concurrent reconciliations for all resource types.
1 parent c800fc8 commit 8c3147e

17 files changed

+70
-54
lines changed

helm/operator/STACK_CONCURRENCY.md

Lines changed: 32 additions & 28 deletions
@@ -1,8 +1,8 @@
-# Stack Concurrency Configuration
+# Concurrency Control Configuration
 
 ## Overview
 
-Control the number of Stack reconciliations that run in parallel to prevent cluster overload.
+Control the number of concurrent reconciliations (Stacks, Modules, and other resources) that run in parallel to prevent cluster overload and manage deployment pace.
 
 ## Configuration
 
@@ -12,14 +12,14 @@ Edit your `values.yaml` or use `--set`:
 
 ```yaml
 operator:
-  stackMaxConcurrent: 5 # Max 5 concurrent stack reconciliations
+  maxConcurrentReconciles: 5 # Max 5 concurrent reconciliations (all resources)
 ```
 
 Or with Helm command:
 
 ```bash
 helm install operator ./helm/operator \
-  --set operator.stackMaxConcurrent=5
+  --set operator.maxConcurrentReconciles=5
 ```
 
 ### Default Behavior
@@ -43,7 +43,7 @@ helm install operator ./helm/operator \
 ```yaml
 # values.yaml
 operator:
-  stackMaxConcurrent: 3
+  maxConcurrentReconciles: 3
 ```
 
 ```bash
@@ -55,7 +55,7 @@ helm upgrade operator ./helm/operator -f values.yaml
 ```yaml
 # values-prod.yaml
 operator:
-  stackMaxConcurrent: 10
+  maxConcurrentReconciles: 10
   enableLeaderElection: true
   region: "eu-west-1"
   env: "production"
@@ -69,16 +69,16 @@ helm upgrade operator ./helm/operator -f values-prod.yaml
 
 ```bash
 helm upgrade operator ./helm/operator \
-  --set operator.stackMaxConcurrent=5 \
+  --set operator.maxConcurrentReconciles=5 \
   --set operator.region=us-east-1
 ```
 
 ## How It Works
 
-1. The Helm chart sets the `STACK_MAX_CONCURRENT` environment variable
+1. The Helm chart sets the `MAX_CONCURRENT_RECONCILES` environment variable
 2. The operator reads this value on startup
-3. Stack reconciliations are limited to N concurrent executions
-4. Additional stacks are queued automatically by Kubernetes
+3. All reconciliations (Stacks, Modules like Ledger/Payments, etc.) are limited to N concurrent executions
+4. Additional reconciliations are queued automatically by Kubernetes
 
 ### Behavior
 
@@ -111,12 +111,12 @@ Check if the environment variable is set:
 POD=$(kubectl get pods -n formance-system -l control-plane=formance-controller-manager -o jsonpath='{.items[0].metadata.name}')
 
 # Check environment variables
-kubectl exec -n formance-system $POD -- env | grep STACK_MAX_CONCURRENT
+kubectl exec -n formance-system $POD -- env | grep MAX_CONCURRENT_RECONCILES
 ```
 
 Expected output:
 ```
-STACK_MAX_CONCURRENT=5
+MAX_CONCURRENT_RECONCILES=5
 ```
 
 ## Troubleshooting
@@ -130,7 +130,7 @@ STACK_MAX_CONCURRENT=5
 
 2. **Verify deployment:**
 ```bash
-kubectl get deployment operator-manager -n formance-system -o yaml | grep -A 2 "STACK_MAX_CONCURRENT"
+kubectl get deployment operator-manager -n formance-system -o yaml | grep -A 2 "MAX_CONCURRENT_RECONCILES"
 ```
 
 3. **Restart pods to apply changes:**
@@ -176,17 +176,17 @@ kubectl logs -n formance-system -l control-plane=formance-controller-manager -f
 ```yaml
 # values-dev.yaml
 operator:
-  stackMaxConcurrent: 2
+  maxConcurrentReconciles: 2
   env: "dev"
 
 # values-staging.yaml
 operator:
-  stackMaxConcurrent: 5
+  maxConcurrentReconciles: 5
   env: "staging"
 
 # values-prod.yaml
 operator:
-  stackMaxConcurrent: 10
+  maxConcurrentReconciles: 10
   env: "production"
 ```
 
@@ -208,7 +208,7 @@ spec:
     helm:
       values: |
         operator:
-          stackMaxConcurrent: 5
+          maxConcurrentReconciles: 5
           region: "eu-west-1"
           env: "production"
 ```
@@ -217,26 +217,30 @@ spec:
 
 ### Implementation
 
-- **Environment Variable:** `STACK_MAX_CONCURRENT`
-- **Read by:** `internal/resources/stacks/config.go::GetStackConcurrency()`
-- **Applied in:** `internal/resources/stacks/init.go`
+- **Environment Variable:** `MAX_CONCURRENT_RECONCILES`
+- **Read by:** `internal/core/concurrency.go::GetMaxConcurrentReconciles()`
+- **Applied in:** All reconcilers (Stacks, Modules, Resources)
 - **Uses:** Native controller-runtime `MaxConcurrentReconciles`
 
 ### Source Code
 
 ```go
-// internal/resources/stacks/config.go
-func GetStackConcurrency() int {
-	if v := os.Getenv("STACK_MAX_CONCURRENT"); v != "" {
-		if n, err := strconv.Atoi(v); err == nil && n > 0 {
+// internal/core/concurrency.go
+func GetMaxConcurrentReconciles() int {
+	if v := os.Getenv("MAX_CONCURRENT_RECONCILES"); v != "" {
+		if n, err := strconv.Atoi(v); err == nil && n >= 0 {
 			return n
 		}
 	}
-	return 0 // Default: unlimited
+	return 5 // Default: 5 concurrent reconciliations
 }
 ```
 
-## Related Documentation
+## What This Controls
 
-- [How to Limit Concurrent Stacks](../../docs/HOW_TO_LIMIT_CONCURRENT_STACKS.md)
-- [Concurrent Limit Implementation](../../CONCURRENT_LIMIT_IMPLEMENTATION.md)
+This setting limits **all types** of reconciliations:
+- **Stack reconciliations**: Namespace creation, configuration updates
+- **Module reconciliations**: Ledger, Payments, Wallets, Gateway, etc. deployments
+- **Resource reconciliations**: Database, Broker, BrokerTopic management
+
+This prevents "big bang" deployments where all resources are processed simultaneously.

helm/operator/templates/deployment.yaml

Lines changed: 3 additions & 3 deletions
@@ -81,10 +81,10 @@ spec:
               port: {{ regexReplaceAll ":" .Values.operator.probeAddr "" | default "8081" }}
             initialDelaySeconds: 5
             periodSeconds: 10
-          {{- if .Values.operator.stackMaxConcurrent }}
+          {{- if .Values.operator.maxConcurrentReconciles }}
           env:
-            - name: STACK_MAX_CONCURRENT
-              value: {{ .Values.operator.stackMaxConcurrent | quote }}
+            - name: MAX_CONCURRENT_RECONCILES
+              value: {{ .Values.operator.maxConcurrentReconciles | quote }}
           {{- end }}
           resources:
             {{- toYaml .Values.resources | nindent 12 }}

helm/operator/values.yaml

Lines changed: 3 additions & 3 deletions
@@ -49,11 +49,11 @@ operator:
   # Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.
   enableLeaderElection: true
 
-  # Maximum number of concurrent stack reconciliations
+  # Maximum number of concurrent reconciliations (applies to all resources: stacks, modules, etc.)
   # Set to 0 for unlimited
   # Recommended values: 5 for small/medium clusters, 10 for large, 20 for XL
-  # @section -- Stack Concurrency
-  stackMaxConcurrent: 5
+  # @section -- Concurrency Control
+  maxConcurrentReconciles: 5
 
   utils:
     tag: ""

internal/core/concurrency.go

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+package core
+
+import (
+	"os"
+	"strconv"
+)
+
+// GetMaxConcurrentReconciles returns the maximum number of concurrent reconciliations
+// from the MAX_CONCURRENT_RECONCILES environment variable, or a default value of 5
+func GetMaxConcurrentReconciles() int {
+	if v := os.Getenv("MAX_CONCURRENT_RECONCILES"); v != "" {
+		if n, err := strconv.Atoi(v); err == nil && n >= 0 {
+			return n
+		}
+	}
+	// Default: 5 concurrent reconciliations (good balance for most clusters)
+	return 5
+}

internal/resources/analytics/init.go

Lines changed: 3 additions & 1 deletion
@@ -31,6 +31,8 @@ func Reconcile(_ Context, _ *v1beta1.Stack, _ *v1beta1.Analytics, _ string) erro
 
 func init() {
 	Init(
-		WithModuleReconciler(Reconcile),
+		WithModuleReconciler(Reconcile,
+			WithMaxConcurrentReconciles[*v1beta1.Analytics](GetMaxConcurrentReconciles()),
+		),
 	)
 }

internal/resources/auths/init.go

Lines changed: 1 addition & 0 deletions
@@ -110,6 +110,7 @@ func Reconcile(ctx Context, stack *v1beta1.Stack, auth *v1beta1.Auth, version st
 func init() {
 	Init(
 		WithModuleReconciler(Reconcile,
+			WithMaxConcurrentReconciles[*v1beta1.Auth](GetMaxConcurrentReconciles()),
 			WithOwn[*v1beta1.Auth](&appsv1.Deployment{}),
 			WithOwn[*v1beta1.Auth](&v1beta1.GatewayHTTPAPI{}),
 			WithOwn[*v1beta1.Auth](&v1beta1.Database{}),

internal/resources/gateways/init.go

Lines changed: 1 addition & 0 deletions
@@ -87,6 +87,7 @@ func Reconcile(ctx Context, stack *v1beta1.Stack, gateway *v1beta1.Gateway, vers
 func init() {
 	Init(
 		WithModuleReconciler(Reconcile,
+			WithMaxConcurrentReconciles[*v1beta1.Gateway](GetMaxConcurrentReconciles()),
 			WithOwn[*v1beta1.Gateway](&corev1.ConfigMap{}),
 			WithOwn[*v1beta1.Gateway](&appsv1.Deployment{}),
 			WithOwn[*v1beta1.Gateway](&corev1.Service{}),

internal/resources/ledgers/init.go

Lines changed: 1 addition & 0 deletions
@@ -150,6 +150,7 @@ func Reconcile(ctx Context, stack *v1beta1.Stack, ledger *v1beta1.Ledger, versio
 func init() {
 	Init(
 		WithModuleReconciler(Reconcile,
+			WithMaxConcurrentReconciles[*v1beta1.Ledger](GetMaxConcurrentReconciles()),
 			WithOwn[*v1beta1.Ledger](&appsv1.Deployment{}),
 			WithOwn[*v1beta1.Ledger](&batchv1.Job{}),
 			WithOwn[*v1beta1.Ledger](&corev1.Service{}),

internal/resources/orchestrations/init.go

Lines changed: 1 addition & 0 deletions
@@ -88,6 +88,7 @@ func Reconcile(ctx Context, stack *v1beta1.Stack, o *v1beta1.Orchestration, vers
 func init() {
 	Init(
 		WithModuleReconciler(Reconcile,
+			WithMaxConcurrentReconciles[*v1beta1.Orchestration](GetMaxConcurrentReconciles()),
 			WithOwn[*v1beta1.Orchestration](&v1beta1.BrokerConsumer{}),
 			WithOwn[*v1beta1.Orchestration](&v1beta1.AuthClient{}),
 			WithOwn[*v1beta1.Orchestration](&appsv1.Deployment{}),

internal/resources/payments/init.go

Lines changed: 1 addition & 0 deletions
@@ -126,6 +126,7 @@ func Reconcile(ctx Context, stack *v1beta1.Stack, p *v1beta1.Payments, version s
 func init() {
 	Init(
 		WithModuleReconciler(Reconcile,
+			WithMaxConcurrentReconciles[*v1beta1.Payments](GetMaxConcurrentReconciles()),
 			WithFinalizer[*v1beta1.Payments]("clean-payments", Clean),
 			WithOwn[*v1beta1.Payments](&appsv1.Deployment{}),
 			WithOwn[*v1beta1.Payments](&corev1.Service{}),
