
Conversation

@justinyeh1995
Contributor

@justinyeh1995 justinyeh1995 commented Oct 30, 2025

Resolve int32 overflow by having the calculation in int64 and cap it if the count is over math.MaxInt32

Why are these changes needed?

The CalculateMaxReplicas function did not guard against integer overflow, which could result in an incorrect value for Status.MaxWorkerReplicas, as reported in #4153.
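With the defaulted maxReplicas of math.MaxInt32 (2147483647) and numOfHosts: 4 from the reported spec, the int32 multiplication wraps to exactly the -4 seen in the status below. A minimal standalone Go sketch of the wraparound (illustration only, not the KubeRay code itself):

package main

import (
	"fmt"
	"math"
)

func main() {
	maxReplicas := int32(math.MaxInt32) // the defaulted spec value
	numOfHosts := int32(4)              // numOfHosts from the reported spec
	// Signed int32 arithmetic wraps: (2^31-1) * 4 mod 2^32 == -4.
	fmt.Println(maxReplicas * numOfHosts) // prints -4
}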

$ kubectl get raycluster raycluster-kuberay -o yaml
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"ray.io/v1","kind":"RayCluster","metadata":{"annotations":{},"name":"raycluster-kuberay","namespace":"default"},"spec":{"headGroupSpec":{"rayStartParams":{},"template":{"spec":{"containers":[{"image":"rayproject/ray:2.46.0","name":"ray-head","ports":[{"containerPort":6379,"name":"gcs-server"},{"containerPort":8265,"name":"dashboard"},{"containerPort":10001,"name":"client"}],"resources":{"limits":{"cpu":1,"memory":"2G"},"requests":{"cpu":1,"memory":"2G"}}}],"schedulerName":"default-scheduler"}}},"rayVersion":"2.46.0","workerGroupSpecs":[{"groupName":"workergroup","minReplicas":3,"numOfHosts":4,"rayStartParams":{},"replicas":1,"template":{"spec":{"containers":[{"image":"rayproject/ray:2.46.0","name":"ray-worker","resources":{"limits":{"cpu":1,"memory":"1G"},"requests":{"cpu":1,"memory":"1G"}}}]}}}]}}
  creationTimestamp: "2025-10-28T16:05:15Z"
  generation: 1
  name: raycluster-kuberay
  namespace: default
  resourceVersion: "13168"
  uid: c83e5bd6-c5ee-4a22-b615-341fee75725a
....
status:
  availableWorkerReplicas: 12
  conditions:
  - lastTransitionTime: "2025-10-28T16:05:36Z"
    message: ""
    reason: HeadPodRunningAndReady
    status: "True"
    type: HeadPodReady
  - lastTransitionTime: "2025-10-28T16:06:43Z"
    message: All Ray Pods are ready for the first time
    reason: AllPodRunningAndReadyFirstTime
    status: "True"
    type: RayClusterProvisioned
  - lastTransitionTime: "2025-10-28T16:05:36Z"
    message: ""
    reason: RayClusterSuspended
    status: "False"
    type: RayClusterSuspended
  - lastTransitionTime: "2025-10-28T16:05:36Z"
    message: ""
    reason: RayClusterSuspending
    status: "False"
    type: RayClusterSuspending
  desiredCPU: "5"
  desiredGPU: "0"
  desiredMemory: 6G
  desiredTPU: "0"
  desiredWorkerReplicas: 12
  endpoints:
    client: "10001"
    dashboard: "8265"
    gcs-server: "6379"
    metrics: "8080"
  head:
    podIP: 10.244.0.34
    podName: raycluster-kuberay-head-g5wzm
    serviceIP: 10.244.0.34
    serviceName: raycluster-kuberay-head-svc
  lastUpdateTime: "2025-10-28T16:06:43Z"
  maxWorkerReplicas: -4                <---- overflowed
  minWorkerReplicas: 12
  observedGeneration: 1
  readyWorkerReplicas: 12
  state: ready
  stateTransitionTimes:
    ready: "2025-10-28T16:06:43Z"

After the fix, the computed maximum replica count is capped at 2,147,483,647 (math.MaxInt32) instead of wrapping around.

$ kubectl get raycluster raycluster-kuberay -o yaml

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"ray.io/v1","kind":"RayCluster","metadata":{"annotations":{},"name":"raycluster-kuberay","namespace":"default"},"spec":{"headGroupSpec":{"rayStartParams":{},"template":{"spec":{"containers":[{"image":"rayproject/ray:2.46.0","name":"ray-head","ports":[{"containerPort":6379,"name":"gcs-server"},{"containerPort":8265,"name":"dashboard"},{"containerPort":10001,"name":"client"}],"resources":{"limits":{"cpu":1,"memory":"2G"},"requests":{"cpu":1,"memory":"2G"}}}],"schedulerName":"default-scheduler"}}},"rayVersion":"2.46.0","workerGroupSpecs":[{"groupName":"workergroup","minReplicas":3,"numOfHosts":4,"rayStartParams":{},"replicas":1,"template":{"spec":{"containers":[{"image":"rayproject/ray:2.46.0","name":"ray-worker","resources":{"limits":{"cpu":1,"memory":"1G"},"requests":{"cpu":1,"memory":"1G"}}}]}}}]}}
  creationTimestamp: "2025-10-30T10:53:15Z"
  generation: 1
  name: raycluster-kuberay
  namespace: default
  resourceVersion: "2885229"
  uid: 1be0d901-2c48-45e1-a3ef-fbfc6f4f6d45
spec:
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
        - image: rayproject/ray:2.46.0
          name: ray-head
          ports:
          - containerPort: 6379
            name: gcs-server
            protocol: TCP
          - containerPort: 8265
            name: dashboard
            protocol: TCP
          - containerPort: 10001
            name: client
            protocol: TCP
          resources:
            limits:
              cpu: 1
              memory: 2G
            requests:
              cpu: 1
              memory: 2G
        schedulerName: default-scheduler
  rayVersion: 2.46.0
  workerGroupSpecs:
  - groupName: workergroup
    maxReplicas: 2147483647              <---- caps at 2147483647
    minReplicas: 3
    numOfHosts: 4
    rayStartParams: {}
    replicas: 1
    template:
      spec:
        containers:
        - image: rayproject/ray:2.46.0
          name: ray-worker
          resources:
            limits:
              cpu: 1
              memory: 1G
            requests:
              cpu: 1
              memory: 1G

Related issue number

Closes #4153

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@justinyeh1995 justinyeh1995 marked this pull request as ready for review October 30, 2025 23:34
@justinyeh1995 justinyeh1995 marked this pull request as draft October 30, 2025 23:37
@justinyeh1995 justinyeh1995 marked this pull request as ready for review October 30, 2025 23:37
@justinyeh1995
Contributor Author

justinyeh1995 commented Oct 30, 2025

@win5923 @Future-Outlier I somehow could not add you as reviewers, so I will ping you here instead. Please take a look. Thanks!

Collaborator

@win5923 win5923 left a comment


LGTM, do we also need to check overflow for CalculateMinReplicas?
@Future-Outlier
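
The same pattern would apply there. A hedged sketch of what that could look like, mirroring the CalculateMaxReplicas fix discussed below (the actual CalculateMinReplicas signature and fields are assumed by symmetry):

func CalculateMinReplicas(cluster *rayv1.RayCluster) int32 {
	count := int64(0)
	for _, nodeGroup := range cluster.Spec.WorkerGroupSpecs {
		if nodeGroup.Suspend != nil && *nodeGroup.Suspend {
			continue
		}
		// Do the arithmetic in int64 so it cannot wrap in int32.
		count += int64(*nodeGroup.MinReplicas) * int64(nodeGroup.NumOfHosts)
	}
	// Clamp back down to the int32 range (helper introduced below).
	return SafeInt64ToInt32(count)
}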

Collaborator

@troychiu troychiu left a comment


Can we do something like

func SafeUint64ToInt64(n uint64) int64 {

to avoid the #nosec directive?
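
A bounds-checked conversion along those lines might look like this (a sketch; only the signature comes from the comment, the clamping body is an assumption about the intent):

func SafeUint64ToInt64(n uint64) int64 {
	// Clamp instead of casting blindly, so no #nosec suppression is needed.
	if n > math.MaxInt64 {
		return math.MaxInt64
	}
	return int64(n)
}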

@justinyeh1995
Copy link
Contributor Author

Can we do something like

func SafeUint64ToInt64(n uint64) int64 {

to avoid the #nosec directive?

I think we can.

Say we have another function

func SafeInt64ToInt32(n int64) int32 

we can change the logic into something like this:

func CalculateMaxReplicas(cluster *rayv1.RayCluster) int32 {
	count := int64(0)
	for _, nodeGroup := range cluster.Spec.WorkerGroupSpecs {
		if nodeGroup.Suspend != nil && *nodeGroup.Suspend {
			continue
		}
		count += int64(*nodeGroup.MaxReplicas) * int64(nodeGroup.NumOfHosts)

		// stop the calculation if an overflow happens
		if count > math.MaxInt32 {
			break
		}
	}

	return SafeInt64ToInt32(count)
}

Do you think that would be a reasonable way to approach it?

@troychiu
Collaborator

troychiu commented Nov 4, 2025

yeah IMO that's better

Contributor

@400Ping 400Ping left a comment


LGTM

@justinyeh1995
Contributor Author

yeah IMO that's better

Sounds good. I will update this section accordingly.

func CapInt64ToInt32(n int64) int32 {
	if n > math.MaxInt32 {
		return math.MaxInt32
	}
	return int32(n)
}
Collaborator


You may need a minimum check to avoid the lint error. Also, could you follow the naming convention for the function?

Contributor Author


Thanks for the feedback! I'll add the minimum check and see if it resolves the lint error, and rename the function to SafeInt64ToInt32 as well.
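
For reference, the clamped helper with both checks would look roughly like this (a sketch of the intended shape; the merged implementation may differ in detail):

func SafeInt64ToInt32(n int64) int32 {
	// Clamp to the int32 range so the conversion can never overflow or
	// underflow; the explicit bounds also satisfy the linter without #nosec.
	if n > math.MaxInt32 {
		return math.MaxInt32
	}
	if n < math.MinInt32 {
		return math.MinInt32
	}
	return int32(n)
}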

-		count += (*nodeGroup.MaxReplicas * nodeGroup.NumOfHosts)
+		count += int64(*nodeGroup.MaxReplicas) * int64(nodeGroup.NumOfHosts)

		// early return if an overflow happens
Collaborator


Do we still need this early return if we're doing a safe type cast?

Contributor Author


I think that's a judgment call. The early return is a non-critical optimization, but it's not as clean as Option 1. I'm happy to go either way.

Option 1

func CalculateMaxReplicas(cluster *rayv1.RayCluster) int32 {
	count := int64(0)
	for _, nodeGroup := range cluster.Spec.WorkerGroupSpecs {
		if nodeGroup.Suspend != nil && *nodeGroup.Suspend {
			continue
		}
		count += int64(*nodeGroup.MaxReplicas) * int64(nodeGroup.NumOfHosts)
	}
	return SafeInt64ToInt32(count)
}

Option 2

func CalculateMaxReplicas(cluster *rayv1.RayCluster) int32 {
	count := int64(0)
	for _, nodeGroup := range cluster.Spec.WorkerGroupSpecs {
		if nodeGroup.Suspend != nil && *nodeGroup.Suspend {
			continue
		}
		count += int64(*nodeGroup.MaxReplicas) * int64(nodeGroup.NumOfHosts)

		// early return if an overflow happens
		if count > math.MaxInt32 {
			return math.MaxInt32
		}
	}
	return SafeInt64ToInt32(count)
}

Collaborator


I'd go for Option 1, but if others think Option 2 is better, I'm fine with that.

Contributor Author


I see. Now that SafeInt64ToInt32 handles the overflow/underflow prevention, I think Option 1 is much cleaner.
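
A quick unit check of the clamping under the reported configuration might look like this (a hypothetical test sketch; the test name and placement are assumptions):

func TestSafeInt64ToInt32Clamps(t *testing.T) {
	// int64(math.MaxInt32) * 4 mirrors the reported cluster (defaulted
	// maxReplicas with numOfHosts: 4); it must clamp rather than wrap to -4.
	if got := SafeInt64ToInt32(int64(math.MaxInt32) * 4); got != math.MaxInt32 {
		t.Errorf("got %d, want math.MaxInt32", got)
	}
	if got := SafeInt64ToInt32(int64(math.MinInt32) - 1); got != math.MinInt32 {
		t.Errorf("got %d, want math.MinInt32", got)
	}
}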

[Refactor] Rename function to SafeInt64ToInt32 and add underflow prevention (it also helps pass the lint test)

Signed-off-by: justinyeh1995 <[email protected]>
[Refactor] Remove the early return as SafeInt64ToInt32 handles the int32 overflow and underflow checking.

Signed-off-by: justinyeh1995 <[email protected]>
Collaborator

@troychiu troychiu left a comment


Thank you!

@rueian rueian merged commit b51f885 into ray-project:master Nov 7, 2025
27 checks passed
@justinyeh1995 justinyeh1995 deleted the fix/4153-operator-maxreplicas-overflow branch November 7, 2025 02:38
andrewsykim added a commit that referenced this pull request Nov 21, 2025
* [Bug] Sidecar mode shouldn't restart head pod when head pod is deleted (#4141)

* [Bug] Sidecar mode shouldn't restart head pod when head pod is deleted

Signed-off-by: 400Ping <[email protected]>

* [Fix] Fix e2e error

Signed-off-by: 400Ping <[email protected]>

* [Fix] fix according to rueian's comment

Signed-off-by: 400Ping <[email protected]>

* [Chore] fix ci error

Signed-off-by: 400Ping <[email protected]>

* Update ray-operator/controllers/ray/raycluster_controller.go

Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]>
Signed-off-by: Ping <[email protected]>

* Update ray-operator/controllers/ray/rayjob_controller.go

Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]>
Signed-off-by: Ping <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* Trigger CI

Signed-off-by: Future-Outlier <[email protected]>

---------

Signed-off-by: 400Ping <[email protected]>
Signed-off-by: Ping <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]>

* fix: dashboard build for kuberay 1.5.0 (#4161)

Signed-off-by: Future-Outlier <[email protected]>

* [Feature Enhancement] Set ordered replica index label to support multi-slice (#4163)

* [Feature Enhancement] Set ordered replica index label to support multi-slice

Signed-off-by: Ryan O'Leary <[email protected]>

* rename replica-id -> replica-name

Signed-off-by: Ryan O'Leary <[email protected]>

* Separate replica index feature gate logic

Signed-off-by: Ryan O'Leary <[email protected]>

* remove index arg in createWorkerPod

Signed-off-by: Ryan O'Leary <[email protected]>

---------

Signed-off-by: Ryan O'Leary <[email protected]>

* update stale feature gate comments (#4174)

Signed-off-by: Andrew Sy Kim <[email protected]>

* [RayCluster] Add more context why we don't recreate head Pod for RayJob (#4175)

Signed-off-by: Kai-Hsun Chen <[email protected]>

* feature: Remove empty resource list initialization. (#4168)

Fixes #4142.

* [Dockerfile] [KubeRay Dashboard]: Fix Dockerfile warnings (ENV format, CMD JSON args) (#4167)

* [#4166] improvement: Fix Dockerfile warnings (ENV format, CMD JSON args)

* extract the hostname from CMD

Signed-off-by: Neo Chien <[email protected]>

---------

Signed-off-by: Neo Chien <[email protected]>
Co-authored-by: cchung100m <[email protected]>

* [Fix] Resolve int32 overflow by having the calculation in int64 and c… (#4158)

* [Fix] Resolve int32 overflow by having the calculation in int64 and cap it if the count is over math.MaxInt32

Signed-off-by: justinyeh1995 <[email protected]>

* [Test] Add unit tests for CalculateReadyReplicas

Signed-off-by: justinyeh1995 <[email protected]>

* [Fix] Add a nosec comment to pass the Lint (pre-commit) test

Signed-off-by: justinyeh1995 <[email protected]>

* [Refactor] Add CapInt64ToInt32 to replace #nosec directives

Signed-off-by: justinyeh1995 <[email protected]>

* [Refactor] Rename function to SafeInt64ToInt32 and add a underflowing prevention (it also help pass the lint test)

Signed-off-by: justinyeh1995 <[email protected]>

* [Refactor] Remove the early return as SafeInt64ToInt32 handles the int32 overflow and underflow checking.

Signed-off-by: justinyeh1995 <[email protected]>

---------

Signed-off-by: justinyeh1995 <[email protected]>

* Add RayService incremental upgrade sample for guide (#4164)

Signed-off-by: Ryan O'Leary <[email protected]>

* Edit RayCluster example config for label selectors (#4151)

Signed-off-by: Ryan O'Leary <[email protected]>

* [RayJob] update light weight submitter image from quay.io (#4181)

Signed-off-by: Future-Outlier <[email protected]>

* [flaky] RayJob fails when head Pod is deleted when job is running (#4182)

Signed-off-by: Future-Outlier <[email protected]>

* [CI] Pin Docker api version to avoid API version mismatch (#4188)

Signed-off-by: win5923 <[email protected]>

* Make replicas configurable for kuberay-operator #4180 (#4195)

* Make replicas configurable for kuberay-operator #4180

* Make replicas configurable for kuberay-operator #4180

* [Fix] rayjob update raycluster status (#4192)

* feat: check if raycluster status update in rayjob

* test: e2e test to check the rayjob raycluster status update

* fix: dashboard http client tests discovered and passing (#4173)

Signed-off-by: alimaazamat <[email protected]>

* [RayJob] Lift cluster status while initializing (#4191)

Signed-off-by: Spencer Peterson <[email protected]>

* [RayJob] Remove updateJobStatus call (#4198)

Fast follow to #4191

Signed-off-by: Spencer Peterson <[email protected]>

* Add support for Ray token auth (#4179)

* Add support for Ray token auth

Signed-off-by: Andrew Sy Kim <[email protected]>

* add e2e test for Ray cluster auth

Signed-off-by: Andrew Sy Kim <[email protected]>

* address nits from Ruiean

Signed-off-by: Andrew Sy Kim <[email protected]>

* update RAY_auth_mode -> RAY_AUTH_MODE

Signed-off-by: Andrew Sy Kim <[email protected]>

* configure auth for Ray autoscaler

Signed-off-by: Andrew Sy Kim <[email protected]>

---------

Signed-off-by: Andrew Sy Kim <[email protected]>

* Bump js-yaml from 4.1.0 to 4.1.1 in /dashboard (#4194)

Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1.
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@4.1.0...4.1.1)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 4.1.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* update minimum Ray version required for token authentication to 2.52.0 (#4201)

* update minimum Ray version required for token authentication to 2.52.0

Signed-off-by: Andrew Sy Kim <[email protected]>

* update RayCluster auth e2e test to use Ray v2.52

Signed-off-by: Andrew Sy Kim <[email protected]>

---------

Signed-off-by: Andrew Sy Kim <[email protected]>

* add samples for RayCluster token auth (#4200)

Signed-off-by: Andrew Sy Kim <[email protected]>

* update (#4208)

Signed-off-by: Future-Outlier <[email protected]>

* [RayJob] Add token authentication support for All mode (#4210)

* dashboard client authentication support

Signed-off-by: Future-Outlier <[email protected]>

* support rayjob

Signed-off-by: Future-Outlier <[email protected]>

* update to fix api serverr err

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* updarte

Signed-off-by: Future-Outlier <[email protected]>

* Rayjob sidecar mode auth token mode support

Signed-off-by: Future-Outlier <[email protected]>

* RayJob support k8s job mode

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* Address Andrew's advice

Signed-off-by: Future-Outlier <[email protected]>

* add todo x-ray-authorization comments

Signed-off-by: Future-Outlier <[email protected]>

---------

Signed-off-by: Future-Outlier <[email protected]>

* [RayCluster] Enable Secret informer watch/list and remove unused RBAC verbs (#4202)

* Add authentication secret reconciliation support

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* update

Signed-off-by: Future-Outlier <[email protected]>

* fix flaky test

Signed-off-by: Future-Outlier <[email protected]>

* remove test fix

Signed-off-by: Rueian <[email protected]>

---------

Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Rueian <[email protected]>
Co-authored-by: Rueian <[email protected]>

* [APIServer][Docs] Add user guide for retry behavior & configuration (#4144)

* [Docs] Add the draft description about feature intro, configurations, and usecases

Signed-off-by: justinyeh1995 <[email protected]>

* [Fix] Update the retry walk-through

Signed-off-by: justinyeh1995 <[email protected]>

* [Doc] rewrite the first 2 sections

Signed-off-by: justinyeh1995 <[email protected]>

* [Doc] Revise documentation wording and add Observing Retry Behavior section

Signed-off-by: justinyeh1995 <[email protected]>

* [Fix] fix linting issue by running pre-commit run berfore commiting

Signed-off-by: justinyeh1995 <[email protected]>

* [Fix] fix linting errors in the Markdown linting

Signed-off-by: justinyeh1995 <[email protected]>

* [Fix] Clean up the math equation

Signed-off-by: justinyeh1995 <[email protected]>

* Update the math formula of Backoff calculation.

Co-authored-by: Nary Yeh <[email protected]>
Signed-off-by: JustinYeh <[email protected]>

* [Fix] Explicitly mentioned exponential backoff and removed the customization parts

Signed-off-by: justinyeh1995 <[email protected]>

* [Docs] Clarify naming by replacing “APIServer” with “KubeRay APIServer”

Co-authored-by: Cheng-Yeh Chung <[email protected]>
Signed-off-by: JustinYeh <[email protected]>

* [Docs] Rename retry-configuration.md to retry-behavior.md for accuracy

Signed-off-by: justinyeh1995 <[email protected]>

* Update Title to KubeRay APIServer Retry Behavior

Co-authored-by: Cheng-Yeh Chung <[email protected]>
Signed-off-by: JustinYeh <[email protected]>

* [Docs] Add a note about the limitation of retry configuration

Signed-off-by: justinyeh1995 <[email protected]>

---------

Signed-off-by: justinyeh1995 <[email protected]>
Signed-off-by: JustinYeh <[email protected]>
Co-authored-by: Nary Yeh <[email protected]>
Co-authored-by: Cheng-Yeh Chung <[email protected]>

* Support X-Ray-Authorization fallback header for accepting auth token via proxy (#4213)

* Support X-Ray-Authorization fallback header for accepting auth token in dashboard

Signed-off-by: Future-Outlier <[email protected]>

* remove todo comment

Signed-off-by: Future-Outlier <[email protected]>

---------

Signed-off-by: Future-Outlier <[email protected]>

* [RayCluster] make auth token secret name consistency (#4216)

Signed-off-by: fscnick <[email protected]>

* [RayCluster] Status includes head containter status message (#4196)

* [RayCluster] Status includes head containter status message

Signed-off-by: Spencer Peterson <[email protected]>

* lint

Signed-off-by: Spencer Peterson <[email protected]>

* [RayCluster] Containers not ready status reflects structured reason

Signed-off-by: Spencer Peterson <[email protected]>

* nit

Signed-off-by: Spencer Peterson <[email protected]>

---------

Signed-off-by: Spencer Peterson <[email protected]>

* Remove erroneous  call in applyServeTargetCapacity (#4212)

Signed-off-by: Ryan O'Leary <[email protected]>

* [RayJob] Add token authentication support for light weight job submitter (#4215)

* [RayJob] light weight job submitter auth token support

Signed-off-by: Future-Outlier <[email protected]>

* X-Ray-Authorization

Signed-off-by: Rueian <[email protected]>

---------

Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Rueian <[email protected]>
Co-authored-by: Rueian <[email protected]>

* feat: kubectl ray get token command (#4218)

* feat: kubectl ray get token command

Signed-off-by: Rueian <[email protected]>

* Update kubectl-plugin/pkg/cmd/get/get_token_test.go

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Rueian <[email protected]>

* Update kubectl-plugin/pkg/cmd/get/get_token.go

Co-authored-by: Copilot <[email protected]>
Signed-off-by: Rueian <[email protected]>

* make sure the raycluster exists before getting the secret

Signed-off-by: Rueian <[email protected]>

* better ux

Signed-off-by: Rueian <[email protected]>

* Update kubectl-plugin/pkg/cmd/get/get_token.go

Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]>
Signed-off-by: Rueian <[email protected]>

---------

Signed-off-by: Rueian <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]>

---------

Signed-off-by: 400Ping <[email protected]>
Signed-off-by: Ping <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Ryan O'Leary <[email protected]>
Signed-off-by: Andrew Sy Kim <[email protected]>
Signed-off-by: Kai-Hsun Chen <[email protected]>
Signed-off-by: Neo Chien <[email protected]>
Signed-off-by: justinyeh1995 <[email protected]>
Signed-off-by: win5923 <[email protected]>
Signed-off-by: alimaazamat <[email protected]>
Signed-off-by: Spencer Peterson <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Rueian <[email protected]>
Signed-off-by: JustinYeh <[email protected]>
Signed-off-by: fscnick <[email protected]>
Co-authored-by: Ping <[email protected]>
Co-authored-by: Han-Ju Chen (Future-Outlier) <[email protected]>
Co-authored-by: Ryan O'Leary <[email protected]>
Co-authored-by: Kai-Hsun Chen <[email protected]>
Co-authored-by: Kavish <[email protected]>
Co-authored-by: Neo Chien <[email protected]>
Co-authored-by: cchung100m <[email protected]>
Co-authored-by: JustinYeh <[email protected]>
Co-authored-by: Jun-Hao Wan <[email protected]>
Co-authored-by: Divyam Raj <[email protected]>
Co-authored-by: Nary Yeh <[email protected]>
Co-authored-by: Alima Azamat <[email protected]>
Co-authored-by: Spencer Peterson <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Rueian <[email protected]>
Co-authored-by: Cheng-Yeh Chung <[email protected]>
Co-authored-by: fscnick <[email protected]>
Co-authored-by: Copilot <[email protected]>


Development

Successfully merging this pull request may close these issues.

[Bug] Status.MaxWorkerReplicas overflow when numOfHosts > 1

5 participants