Unable to auto-scale Kubernetes cluster #52

@saffronjam

Hi!

I am unable to auto-scale Kubernetes clusters. As I understand it, a "cluster-autoscaler" deployment is created that decides whether to scale or not. However, it does not seem to work: the pod logs multiple errors and warnings, even though this is a completely clean cluster.

Normal scaling seems to work just fine.

Setup

A "default" CloudStack 4.18 setup running KVM.

Settings (relevant)

  • Cloud kubernetes service enabled true
  • Cloud kubernetes cluster experimental features enabled true
  • Cloud kubernetes cluster max size 50

The nodes use the following service offering:

  • 2 CPU x 2.05 GHz
  • 2048 MB memory
  • 8 GB root disk

Replicate

  1. Create a new cluster using the Kubernetes 1.24 ISO found here:
    http://download.cloudstack.org/cks/

  2. Enable forced auto-scaling
    Since the cluster starts with only one worker node, setting auto-scaling to 3-5 nodes should trigger a scale-up (I assume).
    Screenshot from 2023-08-07 16-55-00

  3. Check the logs for cluster-autoscaler in the Kubernetes cluster
    Some notable entries:

E0807 14:41:30.317148       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope

E0807 14:41:32.388828       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
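To summarise which permissions the autoscaler's service account appears to be missing, the "forbidden" entries can be pulled out of the log with a small script. This is only a sketch: the regular expression is based on the two lines quoted above, and the function name `missing_permissions` is made up for illustration.

```python
import re

# Matches the "forbidden" part of the reflector errors quoted above.
# The pattern is an assumption derived from those two log lines only.
FORBIDDEN = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)"'
)


def missing_permissions(log_text):
    """Return the unique (api_group, resource, verb) tuples denied in the log, sorted."""
    denied = {
        (m["group"], m["resource"], m["verb"])
        for m in FORBIDDEN.finditer(log_text)
    }
    return sorted(denied)


# The two entries from this issue, used as sample input.
sample_log = '''
E0807 14:41:30.317148       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
E0807 14:41:32.388828       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:serviceaccount:kube-system:cluster-autoscaler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
'''

for group, resource, verb in missing_permissions(sample_log):
    print(f"{verb} {resource} ({group})")
```

For the two lines above this reports that `list` is denied on `csidrivers` and `csistoragecapacities` in `storage.k8s.io`. Running it against the full log would show whether other resources are affected as well.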

Even though I have not edited anything myself (just a clean CKS cluster), I get these weird logs:

W0807 14:41:43.251280       1 clusterstate.go:590] Failed to get nodegroup for 6a4c91a3-9694-4596-9ddd-dc86e60136ff: Unable to find node 6a4c91a3-9694-4596-9ddd-dc86e60136ff in cluster

W0807 14:41:43.251361       1 clusterstate.go:590] Failed to get nodegroup for bd0b855f-6dc6-4678-9bea-b52329333024: Unable to find node bd0b855f-6dc6-4678-9bea-b52329333024 in cluster

I0807 14:57:06.667061       1 static_autoscaler.go:341] 2 unregistered nodes present

The IDs are correct in CloudStack
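To make the comparison with CloudStack easier to repeat, the node IDs can be collected from the warnings with a few lines of Python. This is a rough sketch that assumes every warning has the same shape as the two quoted above; `unresolved_node_ids` is a hypothetical helper name, not part of any tool.

```python
import re

# Matches the clusterstate.go warning and captures the 36-character node ID.
# The pattern is an assumption based on the warnings quoted in this issue.
NODEGROUP_WARNING = re.compile(r"Failed to get nodegroup for ([0-9a-f-]{36}):")


def unresolved_node_ids(log_text):
    """Return the node IDs named in nodegroup warnings, in first-seen order, without repeats."""
    seen = []
    for node_id in NODEGROUP_WARNING.findall(log_text):
        if node_id not in seen:
            seen.append(node_id)
    return seen


# The two warnings from this issue, used as sample input.
sample_warnings = """
W0807 14:41:43.251280       1 clusterstate.go:590] Failed to get nodegroup for 6a4c91a3-9694-4596-9ddd-dc86e60136ff: Unable to find node 6a4c91a3-9694-4596-9ddd-dc86e60136ff in cluster
W0807 14:41:43.251361       1 clusterstate.go:590] Failed to get nodegroup for bd0b855f-6dc6-4678-9bea-b52329333024: Unable to find node bd0b855f-6dc6-4678-9bea-b52329333024 in cluster
"""

for node_id in unresolved_node_ids(sample_warnings):
    print(node_id)
```

The printed IDs can then be checked one by one against the instance IDs shown in CloudStack.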

The entire log:
logs-from-cluster-autoscaler-in-cluster-autoscaler-5bf887ddd8-hxg2g.log

Please tell me if you need more logs to look at, or if I should try some other configuration.

Thanks!
