Skip to content

feat!(kubernetes): add fine grained pod scheduling configurations with nodeSelector, tolerations, and affinity #6658

Merged
theosanderson merged 4 commits into
loculus-project:mainfrom
florianzwagemaker:k8s_nodepoolselector
Jun 12, 2026
Merged

feat!(kubernetes): add fine grained pod scheduling configurations with nodeSelector, tolerations, and affinity #6658
theosanderson merged 4 commits into
loculus-project:mainfrom
florianzwagemaker:k8s_nodepoolselector

Conversation

@florianzwagemaker

@florianzwagemaker florianzwagemaker commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

As briefly discussed on Slack.
This PR adds the podScheduling root-level configuration property in the helm chart with three new child properties for kubernetes pod scheduling control;

  • nodeSelector (object): maps the Loculus pod deployment directly to a Kubernetes node with a defined label for placement
  • tolerations (array): allows pods to bypass specific node taints
  • affinity (object): allows for the addition of fine grained affinity rules such as nodeAffinity or podAffinity as well as conditional scheduling.

Additionally moved the podPriorityValue key from a root property to be grouped underneath the podScheduling key.

Example

This assumes a cluster is already created either in kubernetes or locally through k3s.

  1. get the node name with kubectl get nodes
  2. label the node where you wish to deploy the loculus pods with kubectl label nodes my-nodepool workload-type=my-custom-label
  3. taint the node to ensure only the pods with toleration rules are allowed on this particular node with kubectl taint nodes my-nodepool workload-type=my-custom-label:NoSchedule
  4. Define your node pool configuration in the helm chart like the following:
podScheduling:
  nodeSelector: {}
  tolerations:
    - key: "workload-type"
      operator: "Equal"
      value: "my-custom-label"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "workload-type"
                operator: "In"
                values:
                  - "my-custom-label"

when deploying, all Loculus pods will now schedule on the node named my-nodepool because it matches the assigned label my-custom-label in the affinity rules. The tolerations rule states that the Loculus pods are allowed on this label while any other pod without this tolerations rule won't be allowed on this node during scheduling.

If a more direct mapping is desired then this can be achieved the yaml example below, but this will not allow for more fine-grained control which is probably desired in production-like kubernetes clusters

podScheduling:
  nodeSelector:
    workload-type: my-custom-label

Breaking changes

Need to move podPriorityValue if set, to be under podScheduling

🚀 Preview: Add preview label to enable

Copilot AI review requested due to automatic review settings June 11, 2026 10:09

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds configurable Kubernetes pod scheduling support to the Loculus Helm chart so operators can control placement via node selectors, tolerations, and affinity.

Changes:

  • Introduces nodeSelector, tolerations, and affinity values (with schema entries) for chart-wide pod scheduling.
  • Adds a shared Helm template helper (loculus.podScheduling) and wires it into multiple workloads’ pod specs.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
kubernetes/loculus/values.yaml Adds default/suggested pod scheduling values and a brief comment.
kubernetes/loculus/values.schema.json Defines schema/validation for the new scheduling values.
kubernetes/loculus/templates/_pod-scheduling.tpl Adds reusable helper to render nodeSelector/tolerations/affinity blocks.
kubernetes/loculus/templates/taxonomy-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/silo-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/minio-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/loculus-website.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/loculus-preprocessing-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/loculus-database-standin.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/loculus-backend.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/lapis-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/keycloak-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/keycloak-database-standin.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/ena-submission-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/docs-preview.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/autoapprove-deployment.yaml Injects shared pod scheduling block into pod spec.
kubernetes/loculus/templates/_ingest-pod-spec.tpl Injects shared pod scheduling block into ingest job pod spec.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread kubernetes/loculus/values.schema.json Outdated
Comment thread kubernetes/loculus/values.schema.json Outdated
Comment thread kubernetes/loculus/values.schema.json Outdated
Comment thread kubernetes/loculus/values.schema.json Outdated
Comment thread kubernetes/loculus/values.yaml
…ed under general podScheduling key including podPriorityValue
@theosanderson

Copy link
Copy Markdown
Member

@codex review

@theosanderson theosanderson changed the title feat(kubernetes): add fine grained pod scheduling configurations with nodeSelector, tolerations, and affinity feat!(kubernetes): add fine grained pod scheduling configurations with nodeSelector, tolerations, and affinity Jun 11, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3ae7eb04d8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread kubernetes/loculus/templates/ena-submission-deployment.yaml
Comment thread kubernetes/loculus/templates/_possiblePriorityClassName.tpl
@theosanderson

Copy link
Copy Markdown
Member

@claude review

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Claude finished @theosanderson's task in 2m 32s —— View job


Code Review

  • Gather context and understand the PR
  • Read changed files
  • Check for missing pod scheduling in CronJobs/Jobs
  • Review schema validation
  • Review template helper and all deployment templates
  • Post comprehensive review

Overall this is a clean, well-structured implementation. The helper template approach avoids repetition nicely, and the grouping under podScheduling is the right call. There are two confirmed bugs where pods will be Pending or misscheduled when tolerations/nodeSelector/affinity are configured, plus a few minor issues.

Critical: Missing loculus.podScheduling in two pod specs

Both of these already have possiblePriorityClassName but are missing the new scheduling block — so if the cluster uses node taints, these pods will fail to schedule.

1. ena-submission-deployment.yaml — the loculus-get-ena-submission-list-cronjob CronJob

# Line 136 — inside the CronJob's jobTemplate.spec.template.spec
spec:
  {{- include "possiblePriorityClassName" . | nindent 10 }}
  # ← missing: {{- include "loculus.podScheduling" . | nindent 10 }}
  restartPolicy: Never

Fix this →

2. ingest.yaml — the loculus-ingest-trigger-{{ $key }} PostSync Job

# Line 144 — inside the PostSync Job's spec.template.spec
spec:
  {{- include "possiblePriorityClassName" $ | nindent 6 }}
  # ← missing: {{- include "loculus.podScheduling" $ | nindent 6 }}
  restartPolicy: Never

Note: this uses $ (root context) not ., so the fix should use $ to stay consistent.

Fix this →


Breaking change: podPriorityValue migration

The PR notes this is a known breaking change, and the schema correctly removes the root podPriorityValue property. Existing values files with podPriorityValue: 1000 at the root level will silently stop working — no PriorityClass is created and no error is thrown, because the root schema doesn't enforce additionalProperties: false. If any production deployments currently set this, they'll silently lose pod priority on upgrade. Worth calling out explicitly in release notes / the chart's CHANGELOG or NOTES.txt.


Minor: Non-idiomatic YAML indentation in _pod-scheduling.tpl

The template uses a slightly unusual pattern:

nodeSelector:
{{ toYaml .Values.podScheduling.nodeSelector | indent 2 }}
{{- end }}

The standard Helm idiom (which avoids a potential extra blank line when the input YAML has a trailing newline) is:

nodeSelector:
  {{- toYaml .Values.podScheduling.nodeSelector | nindent 2 }}

In practice the current version renders correct YAML, but the {{- toYaml ... | nindent N }} form is what you'll find in the Helm docs and most official charts. Low priority, but worth knowing.


Looks good ✓

  • Schema structure for podScheduling with additionalProperties: false is correct — good for catching typos.
  • Default values ({} / []) are correct; the if guards in the template are falsy for empty maps/lists so nothing is rendered by default.
  • All Deployments, StatefulSets, and the shared _ingest-pod-spec.tpl (used by the scheduled ingest CronJobs) are correctly wired up.
  • priorityclass.yaml and _possiblePriorityClassName.tpl are correctly updated to use podScheduling.podPriorityValue.

…plates & fixed the yaml indentation in the `_pod-scheduling.tpl` file
@florianzwagemaker

Copy link
Copy Markdown
Contributor Author

Included the podScheduling block to the last two pod specs as suggested, all pods should now schedule correctly.
Let me know if any other changes are wanted or necessary :)

@theosanderson theosanderson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and CI passes in #6665, thanks

@theosanderson theosanderson merged commit 093d039 into loculus-project:main Jun 12, 2026
55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants