@ngopalak-redhat ngopalak-redhat commented Sep 22, 2025

Fixes: https://issues.redhat.com/browse/OCPNODE-3747
Context:
With the release of Kubernetes 1.34, the swap feature in the kubelet is GA. This means that customers can set FailSwapOn to false and SwapBehavior to LimitedSwap. OpenShift 4.21 is currently planned to adopt Kubernetes 1.34, so users of OpenShift 4.21 and later are affected by this upstream release of the swap feature.

- What I did

  1. Ensure that FailSwapOn and MemorySwap.SwapBehavior cannot be set by users of OpenShift.
  2. Ensure that FailSwapOn is set to false on worker nodes and true on all other types of nodes. MemorySwap.SwapBehavior is always set to "NoSwap".
  3. Ensure that the drop-in directory is enabled on the worker node kubelet so that the CNV team can use it to set swapBehavior (this won't be documented for external users).
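
The three points above can be read as a small policy. The following is a minimal sketch of that policy, not the machine-config-operator code: the function names and the plain-dict representation of the kubelet configuration are assumptions made for illustration, and the pool name "worker" is taken from the description above.

```python
# Illustrative sketch only: function names and the dict layout are hypothetical,
# not the machine-config-operator implementation.


def validate_user_kubelet_config(user_config: dict) -> None:
    """Raise ValueError if a user-supplied kubelet config sets a swap field."""
    if "failSwapOn" in user_config:
        raise ValueError("failSwapOn cannot be set by users")
    memory_swap = user_config.get("memorySwap") or {}
    if "swapBehavior" in memory_swap:
        raise ValueError("memorySwap.swapBehavior cannot be set by users")


def rendered_swap_settings(pool: str) -> dict:
    """Return the swap settings described above for a given node pool."""
    return {
        # false only on worker nodes; true on every other node type
        "failSwapOn": pool != "worker",
        # always "NoSwap" in the rendered config; a drop-in may override it
        "memorySwap": {"swapBehavior": "NoSwap"},
    }
```

With this sketch, `rendered_swap_settings("master")` yields `failSwapOn: True`, while a user config containing `memorySwap.swapBehavior` is rejected before rendering.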

- How to verify it

  • Created a cluster in clusterBot with this PR and confirmed that failSwapOn and swapBehavior are set as expected:

  Worker Nodes:
  - All 3 worker nodes: failSwapOn: false, swapBehavior: "NoSwap"

  Control Plane Nodes:
  - All 3 master nodes: failSwapOn: true, swapBehavior: "NoSwap"

  On worker nodes the kubelet does not fail when swap is enabled (failSwapOn: false), while on control plane nodes it does (failSwapOn: true). All nodes have swapBehavior set to "NoSwap".

Testing this also required the following PR: openshift/api#2494 (already merged)

The cluster bot command was:

launch 4.21,openshift/machine-config-operator#5294,openshift/api#2494 gcp

  • Then created a drop-in file:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
memorySwap:
  swapBehavior: LimitedSwap

Checked the output of configz

oc get --raw "/api/v1/nodes/ci-ln-sptb06t-72292-d64jj-worker-a-4zrm5/proxy/configz" | \
jq '.kubeletconfig.memorySwap.swapBehavior'
"LimitedSwap"
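
To check both swap fields at once rather than one jq query at a time, the configz response can be parsed in a small helper. This is a minimal sketch: the function name is hypothetical, the sample payload is hand-written rather than captured from a cluster, and it assumes failSwapOn sits directly under kubeletconfig, next to the memorySwap object queried above.

```python
import json


def swap_settings_from_configz(payload: str) -> dict:
    """Extract the swap-related fields from a kubelet /configz JSON response."""
    kubelet_config = json.loads(payload)["kubeletconfig"]
    memory_swap = kubelet_config.get("memorySwap") or {}
    return {
        "failSwapOn": kubelet_config.get("failSwapOn"),
        "swapBehavior": memory_swap.get("swapBehavior"),
    }


# Hand-written sample payload for illustration, not real cluster output.
sample = '{"kubeletconfig": {"failSwapOn": false, "memorySwap": {"swapBehavior": "LimitedSwap"}}}'
print(swap_settings_from_configz(sample))
# prints {'failSwapOn': False, 'swapBehavior': 'LimitedSwap'}
```

In practice the payload would be the output of the `oc get --raw` command shown above, saved to a file or piped in on stdin.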

Note to reviewer
Unit tests are included. The end-to-end (E2E) tests will be added to the origin repository, as defined by the work items in the Epic: OCPNODE-3646.

I've chosen the origin repository to maintain consistency with other node features (e.g., image volume).

- Description for the changelog

Prevents OpenShift users from enabling swap mode on the kubelet

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Sep 22, 2025
Contributor

openshift-ci bot commented Sep 22, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ngopalak-redhat
Contributor Author

/test all

1 similar comment
@ngopalak-redhat
Contributor Author

/test all

@ngopalak-redhat
Contributor Author

/test unit

@ngopalak-redhat
Contributor Author

/retest-required

1 similar comment
@ngopalak-redhat
Contributor Author

/retest-required

@ngopalak-redhat ngopalak-redhat changed the title from "WIP: TODO:Disable Swap mode" to "OCPNODE-3747: Disable Swap mode in Kubelet and enable drop-in directory" Oct 6, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Oct 6, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Oct 6, 2025

@ngopalak-redhat: This pull request references OCPNODE-3747 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

@ngopalak-redhat
Contributor Author

/verified later @BhargaviGudi

@openshift-ci-robot openshift-ci-robot added the verified-later and verified (Signifies that the PR passed pre-merge verification criteria) labels Oct 6, 2025
@openshift-ci-robot
Contributor

@ngopalak-redhat: This PR has been marked to be verified later by @BhargaviGudi.

@ngopalak-redhat ngopalak-redhat marked this pull request as ready for review October 6, 2025 12:10
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) Oct 6, 2025
@ngopalak-redhat
Contributor Author

@haircommander Can you please review?

@haircommander
Member

the commit history is a bit wonky can you fix it up please?

@openshift-ci-robot openshift-ci-robot removed the verified (Signifies that the PR passed pre-merge verification criteria) and verified-later labels Oct 6, 2025
@ngopalak-redhat
Contributor Author

> the commit history is a bit wonky can you fix it up please?

@haircommander Fixed it. Thanks

@ngopalak-redhat
Contributor Author

/retest-required


@fabiand fabiand left a comment

I can't review the details, but just enabling the dir and rejecting it on the cluster API level looks correct.

🚀

#just-commenting-to-leave-an-emoji

@haircommander
Member

/approve
/lgtm

@haircommander
Member

/retest-required

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Oct 7, 2025
@ngopalak-redhat
Contributor Author

/assign djoshy
As per #5294 (comment)

@ngopalak-redhat
Contributor Author

/verified later @BhargaviGudi

@openshift-ci-robot openshift-ci-robot added the verified-later and verified (Signifies that the PR passed pre-merge verification criteria) labels Oct 7, 2025
@openshift-ci-robot
Contributor

@ngopalak-redhat: This PR has been marked to be verified later by @BhargaviGudi.

@djoshy
Contributor

djoshy commented Oct 7, 2025

/lgtm
/approve

Took a look and it looks sane to me. Will defer to node team's expertise for the swap mode details.

Contributor

openshift-ci bot commented Oct 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: djoshy, haircommander, ngopalak-redhat

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Oct 7, 2025
@djoshy
Contributor

djoshy commented Oct 7, 2025

/test unit

seems like a flake

Contributor

openshift-ci bot commented Oct 7, 2025

@ngopalak-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-op-ocl | fc21115 | link | false | /test e2e-gcp-op-ocl |
| ci/prow/okd-scos-e2e-aws-ovn | fc21115 | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/e2e-aws-mco-disruptive | fc21115 | link | false | /test e2e-aws-mco-disruptive |
| ci/prow/e2e-azure-ovn-upgrade-out-of-change | fc21115 | link | false | /test e2e-azure-ovn-upgrade-out-of-change |
| ci/prow/bootstrap-unit | fc21115 | link | false | /test bootstrap-unit |

@ngopalak-redhat
Contributor Author

/test e2e-gcp-op-1of2

@openshift-merge-bot openshift-merge-bot bot merged commit c6faace into openshift:main Oct 8, 2025
18 of 23 checks passed
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.
verified: Signifies that the PR passed pre-merge verification criteria
verified-later

5 participants