test: add compatibility versions feature gate test as a nightly prow periodic job #34257

Merged
55 changes: 53 additions & 2 deletions config/jobs/kubernetes/sig-testing/compatibility-versions-e2e.yaml
@@ -6,7 +6,7 @@ periodics:
     testgrid-dashboards: sig-testing-kind
     testgrid-tab-name: compatibility-version-test-n-minus-1
     description: Uses kind to run e2e tests from the n-1 kubernetes release against a latest kubernetes master components w/ --emulated-version=n-1 set.
-    # TODO(aaron-prindle) route the alert email to a rotation vs individual email
+    # TODO(#34269) route the alert email to a rotation vs individual email and update owners in experiment/compatibility-versions
     testgrid-alert-email: [email protected]
     testgrid-num-columns-recent: '6'
   labels:
@@ -64,7 +64,7 @@ periodics:
     testgrid-dashboards: sig-testing-kind
     testgrid-tab-name: compatibility-version-test-n-minus-2
     description: Uses kind to run e2e tests from the n-2 kubernetes release against a latest kubernetes master components w/ --emulated-version=n-2 set.
-    # TODO(aaron-prindle) route the alert email to a rotation vs individual email
+    # TODO(#34269) route the alert email to a rotation vs individual email and update owners in experiment/compatibility-versions
     testgrid-alert-email: [email protected]
     testgrid-num-columns-recent: '6'
   labels:
@@ -115,3 +115,54 @@ periodics:
           # this is mostly for building kubernetes
           memory: 9Gi
           cpu: 7
+- interval: 6h
+  cluster: k8s-infra-prow-build
+  name: ci-kubernetes-e2e-kind-compatibility-versions-feature-gate-test
+  annotations:
+    testgrid-dashboards: sig-testing-kind
+    testgrid-tab-name: compatibility-versions-feature-gate-test
+    description: Uses kind to run bespoke feature gate tests from the n-1 kubernetes release yaml files against a latest kubernetes master components w/ --emulated-version=n-1 set.
+    # TODO(#34269) route the alert email to a rotation vs individual email and update owners in experiment/compatibility-versions
+    testgrid-alert-email: [email protected]
+    testgrid-num-failures-to-alert: "2"
+    testgrid-num-columns-recent: '6'
+  labels:
+    preset-dind-enabled: "true"
+    preset-kind-volume-mounts: "true"
+  decorate: true
+  decoration_config:
+    timeout: 60m
+  extra_refs:
+  - org: kubernetes
+    repo: kubernetes
+    base_ref: master
+    path_alias: k8s.io/kubernetes
+    workdir: true
+  - org: kubernetes
+    repo: test-infra
+    base_ref: master
+    path_alias: k8s.io/test-infra
+  spec:
+    containers:
+    - image: gcr.io/k8s-staging-test-infra/krte:v20241230-3006692a6f-master
+      imagePullPolicy: Always # pull latest image for canary testing
+      command:
+      - wrapper.sh
+      - bash
+      - -c
+      - curl -sSL https://kind.sigs.k8s.io/dl/latest/linux-amd64.tgz | tar xvfz - -C "${PATH%%:*}/" && ./../test-infra/experiment/compatibility-versions/compatibility-versions-feature-gate-test.sh
Member

xref: #33980

Contributor Author

@aaron-prindle, Feb 5, 2025

Apologies, is the recommendation here related to #33594?

From reading over #33980, I'm a bit confused about what the actionable change to the script here would be. Can you elaborate a bit on how I might use the precompiled binaries here? Thanks!

Member

Sorry -- I'm just trying to track these for later, it's not a blocker.

Basically, you run kind build node-image with one of the CI builds of Kubernetes instead of building from source.
We will also have to make sure the commit from that build gets recorded to testgrid (I forget the specifics of how that works). If and when we convert one of the scripts, we can apply the same change in multiple places.

But it's not a blocker.

Contributor Author

Ah, makes sense, thanks. I'll track this work through #33594 and have added a reference there to #33980. I'll work on a separate PR converting all of the compatibility-versions pieces to use prebuilt binaries when I tackle #33594.
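
For readers following the thread, here is a rough sketch of what "use a CI build instead of building from source" could look like. It is not taken from this PR. It assumes a kind release that supports `kind build node-image --type url`, and it assumes CI server tarballs are published under `https://dl.k8s.io/ci/<version>/` with the newest version string at `https://dl.k8s.io/ci/latest.txt`. The helper names `ci_server_tarball_url` and `build_from_ci` are hypothetical.

```shell
# Hypothetical helper: build the download URL for a CI server tarball.
# The URL layout is an assumption, not something this PR defines.
ci_server_tarball_url() {
  local version="$1"
  local arch="${2:-amd64}"
  printf 'https://dl.k8s.io/ci/%s/kubernetes-server-linux-%s.tar.gz\n' "${version}" "${arch}"
}

# Hypothetical replacement for the script's build() step: look up the latest
# CI version and hand the tarball URL to kind instead of compiling locally.
# Defined only; call it explicitly, since it needs network access and docker.
build_from_ci() {
  local version
  version="$(curl -sSL https://dl.k8s.io/ci/latest.txt)"
  echo "Building node image from CI build ${version}"
  kind build node-image --type url "$(ci_server_tarball_url "${version}")"
}
```

As noted in the thread, the commit behind the chosen CI build would still need to be recorded for testgrid; this sketch does not attempt that.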

+      env:
+      - name: RUNTIME_CONFIG
+        value: '{"api/beta":"true", "api/ga":"true"}'
+      # we need privileged mode in order to do docker in docker
+      securityContext:
+        privileged: true
+      resources:
+        limits:
+          memory: 14Gi
+          cpu: 7
+        requests:
+          # these are both a bit below peak usage during build
+          # this is mostly for building kubernetes
+          memory: 14Gi
+          cpu: 7
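
The job's command unpacks the kind tarball into `"${PATH%%:*}/"`, which is the first directory on `PATH`. A minimal sketch of that parameter expansion follows; the function name `first_path_dir` and the sample value are illustrative only.

```shell
# "${var%%:*}" removes the longest suffix that starts with ":",
# leaving only the text before the first colon.
first_path_dir() {
  local path_value="$1"
  printf '%s\n' "${path_value%%:*}"
}

first_path_dir "/usr/local/bin:/usr/bin:/bin"
# prints /usr/local/bin
```

Extracting into the first `PATH` entry means the freshly downloaded `kind` binary takes precedence over any other copy later on `PATH`.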
1 change: 1 addition & 0 deletions experiment/compatibility-versions/OWNERS
@@ -1,5 +1,6 @@
 # See the OWNERS docs at https://go.k8s.io/owners

+# TODO(#34269) update owners in experiment/compatibility-versions to a group/rotation and route the alert email to a rotation vs individual email
 reviewers:
 - aaron-prindle
 approvers:
@@ -0,0 +1,259 @@
#!/usr/bin/env bash
# Copyright 2025 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# hack script for running kind clusters, fetching kube-apiserver metrics, and validating feature gates
# must be run with a kubernetes checkout in $PWD (IE from the checkout)
# Usage: compatibility-versions-feature-gate-test.sh

set -o errexit -o nounset -o pipefail
set -o xtrace

# Settings:
# GA_ONLY: true - limit to GA APIs/features as much as possible
# false - (default) APIs and features left at defaults

# FEATURE_GATES:
# JSON or YAML encoding of a string/bool map: {"FeatureGateA": true, "FeatureGateB": false}
# Enables or disables feature gates in the entire cluster.
# Cannot be used when GA_ONLY=true.

# RUNTIME_CONFIG:
# JSON or YAML encoding of a string/string (!) map: {"apia.example.com/v1alpha1": "true", "apib.example.com/v1beta1": "false"}
# Enables API groups in the apiserver via --runtime-config.
# Cannot be used when GA_ONLY=true.

# cleanup logic for cleanup on exit
CLEANED_UP=false
cleanup() {
if [ "$CLEANED_UP" = "true" ]; then
return
fi
# KIND_CREATE_ATTEMPTED is true once we: kind create
if [ "${KIND_CREATE_ATTEMPTED:-}" = true ]; then
kind "export" logs "${ARTIFACTS}" || true
kind delete cluster || true
fi
rm -f _output/bin/kubectl || true
# remove our tempdir, this needs to be last, or it will prevent kind delete
if [ -n "${TMP_DIR:-}" ]; then
rm -rf "${TMP_DIR:?}"
fi
CLEANED_UP=true
}

# setup signal handlers
# shellcheck disable=SC2317 # this is not unreachable code
signal_handler() {
cleanup
}
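# also trap EXIT so cleanup (log export, cluster deletion) still runs when
# errexit or an explicit "exit 1" ends the script; cleanup is idempotent
trap signal_handler INT TERM EXIT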

# build kubernetes / node image, kubectl binary
build() {
# build the node image w/ kubernetes
kind build node-image -v 1
# make sure we have kubectl
make all WHAT="cmd/kubectl"

# Ensure the built kubectl is used instead of system
export PATH="${PWD}/_output/bin:$PATH"
}

check_structured_log_support() {
case "${KUBE_VERSION}" in
v1.1[0-8].*)
echo "$1 is only supported on versions >= v1.19, got ${KUBE_VERSION}"
exit 1
;;
esac
}

# up a cluster with kind
create_cluster() {
# Grab the version of the cluster we're about to start
KUBE_VERSION="$(docker run --rm --entrypoint=cat "kindest/node:latest" /kind/version)"

# Default Log level for all components in test clusters
KIND_CLUSTER_LOG_LEVEL=${KIND_CLUSTER_LOG_LEVEL:-4}

EMULATED_VERSION=${EMULATED_VERSION:-}

# potentially enable --logging-format
CLUSTER_LOG_FORMAT=${CLUSTER_LOG_FORMAT:-}
scheduler_extra_args=" \"v\": \"${KIND_CLUSTER_LOG_LEVEL}\""
controllerManager_extra_args=" \"v\": \"${KIND_CLUSTER_LOG_LEVEL}\""
apiServer_extra_args=" \"v\": \"${KIND_CLUSTER_LOG_LEVEL}\""
kubelet_extra_args=" \"v\": \"${KIND_CLUSTER_LOG_LEVEL}\""

if [ -n "$CLUSTER_LOG_FORMAT" ]; then
check_structured_log_support "CLUSTER_LOG_FORMAT"
scheduler_extra_args="${scheduler_extra_args}
\"logging-format\": \"${CLUSTER_LOG_FORMAT}\""
controllerManager_extra_args="${controllerManager_extra_args}
\"logging-format\": \"${CLUSTER_LOG_FORMAT}\""
apiServer_extra_args="${apiServer_extra_args}
\"logging-format\": \"${CLUSTER_LOG_FORMAT}\""
fi

KUBELET_LOG_FORMAT=${KUBELET_LOG_FORMAT:-$CLUSTER_LOG_FORMAT}
if [ -n "$KUBELET_LOG_FORMAT" ]; then
check_structured_log_support "KUBELET_LOG_FORMAT"
kubelet_extra_args="${kubelet_extra_args}
\"logging-format\": \"${KUBELET_LOG_FORMAT}\""
fi

# JSON or YAML map injected into featureGates config
feature_gates="${FEATURE_GATES:-{\}}"
# --runtime-config argument value passed to the API server, again as a map
runtime_config="${RUNTIME_CONFIG:-{\}}"

case "${GA_ONLY:-false}" in
false)
:
;;
true)
if [ "${feature_gates}" != "{}" ]; then
echo "GA_ONLY=true and FEATURE_GATES=${feature_gates} are mutually exclusive."
exit 1
fi
if [ "${runtime_config}" != "{}" ]; then
echo "GA_ONLY=true and RUNTIME_CONFIG=${runtime_config} are mutually exclusive."
exit 1
fi

echo "Limiting to GA APIs and features for ${KUBE_VERSION}"
feature_gates='{"AllAlpha":false,"AllBeta":false}'
runtime_config='{"api/alpha":"false", "api/beta":"false"}'
;;
*)
echo "\$GA_ONLY set to '${GA_ONLY}'; supported values are true and false (default)"
exit 1
;;
esac

# create the config file
cat <<EOF > "${ARTIFACTS}/kind-config.yaml"
# config for a single control-plane node (no worker nodes are defined below)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
ipFamily: ${IP_FAMILY:-ipv4}
kubeProxyMode: ${KUBE_PROXY_MODE:-iptables}
# don't pass through host search paths
# TODO: possibly a reasonable default in the future for kind ...
dnsSearch: []
nodes:
- role: control-plane
featureGates: ${feature_gates}
runtimeConfig: ${runtime_config}
kubeadmConfigPatches:
- |
kind: ClusterConfiguration
Member

We're also going to have to start targeting these to specific versions and add a kubeadm v1beta4 patch soon: kubernetes-sigs/kind#3847

Going to be a headache, at least trying to track the places in-project where we're running kind @ HEAD

metadata:
name: config
apiServer:
extraArgs:
${apiServer_extra_args}
"emulated-version": "${EMULATED_VERSION}"
controllerManager:
extraArgs:
${controllerManager_extra_args}
"emulated-version": "${EMULATED_VERSION}"
scheduler:
extraArgs:
${scheduler_extra_args}
"emulated-version": "${EMULATED_VERSION}"
---
kind: InitConfiguration
nodeRegistration:
kubeletExtraArgs:
${kubelet_extra_args}
---
kind: JoinConfiguration
nodeRegistration:
kubeletExtraArgs:
${kubelet_extra_args}
EOF

KIND_CREATE_ATTEMPTED=true
kind create cluster \
--image=kindest/node:latest \
--retain \
--wait=1m \
-v=3 \
"--config=${ARTIFACTS}/kind-config.yaml"

# debug cluster version
kubectl version

# Patch kube-proxy to set the verbosity level
kubectl patch -n kube-system daemonset/kube-proxy \
--type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/command/-", "value": "--v='"${KIND_CLUSTER_LOG_LEVEL}"'" }]'
}

fetch_metrics() {
local output_file="$1"
echo "Fetching metrics to ${output_file}..."
kubectl get --raw /metrics > "${output_file}"
}


main() {
TMP_DIR=$(mktemp -d)
export ARTIFACTS="${ARTIFACTS:-${PWD}/_artifacts}"
mkdir -p "${ARTIFACTS}"

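# assign first, then export, so a failure in get_latest_release_version is not masked by "export"
EMULATED_VERSION=$(get_latest_release_version)
export EMULATED_VERSION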
export PREV_VERSIONED_FEATURE_LIST=${PREV_VERSIONED_FEATURE_LIST:-"release-${EMULATED_VERSION}/test/featuregates_linter/test_data/versioned_feature_list.yaml"}
export UNVERSIONED_FEATURE_LIST=${UNVERSIONED_FEATURE_LIST:-"release-${EMULATED_VERSION}/test/featuregates_linter/test_data/unversioned_feature_list.yaml"}

# Create and validate previous cluster
git clone --filter=blob:none --single-branch --branch "release-${EMULATED_VERSION}" https://github.com/kubernetes/kubernetes.git "release-${EMULATED_VERSION}"

# Build current version
build

# Create and validate latest cluster
KUBECONFIG="${HOME}/.kube/kind-test-config-latest"
export KUBECONFIG
create_cluster
LATEST_METRICS="${ARTIFACTS}/latest_metrics.txt"
fetch_metrics "${LATEST_METRICS}"
LATEST_RESULTS="${ARTIFACTS}/latest_results.txt"

VALIDATE_SCRIPT="${VALIDATE_SCRIPT:-${PWD}/../test-infra/experiment/compatibility-versions/validate-compatibility-versions-feature-gates.sh}"
"${VALIDATE_SCRIPT}" "${EMULATED_VERSION}" "${LATEST_METRICS}" "${PREV_VERSIONED_FEATURE_LIST}" "${UNVERSIONED_FEATURE_LIST}" "${LATEST_RESULTS}"

# Report results
echo "=== Latest Cluster (${EMULATED_VERSION}) Validation ==="
cat "${LATEST_RESULTS}"

if grep -q "FAIL" "${LATEST_RESULTS}"; then
echo "Validation failures detected"
exit 1
fi

cleanup
}

get_latest_release_version() {
git ls-remote --heads https://github.com/kubernetes/kubernetes.git | \
grep -o 'release-[0-9]\+\.[0-9]\+' | \
sort -t. -k1,1n -k2,2n | \
tail -n1 | \
cut -d- -f2
}

main
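
The `get_latest_release_version` pipeline above is easier to check offline against canned input. This sketch applies the same steps to text on stdin instead of calling `git ls-remote`. The function name `latest_release_from_refs` and the sample refs are made up for illustration, and it uses `grep -E` in place of the script's `\+` syntax.

```shell
# Pick the newest "release-X.Y" branch from `git ls-remote --heads` style
# input on stdin and print just "X.Y".
latest_release_from_refs() {
  grep -oE 'release-[0-9]+\.[0-9]+' |
    sort -t. -k1,1n -k2,2n |
    tail -n1 |
    cut -d- -f2
}

# Made-up sample refs; 1.10 must sort after 1.9, so a plain text sort is not enough.
printf '%s\n' \
  'aaa1111 refs/heads/master' \
  'bbb2222 refs/heads/release-1.9' \
  'ccc3333 refs/heads/release-1.32' \
  'ddd4444 refs/heads/release-1.10' |
  latest_release_from_refs
# prints 1.32
```

Note that the first sort key is the text `release-1`, which does not compare numerically, so the ordering effectively comes from the minor version. That holds as long as every release branch shares the same major version.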
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
# Copyright 2024 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
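
The test script passes the saved `/metrics` dump to a separate validation script whose body is not shown in this view. As a rough idea of how feature gate state can be read from such a dump, the sketch below assumes the apiserver exposes a `kubernetes_feature_enabled{name="...",stage="..."}` gauge with value 0 or 1. The function name `enabled_feature_gates` and the sample data are hypothetical, and this is not the PR's validation logic.

```shell
# Print the names of feature gates reported as enabled (value 1) in a
# Prometheus text-format metrics file, one per line, sorted.
# Assumes lines of the form: kubernetes_feature_enabled{name="X",stage="Y"} 1
enabled_feature_gates() {
  local metrics_file="$1"
  sed -n 's/^kubernetes_feature_enabled{name="\([^"]*\)",stage="[^"]*"} 1$/\1/p' \
    "${metrics_file}" | sort
}

# Example with made-up metrics content written to a temp file.
sample="$(mktemp)"
printf '%s\n' \
  '# TYPE kubernetes_feature_enabled gauge' \
  'kubernetes_feature_enabled{name="WatchList",stage="BETA"} 1' \
  'kubernetes_feature_enabled{name="AllAlpha",stage="ALPHA"} 0' \
  'kubernetes_feature_enabled{name="APIResponseCompression",stage="BETA"} 1' \
  > "${sample}"
enabled_feature_gates "${sample}"
rm -f "${sample}"
```

A validator could compare this list against the gates the emulated release's feature list expects to be on by default.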