Skip to content

Conversation

@lance5890
Copy link
Contributor

@lance5890 lance5890 commented Nov 4, 2025

Since PR #1367 was merged, we should consider changing the restore pod's readiness probe from tcpSocket: 2380 to httpGet: 9980. This would allow the probe to actually check etcd's readiness status, rather than just verifying that port 2380 is listening

@openshift-ci openshift-ci bot requested review from hasbro17 and tjungblu November 4, 2025 01:12
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 4, 2025

Hi @lance5890. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 4, 2025
@coderabbitai
Copy link

coderabbitai bot commented Nov 4, 2025

Walkthrough

Updated readiness probes in etcd restore pod configurations from TCP socket checks on port 2380 to HTTP GET checks on port 9980 with HTTPS scheme and /readyz endpoint path.

Changes

Cohort / File(s) Summary
Etcd readiness probe updates
bindata/etcd/quorum-restore-pod.yaml, bindata/etcd/restore-pod.yaml
Replaced TCP socket readiness probes (port 2380) with HTTP GET probes (port 9980, path /readyz, HTTPS scheme)

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

  • Both changes are identical and follow the same straightforward pattern
  • Configuration-only modifications with no logic or structural impact
  • Probe mechanism swap is a direct substitution with no side effects
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 18a8bcc and 7d23518.

📒 Files selected for processing (2)
  • bindata/etcd/quorum-restore-pod.yaml (1 hunks)
  • bindata/etcd/restore-pod.yaml (1 hunks)
🔇 Additional comments (2)
bindata/etcd/quorum-restore-pod.yaml (1)

67-70: Architecture improvement: probing readyz sidecar instead of etcd directly is sound.

The change from TCP socket probe on port 2380 to HTTP GET probe on port 9980 correctly delegates readiness checks to the dedicated etcd-readyz sidecar container (lines 87–122). The sidecar runs cluster-etcd-operator readyz and properly exposes port 9980 (line 110), aligning the probe configuration with the container setup.

bindata/etcd/restore-pod.yaml (1)

123-126: Consistent update across restore pod configurations.

Both restore pod manifests (quorum-restore-pod.yaml and restore-pod.yaml) are updated identically. The architecture mirrors quorum-restore-pod.yaml: the etcd-readyz sidecar container (lines 143–177) properly listens on port 9980 (line 156), and the probe configuration aligns with the container setup.

Comment on lines +67 to +70
httpGet:
port: 9980
path: readyz
scheme: HTTPS
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify HTTP probe path format: "readyz" should be "/readyz".

Kubernetes HTTPGet probes expect paths to begin with a forward slash. The path should be specified as "/readyz" rather than "readyz" to comply with the Probe API specification and ensure reliable probe execution.

Proposed fix:

      httpGet:
        port: 9980
-       path: readyz
+       path: /readyz
        scheme: HTTPS

🌐 Web query:

Kubernetes HTTPGet probe path format requirements

💡 Result:

Short answer

  • httpGet.path is a string URL path (defaults to "/"). The kubelet will send an HTTP GET to podIP:port. Use an absolute path (beginning with "/") as a best practice. [1][2]

Details / sources

  • Kubernetes docs: httpGet has fields path (Path to access on the HTTP server) and default is "/". The probe is sent to the pod IP and specified port. [1]
  • API reference / HTTPGetAction: path is a string (Path to access on the HTTP server). Treat it as a normal URL path. [2]

If you want, I can show examples and edge cases (query strings, virtual-hosting via Host header, or HTTPS scheme).

Sources:
[1] Kubernetes docs — Configure Liveness/Readiness/Startup probes.
[2] Kubernetes HTTPGetAction API reference.


</shell_scripts>


🏁 Script executed:

#!/bin/bash
# Check the current state of the file and related probe configurations
echo "=== Checking quorum-restore-pod.yaml ===" 
fd -t f "quorum-restore-pod.yaml" --exec cat {}

echo -e "\n=== Searching for other HTTPGet probes in YAML files ==="
rg -n "httpGet:" --type yaml -A 3 -B 1 bindata/

Length of output: 6185


Fix HTTPGet probe paths to use absolute paths across all affected files.

Kubernetes best practice requires httpGet probe paths to begin with "/" (absolute path). The default value is "/" and the probe is sent to podIP:port. Update the following files:

  • bindata/etcd/quorum-restore-pod.yaml line 69: path: readyzpath: /readyz
  • bindata/etcd/pod.gotpl.yaml line 233: path: readyzpath: /readyz
  • bindata/etcd/pod.gotpl.yaml line 242: path: healthzpath: /healthz
  • bindata/etcd/pod.gotpl.yaml line 253: path: readyzpath: /readyz
  • bindata/etcd/restore-pod.yaml line 125: path: readyzpath: /readyz
🤖 Prompt for AI Agents
In bindata/etcd/quorum-restore-pod.yaml around lines 67-70 (line 69),
bindata/etcd/pod.gotpl.yaml around lines 233, 242, and 253, and
bindata/etcd/restore-pod.yaml around line 125, the httpGet probe path values are
relative (e.g., "readyz", "healthz"); update each to use absolute paths by
prefixing with a slash (e.g., "readyz" -> "/readyz", "healthz" -> "/healthz") so
probes become path: /readyz or path: /healthz respectively. Ensure you update
all listed files and lines to use the absolute path format for Kubernetes
httpGet probes.

Comment on lines +123 to +126
httpGet:
port: 9980
path: readyz
scheme: HTTPS
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

❓ Verification inconclusive

Verify HTTP probe path format: "readyz" should be "/readyz".

Same issue as in the companion file: Kubernetes HTTPGet probes expect paths to begin with a forward slash. Correct the path to "/readyz" for compliance with the Probe API specification.

Proposed fix:

      httpGet:
        port: 9980
-       path: readyz
+       path: /readyz
        scheme: HTTPS

Verify HTTP probe path format: "readyz" should be "/readyz".

Kubernetes HTTPGet probes access a path on the HTTP server, with "/" as the default. The kubelet sends an HTTP request to the specified port and path to perform the check. All official Kubernetes examples use leading slashes in paths, such as /healthz. The current path readyz lacks the required leading slash and should be corrected to /readyz to comply with HTTP URI conventions and Kubernetes best practices.

Proposed fix:

      httpGet:
        port: 9980
-       path: readyz
+       path: /readyz
        scheme: HTTPS
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
httpGet:
port: 9980
path: readyz
scheme: HTTPS
httpGet:
port: 9980
path: /readyz
scheme: HTTPS
🤖 Prompt for AI Agents
In bindata/etcd/restore-pod.yaml around lines 123 to 126, the HTTP probe path is
set to "readyz" without a leading slash; change the path value to "/readyz" so
the HTTPGet probe uses a proper URI path (add the leading "/" to the path
string).

@tjungblu
Copy link
Contributor

/approve

/test ?

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 12, 2025

@tjungblu: The following commands are available to trigger required jobs:

/test e2e-agnostic-ovn
/test e2e-agnostic-ovn-upgrade
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-aws-ovn-single-node
/test e2e-metal-ipi-ovn-ipv6
/test e2e-operator
/test images
/test okd-scos-images
/test unit
/test verify
/test verify-deps

The following commands are available to trigger optional jobs:

/test configmap-scale
/test e2e-aws
/test e2e-aws-cpms
/test e2e-aws-disruptive
/test e2e-aws-disruptive-ovn
/test e2e-aws-etcd-certrotation
/test e2e-aws-etcd-recovery
/test e2e-aws-ovn-etcd-scaling
/test e2e-azure
/test e2e-azure-ovn-etcd-scaling
/test e2e-gcp
/test e2e-gcp-disruptive
/test e2e-gcp-disruptive-ovn
/test e2e-gcp-ovn-etcd-scaling
/test e2e-metal-assisted
/test e2e-metal-ovn-ha-cert-rotation-shutdown
/test e2e-metal-ovn-sno-cert-rotation-shutdown
/test e2e-metal-ovn-two-node-fencing
/test e2e-metal-single-node-live-iso
/test e2e-operator-fips
/test e2e-vsphere-ovn-etcd-scaling
/test okd-scos-e2e-aws-ovn

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-cluster-etcd-operator-main-configmap-scale
pull-ci-openshift-cluster-etcd-operator-main-e2e-agnostic-ovn
pull-ci-openshift-cluster-etcd-operator-main-e2e-agnostic-ovn-upgrade
pull-ci-openshift-cluster-etcd-operator-main-e2e-aws-ovn-serial-1of2
pull-ci-openshift-cluster-etcd-operator-main-e2e-aws-ovn-serial-2of2
pull-ci-openshift-cluster-etcd-operator-main-e2e-aws-ovn-single-node
pull-ci-openshift-cluster-etcd-operator-main-e2e-metal-ipi-ovn-ipv6
pull-ci-openshift-cluster-etcd-operator-main-e2e-operator
pull-ci-openshift-cluster-etcd-operator-main-images
pull-ci-openshift-cluster-etcd-operator-main-okd-scos-e2e-aws-ovn
pull-ci-openshift-cluster-etcd-operator-main-okd-scos-images
pull-ci-openshift-cluster-etcd-operator-main-unit
pull-ci-openshift-cluster-etcd-operator-main-verify
pull-ci-openshift-cluster-etcd-operator-main-verify-deps

In response to this:

/approve

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tjungblu
Copy link
Contributor

/test e2e-aws-etcd-recovery

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 12, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lance5890, tjungblu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2025
@tjungblu
Copy link
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 12, 2025
@tjungblu
Copy link
Contributor

/test e2e-aws-etcd-recovery

@lance5890
Copy link
Contributor Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 17, 2025

@lance5890: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-etcd-recovery 7d23518 link false /test e2e-aws-etcd-recovery

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants