Add wait time to extract artifacts task #650
base: main
Walkthrough
The change introduces retry logic to the "Extract artifacts from release image" task within the Ansible playbook. The task now registers the extraction result, and if it fails, it will automatically retry up to three times with a 60-second delay between attempts. This adjustment is intended to make the extraction process more reliable by handling transient failures.
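For orientation, the retry pattern described in the walkthrough might look roughly like the sketch below. The `register`, `retries`, `delay`, and `until` values are the ones quoted in the review diff in this thread, and the `oc` arguments mirror the command shown in the DCI job log. The use of `ansible.builtin.command` and all four variable names are assumptions made for illustration, not the role's actual code.

```yaml
# Sketch only: module choice and variable names are hypothetical.
- name: Extract artifacts from release image
  ansible.builtin.command:
    argv:
      - "{{ oc_binary_path }}"        # hypothetical: path to the oc binary
      - adm
      - release
      - extract
      - --registry-config
      - "{{ pull_secret_file }}"      # hypothetical: pull secret location
      - --tools
      - --from
      - "{{ release_image }}"         # hypothetical: release image pull spec
      - --to
      - "{{ artifact_cache_dir }}"    # hypothetical: destination directory
  register: result_extract            # keep the result so `until` can inspect it
  retries: 3                          # retry count, as quoted in the review diff
  delay: 60                           # seconds to wait between attempts
  until: result_extract is not failed
```

With this pattern, Ansible re-runs the task until the `until` condition holds or the retries are exhausted, and the task fails if the last attempt still fails.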
Actionable comments posted: 0
🧹 Nitpick comments (1)
roles/sno_installer/tasks/10_get_oc.yml (1)
98-101: Consider parameterizing retries and using an explicit success check
To make these settings more flexible and improve readability, extract retries and delay into role defaults and switch the until to check result_extract.rc == 0. For example:
In defaults/main.yml:
    artifact_extract_retries: 3
    artifact_extract_delay: 60
Proposed diff:
    - register: result_extract
    - retries: 3
    - delay: 60
    - until: result_extract is not failed
    + register: result_extract
    + retries: "{{ artifact_extract_retries }}"
    + delay: "{{ artifact_extract_delay }}"
    + until: result_extract.rc == 0
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
roles/sno_installer/tasks/10_get_oc.yml (1 hunk)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Sanity Check (stable-2.9)
- GitHub Check: Ansible-lint Check
- GitHub Check: Sanity Check (stable-2.17)
🔇 Additional comments (1)
roles/sno_installer/tasks/10_get_oc.yml (1)
98-101: Robust retry mechanism added for artifact extraction
By registering the command result and retrying up to 3 times with a 60‑second delay, transient failures during artifact extraction will be handled automatically, improving reliability.
Build succeeded.
✔️ dci-rpm-build-el8 SUCCESS in 2m 52s
from change #650:
Looks like the retries are not really helping, according to the last DCI job:
task path: /var/lib/dci-openshift-agent/github/ansible-collection-redhatci-ocp-pr650-e3ae71b34cc1f90115f1927ef0b766c3/collections/ansible_collections/redhatci/ocp/roles/sno_installer/tasks/10_get_oc.yml:89
FAILED - RETRYING: Extract artifacts from release image (3 retries left).
FAILED - RETRYING: Extract artifacts from release image (2 retries left).
FAILED - RETRYING: Extract artifacts from release image (1 retries left).
fatal: [server06-sno.partnerci.bos2.lab -> server06-sno.partnerci.bos2.lab]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["/tmp/sno_oc.l0alte68/oc", "adm", "release", "extract", "--registry-config", "/opt/cache/pull-secret.txt", "--tools", "--from", "registry.ci.openshift.org/ocp/release@sha256:a3c9dbafda0babdfdcafcebb083cbc7040a8422b22b9f0d341e9e8a7e939739f", "--to", "/opt/cache/4.18.0-0.nightly-2025-04-15-161728"], "delta": "0:00:00.319923", "end": "2025-04-17 06:08:33.104692", "msg": "non-zero return code", "rc": 1, "start": "2025-04-17 06:08:32.784769", "stderr": "error: image \"registry.ci.openshift.org/ocp/release@sha256:a3c9dbafda0babdfdcafcebb083cbc7040a8422b22b9f0d341e9e8a7e939739f\" not found: manifest unknown: manifest unknown", "stderr_lines": ["error: image \"registry.ci.openshift.org/ocp/release@sha256:a3c9dbafda0babdfdcafcebb083cbc7040a8422b22b9f0d341e9e8a7e939739f\" not found: manifest unknown: manifest unknown"], "stdout": "", "stdout_lines": []}
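The log above shows every attempt failing with the same "manifest unknown" error, which suggests the image is absent from the registry rather than temporarily unreachable, so waiting longer is unlikely to help. One possible way to avoid spending retries on such a failure, offered here only as a sketch and not something proposed in this thread, is to let the `until` condition also end the loop when that error text appears. The task still fails because the return code is non-zero, but it fails after the first attempt. The module choice and the variable names are assumptions.

```yaml
# Sketch only: stop retrying when the registry reports the image is missing.
- name: Extract artifacts from release image
  ansible.builtin.command:
    argv:
      - "{{ oc_binary_path }}"        # hypothetical variable
      - adm
      - release
      - extract
      - --registry-config
      - "{{ pull_secret_file }}"      # hypothetical variable
      - --tools
      - --from
      - "{{ release_image }}"         # hypothetical variable
      - --to
      - "{{ artifact_cache_dir }}"    # hypothetical variable
  register: result_extract
  retries: 3
  delay: 60
  # The loop ends on success, or immediately on the non-transient
  # "manifest unknown" error; other failures are still retried.
  until: >-
    result_extract.rc | default(1) == 0
    or 'manifest unknown' in (result_extract.stderr | default(''))
```

The `default` filters guard against results where `rc` or `stderr` is not set, for example when the module fails before the command runs.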
SUMMARY
This change adds wait time, through retries, to the task that extracts artifacts from the release image.
ISSUE TYPE
Enhanced feature
Tests