mirror_ocp_release: fixes for concurrent jobs #626

Open
wants to merge 1 commit into
base: main
77 changes: 39 additions & 38 deletions roles/mirror_ocp_release/tasks/artifacts.yml
@@ -7,38 +7,51 @@
when:
- not mor_force

- name: Create temporary working directory
ansible.builtin.tempfile:
state: directory
prefix: mor-
register: _mor_tmp
when: mor_force or not _mor_target.stat.exists

- name: "Extract the OCP installer and metadata"
when:
- mor_force or not _mor_target.stat.exists
block:
- name: "Extract installer and metadata from release image"
ansible.builtin.shell: >
flock -x {{ mor_cache_dir }}/{{ mor_version }}/release_extract.lock -c '
Contributor Author

With this new approach, use of filesystem locks is not needed anymore.

Contributor

Have you considered scenarios where two jobs run concurrently, both extracting to a temporary location and writing to mor_cache_dir? How do you prevent conflicts in such cases? Implementing a mechanism like a lock might help avoid these issues.

Contributor Author

That's a fair point. I was accepting the risk of such scenarios, given that moving files around the filesystem is much faster than running the operations directly in mor_cache_dir.

But it's true that we could limit the protected zone to the task where we copy the files to the cache directory, once they have been processed in the temporary directory.

Contributor Author

Here's another thought. With this implementation, the problematic task is the one where we run the copy module to move the files from the temporary directory to the cache directory. As an alternative to the lock, or in combination with it, we can add the parameter "force: false" to the module call, so Ansible won't replace a file that already exists, even if the contents are different.
The question is whether it's safe to assume the artifacts won't change between jobs deploying the same OCP release.
My guess is that the files won't change if the jobs are running concurrently, but would those artifacts change between jobs running, say, days apart?
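
For illustration only, here is a rough shell sketch of the "don't replace a file that already exists" behaviour described above. It is not the copy module itself: `copy_if_absent` and the demo paths are hypothetical names made up for this example and are not part of the role.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the role): copy a file into a directory
# only if no file with the same name exists there yet. An existing file is
# left untouched, even when its contents differ.
copy_if_absent() {
  local src="$1" dest_dir="$2"
  local dest
  dest="$dest_dir/$(basename "$src")"
  if [ -e "$dest" ]; then
    echo "kept existing $dest"
  else
    cp -p "$src" "$dest"
    echo "copied $dest"
  fi
}

# Demo with throwaway directories standing in for the temporary and cache dirs.
work="$(mktemp -d)"
mkdir -p "$work/tmp" "$work/cache"
echo "new" > "$work/tmp/rhcos.json"
echo "old" > "$work/cache/rhcos.json"
result="$(copy_if_absent "$work/tmp/rhcos.json" "$work/cache")"
```

Note that the existence check and the copy are two separate steps, so on its own this does not stop two concurrent jobs from both seeing the file as absent; that is why it may still need to be combined with a lock.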

Contributor Author

Unfortunately, I can't find a way of implementing locks that would allow us to run Ansible tasks in the locked zone.

In other words, when using locks the lock code must be part of the same shell script.
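
A minimal sketch of what "lock code in the same shell script" can look like, assuming `flock` from util-linux is available on the host. The function name `publish_to_cache`, the lock file name `.publish.lock`, and the demo directories are hypothetical and only serve the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of the role): publish files from a private
# staging directory into a shared cache directory under an exclusive lock.
publish_to_cache() {
  local staging_dir="$1" cache_dir="$2"
  mkdir -p "$cache_dir"
  (
    # Block until this process holds the exclusive lock on descriptor 9.
    flock -x 9
    # Everything in this subshell runs inside the locked section.
    cp -a "$staging_dir"/. "$cache_dir"/
  ) 9>"$cache_dir/.publish.lock"
  # The lock is released when the subshell exits and descriptor 9 is closed.
}

# Demo: each job would extract into its own temporary directory, then publish.
staging="$(mktemp -d)"
cache="$(mktemp -d)"
echo "payload" > "$staging/installer"
publish_to_cache "$staging" "$cache"
rm -rf "$staging"
```

The slow work (extraction) happens outside the lock in the per-job staging directory, and only the short copy step is serialized, which matches the idea of limiting the protected zone to the copy.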

set -e;
- name: "Extract installer from release image"
ansible.builtin.command: >
{{ mor_oc }} adm release extract
--registry-config {{ mor_auths_file }}
--command={{ mor_installer }}
--from {{ mor_pull_url }}
--to "{{ mor_cache_dir }}/{{ mor_version }}";
--to "{{ _mor_tmp.path }}"
register: _mor_extract_res
retries: 9
delay: 10
until: _mor_extract_res is not failed
changed_when: _mor_extract_res.rc == 0

- name: "Extract metadata from release image"
ansible.builtin.command: >
{{ mor_oc }} adm release extract
--registry-config {{ mor_auths_file }}
--tools
--from {{ mor_pull_url }}
--to "{{ mor_cache_dir }}/{{ mor_version }}"'
--to "{{ _mor_tmp.path }}"
register: _mor_extract_res
retries: 9
delay: 10
until: _mor_extract_res is not failed
changed_when: false
changed_when: _mor_extract_res.rc == 0

- name: "Extract rhcos.json if version >= 4.8"
when:
- mor_version is version("4.8", ">=")
ansible.builtin.shell: >
flock -x "{{ mor_cache_dir }}/{{ mor_version }}/release_extract.lock" -c '{
"{{ mor_cache_dir }}/{{ mor_version }}/{{ mor_installer }}" coreos print-stream-json >
"{{ mor_cache_dir }}/{{ mor_version }}/rhcos.json";
}'
"{{ _mor_tmp.path }}/{{ mor_installer }}" coreos print-stream-json >
"{{ _mor_tmp.path }}/rhcos.json"
args:
creates: "{{ _mor_tmp.path }}/rhcos.json"

- name: "Download rhcos.json (< 4.8)"
when:
@@ -47,7 +60,7 @@
- name: "Get Git SHA from installer"
ansible.builtin.shell: >
set -e -o pipefail;
{{ mor_cache_dir }}/{{ mor_version }}/openshift-baremetal-install version |
{{ _mor_tmp.path }}/openshift-baremetal-install version |
grep "^built from" |
awk '{ print $NF }'
register: _mor_commit_id
@@ -57,42 +70,30 @@
ansible.builtin.include_tasks: fetch.yml
vars:
mor_uri: "https://raw.githubusercontent.com/openshift/installer/{{ _mor_commit_id.stdout }}/data/data/rhcos.json"
mor_dir: "{{ mor_cache_dir }}/{{ mor_version }}"
mor_dir: "{{ _mor_tmp.path }}"

- name: "Figure out status of SELinux"
ansible.builtin.command: /usr/sbin/selinuxenabled
ignore_errors: true
register: _mor_selinux_status
changed_when: false

- name: Apply SELinux container file context to extracted files
ansible.builtin.sefcontext:
target: "{{ mor_cache_dir }}/{{ mor_version }}"
setype: container_file_t
become: true
register: _mor_cache_secontext
retries: 3
delay: 5
until: _mor_cache_secontext is not failed
when:
- _mor_selinux_status.rc == 0

- name: "Make installer command readable from HTTP"
ansible.builtin.file:
path: "{{ mor_cache_dir }}/{{ mor_version }}/{{ mor_installer }}"
state: file
owner: "{{ mor_owner }}"
group: "{{ mor_group }}"
mode: "0755"
setype: "httpd_sys_content_t"
register: _mor_install_mode
- name: Copy artifacts with access policies to release directory
ansible.builtin.shell: |
flock -x {{ mor_cache_dir }}/{{ mor_version }}/f.lock -c '
set -e;
rsync -avz {{ _mor_tmp.path }}/ {{ mor_cache_dir }}/{{ mor_version }}/;{% if _mor_selinux_status.rc == 0 %}
chcon -R -t container_file_t {{ mor_cache_dir }}/{{ mor_version }};
chcon -t httpd_sys_content_t {{ mor_cache_dir }}/{{ mor_version }}/{{ mor_installer }};{% endif %}
chmod 755 {{ mor_cache_dir }}/{{ mor_version }}/{{ mor_installer }};
'
register: _mor_cache_copy
retries: 3
delay: 5
until: _mor_install_mode is not failed

until: _mor_cache_copy is not failed
always:
- name: "Ensure lock file is removed"
- name: Remove temporary directory
ansible.builtin.file:
path: "{{ mor_cache_dir }}/{{ mor_version }}/release_extract.lock"
path: "{{ _mor_tmp.path }}"
state: absent
...
32 changes: 20 additions & 12 deletions roles/mirror_ocp_release/tasks/facts.yml
@@ -13,19 +13,23 @@
mor_release_image: "{{ _mor_release_content.content | b64decode | regex_search('(?<=^Pull From: )(.*)$', multiline=true) }}"

- name: "Read the contents of rhcos.json"
ansible.builtin.command: "cat {{ mor_cache_dir }}/{{ mor_version }}/rhcos.json"
register: rhcos
ansible.builtin.slurp:
src: "{{ mor_cache_dir }}/{{ mor_version }}/rhcos.json"
register: _mor_rhcos
no_log: true
retries: 6
delay: 10
until: rhcos is not failed
until: _mor_rhcos is not failed

- name: "Set image facts"
ansible.builtin.set_fact:
ocp_release_data:
mor_ocp_release_data:
container_image: "{{ mor_release_image }}"
rhcos_version: "{{ rhcos.stdout | from_json | json_query('architectures.x86_64.artifacts.metal.release') }}"
rhcos_images: "{{ ocp_release_data['rhcos_images'] | default({}) | combine({item.key: (rhcos.stdout | from_json | json_query('architectures.x86_64.artifacts.' + item.path))}) }}"
rhcos_version: "{{ rhcos_json | json_query('architectures.x86_64.artifacts.metal.release') }}"
rhcos_images: "{{ mor_ocp_release_data['rhcos_images'] | default({}) | combine({item.key: (rhcos_json | json_query(arch_query))}) }}"
vars:
rhcos_json: "{{ _mor_rhcos.content | b64decode | from_json }}"
arch_query: "architectures.x86_64.artifacts.{{ item.path }}"
with_items:
- {'key': 'aws_location', 'path': 'aws.formats."vmdk.gz".disk.location'}
- {'key': 'aws_sha256', 'path': 'aws.formats."vmdk.gz".disk.sha256'}
@@ -60,13 +64,15 @@
# TODO: Remove this task when 4.7 is no longer supported
- name: "Set image facts (< 4.8)"
vars:
rhcos_ver: "{{ rhcos.stdout | from_json | json_query('buildid') }}"
rhcos_json: "{{ _mor_rhcos.content | b64decode | from_json }}"
rhcos_ver: "{{ rhcos_json | json_query('buildid') }}"
base_uri: "https://rhcos.mirror.openshift.com/art/storage/releases/rhcos-{{ mor_base_version }}/{{ rhcos_ver }}/x86_64/"
add_item: "{{ {item.key: (item.baseURI | default('')) + (rhcos_json | json_query('images.' + item.path))} }}"
ansible.builtin.set_fact:
ocp_release_data:
mor_ocp_release_data:
container_image: "{{ mor_release_image }}"
rhcos_version: "{{ rhcos_ver }}"
rhcos_images: "{{ ocp_release_data['rhcos_images'] | default({}) | combine({item.key: (item.baseURI | default('')) + (rhcos.stdout | from_json | json_query('images.' + item.path))}) }}"
rhcos_images: "{{ mor_ocp_release_data['rhcos_images'] | default({}) | combine(add_item) }}"
with_items:
- {'key': 'aws_location', 'baseURI': "{{ base_uri }}", 'path': 'aws.path'}
- {'key': 'aws_sha256', 'path': 'aws.sha256'}
@@ -103,9 +109,11 @@
# TODO: remove for releases >= 4.8
- name: "Set facts for *osimage URL overrides"
ansible.builtin.set_fact:
bootstraposimage: "{{ mor_webserver_url }}/{{ ocp_release_data['rhcos_images']['qemu_location'] | basename }}?sha256={{ ocp_release_data['rhcos_images']['qemu_uncompressed_sha256'] }}"
clusterosimage: "{{ mor_webserver_url }}/{{ ocp_release_data['rhcos_images']['openstack_location'] | basename }}?sha256={{ ocp_release_data['rhcos_images']['openstack_sha256'] }}"
metalosimage: "{{ mor_webserver_url }}/{{ ocp_release_data['rhcos_images']['metal_iso_location'] | basename }}?sha256={{ ocp_release_data['rhcos_images']['metal_iso_sha256'] }}"
bootstraposimage: "{{ mor_webserver_url }}/{{ rhcos_images['qemu_location'] | basename }}?sha256={{ rhcos_images['qemu_uncompressed_sha256'] }}"
clusterosimage: "{{ mor_webserver_url }}/{{ rhcos_images['openstack_location'] | basename }}?sha256={{ rhcos_images['openstack_sha256'] }}"
metalosimage: "{{ mor_webserver_url }}/{{ rhcos_images['metal_iso_location'] | basename }}?sha256={{ rhcos_images['metal_iso_sha256'] }}"
vars:
rhcos_images: "{{ mor_ocp_release_data['rhcos_images'] }}"
when:
- mor_write_custom_config | bool
...
14 changes: 4 additions & 10 deletions roles/mirror_ocp_release/tasks/fetch.yml
@@ -1,12 +1,4 @@
---
- name: "Check if target file exists"
ansible.builtin.stat:
path: "{{ mor_dir }}/{{ mor_uri | basename }}"
get_checksum: false
register: target
when:
- not mor_force
Contributor Author

Since we're extracting the artifacts into a temporary directory, the file won't exist in advance.


- name: "Fetch file from URL"
ansible.builtin.get_url:
url: "{{ mor_uri }}"
@@ -20,8 +12,8 @@
become: true
retries: 3
delay: 10
register: downloaded
until: downloaded is not failed
register: _mor_downloaded
until: _mor_downloaded is not failed
when:
- mor_force or not target.stat.exists

@@ -35,4 +27,6 @@
ansible.builtin.command: /usr/sbin/restorecon -R "{{ mor_dir }}/{{ mor_uri | basename }}"
become: true
when: _mor_selinux.rc == 0
Contributor Author

Restoring the SELinux context does not make sense when extracting the artifacts into a temporary directory.
Also, the first tasks in artifacts.yml after including fetch.yml override the context and set it to container_file_t, which should remain valid even after moving the artifacts to the target directory served by the cache container.
A separate question is whether these tasks should run before or after copying the artifacts to the target directory.

Contributor Author

We have to restore this block of code, since fetch.yml is also included from images.yml to pull the disk image directly into the cache store (ignoring the version directory), so that it is served directly by the cache container.

register: _mor_restorecon
changed_when: _mor_restorecon.rc == 0
...
4 changes: 2 additions & 2 deletions roles/mirror_ocp_release/tasks/files.yml
@@ -13,6 +13,6 @@
become: true
retries: 10
delay: 20
register: downloaded
until: downloaded is not failed
register: _mor_downloaded
until: _mor_downloaded is not failed
...
4 changes: 2 additions & 2 deletions roles/mirror_ocp_release/tasks/images.yml
@@ -2,8 +2,8 @@
- name: "Mirror Disk Images for the install type"
ansible.builtin.include_tasks: fetch.yml
vars:
mor_uri: "{{ ocp_release_data['rhcos_images'][item + '_location'] }}"
mor_checksum: "sha256:{{ ocp_release_data['rhcos_images'][item + '_sha256'] }}"
mor_uri: "{{ mor_ocp_release_data['rhcos_images'][item + '_location'] }}"
mor_checksum: "sha256:{{ mor_ocp_release_data['rhcos_images'][item + '_sha256'] }}"
mor_dir: "{{ mor_cache_dir }}"
loop: "{{ mor_images }}"
...
4 changes: 2 additions & 2 deletions roles/mirror_ocp_release/tasks/main.yml
@@ -12,15 +12,15 @@
ansible.builtin.stat:
path: "{{ mor_auths_file }}"
get_checksum: false
register: mor_auths_file_check
register: _mor_auths_file_check
when:
- mor_auths_file is defined

- name: "Validate optional authentication file"
ansible.builtin.assert:
that:
- mor_auths_file is defined
- mor_auths_file_check.stat.exists | bool
- _mor_auths_file_check.stat.exists | bool
when:
- mor_mirror_container_images | bool

25 changes: 11 additions & 14 deletions roles/mirror_ocp_release/tasks/registry.yml
@@ -3,25 +3,22 @@
ansible.builtin.command: >
skopeo inspect
--no-tags
--authfile {{ mor_auths_file }}
{%- if mor_allow_insecure_registry | bool %}
--tls-verify=false
{%- endif %}
--authfile {{ mor_auths_file }}{%- if mor_allow_insecure_registry | bool %}
--tls-verify=false{%- endif %}
docker://{{ mor_registry_url }}/{{ mor_registry_path }}:{{ mor_version }}
register: _mor_release_image
changed_when: _mor_release_image.rc == 0
failed_when: false
when: not mor_force | bool

- name: Mirror release images to local registry
ansible.builtin.command: >-
{{ mor_oc }} adm release mirror
--registry-config={{ mor_auths_file }}
--from={{ ocp_release_data['container_image'] | quote }}
--from={{ mor_ocp_release_data['container_image'] | quote }}
--to-release-image={{ mor_registry_url }}/{{ mor_registry_path }}:{{ mor_version }}
--to={{ mor_registry_url }}/{{ mor_registry_path }}
{%- if mor_allow_insecure_registry | bool %}
--insecure
{%- endif %}
--to={{ mor_registry_url }}/{{ mor_registry_path }}{%- if mor_allow_insecure_registry | bool %}
--insecure{%- endif %}
retries: 3
delay: 10
register: _mor_result
@@ -37,13 +34,13 @@
ansible.builtin.command: >-
{{ mor_oc }} adm release mirror
--registry-config={{ mor_auths_file }}
--from={{ ocp_release_data['container_image'] | quote }}
--from={{ mor_ocp_release_data['container_image'] | quote }}
--to-release-image={{ mor_registry_url }}/{{ mor_registry_path }}:{{ mor_version }}
--to={{ mor_registry_url }}/{{ mor_registry_path }}
{{ mor_version is version("4.14", "<") | ternary("--dry-run", "--print-mirror-instructions=" + mor_is_type | lower ) }}
{%- if mor_allow_insecure_registry | bool %}
--insecure
{%- endif %}
{{ mor_version is version("4.14", "<") | ternary("--dry-run", mirror_instructions) }}{%- if mor_allow_insecure_registry | bool %}
--insecure{%- endif %}
vars:
mirror_instructions: "--print-mirror-instructions={{ mor_is_type | lower }}"
retries: 3
delay: 10
register: _mor_result