-
Notifications
You must be signed in to change notification settings - Fork 42
mirror_ocp_release: fixes for concurrent jobs #626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,4 @@ | ||
--- | ||
- name: "Check if target file exists" | ||
ansible.builtin.stat: | ||
path: "{{ mor_dir }}/{{ mor_uri | basename }}" | ||
get_checksum: false | ||
register: target | ||
when: | ||
- not mor_force | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since we're extracting the artifacts on a temporary directory, the file won't exist in advance. |
||
|
||
- name: "Fetch file from URL" | ||
ansible.builtin.get_url: | ||
url: "{{ mor_uri }}" | ||
|
@@ -20,8 +12,8 @@ | |
become: true | ||
retries: 3 | ||
delay: 10 | ||
register: downloaded | ||
until: downloaded is not failed | ||
register: _mor_downloaded | ||
until: _mor_downloaded is not failed | ||
when: | ||
- mor_force or not target.stat.exists | ||
|
||
|
@@ -35,4 +27,6 @@ | |
ansible.builtin.command: /usr/sbin/restorecon -R "{{ mor_dir }}/{{ mor_uri | basename }}" | ||
become: true | ||
when: _mor_selinux.rc == 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Restoring the selinux context does not make sense when extracting the artifacts on a temporary directory. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have to restore this block of code, since fetch.yml is also included from images.yml to pull the disk image directly into the cache store (version directory ignored) so then it's directly served by the cache container. |
||
register: _mor_restorecon | ||
changed_when: _mor_restorecon.rc == 0 | ||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this new approach, use of filesystem locks is not needed anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered scenarios where two jobs run concurrently, both extracting to a temporary location and writing to
mor_cache_dir
? How do you prevent conflicts in such cases? Implementing a mechanism like a lock might help avoid these issues.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a fair point. I was accepting the risk of facing such scenarios provided moving files around the filesystem is must faster than running the operations directly on the mor_cache_dir.
But at this point it's right what we could just limit the protected zone to the task were we copy the files to the cache directory once they have been processed in the temporary directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here comes another thought. The way this implementation works, the problematic task would be when we run the copy module to move the files from the temporary directory to the cache directory. Alternatively or in combination of the lock usage, we can add the parameter "force: false" to the module call, so ansible won't replace a file that already exists, even if the contents are different.
The question here is whether it'd be safe to assume the artifact won't change between jobs deploying the same OCP release.
My guess is the files won't change if the jobs are running concurrently, but would those artifacts change between jobs running with, say, days of difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I don't find a way of implementing locks that would allows us to run ansible tasks in the locked zone.
In other words, when using locks the lock code must be part of the same shell script.