docker: Enable cache for Docker builds #3821

Merged: 3 commits from docker-build-cache into OSGeo:main on Feb 14, 2025

Conversation

@echoix (Member) commented Jun 14, 2024

Currently, none of the docker images use caching. Two of the four docker builds compile PDAL from source, which by itself takes 8.3 minutes each time, so 2×8.3 minutes could be saved for that step alone.

However, for a release we would usually want a clean build from scratch, so that if a caching issue were ever present, it wouldn't affect the released image. But since the docker builds triggered for releasebranch_8_* are published for each commit (backport) (e.g. docker pull osgeo/grass-gis:releasebranch_8_4-ubuntu_wxgui), as is every commit to main, we are effectively already creating "release"-level docker builds every time. A question to weigh in on is whether caching could be used for these or not.

Another question I have is whether the RUN instruction that installs packages (apk add on Alpine, apt-get install on Debian-based distros) should have its cache invalidated at some point or not. Until a layer above the install step invalidates the cache (by changing the base image digest, the PDAL version, or the text of the install instruction itself), that layer might never be invalidated, since only COPY and ADD check whether the contents of the layer actually changed (https://docs.docker.com/build/cache/invalidation/#general-rules). Otherwise, as the third point of https://docs.docker.com/build/cache/invalidation/#run-instructions suggests, we might need a build stage dedicated to installing the dependencies in order to use the --no-cache-filter option. But the question then is: when do we decide to run that step without cache? This becomes equivalent to just having periodic no-cache builds to pick up new packages.
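
A rough sketch of how such a dedicated install stage could be targeted from the workflow (hypothetical, not what this PR implements; the stage name `install_deps` and the tag are placeholders, and `no-cache-filters` is the build-push-action input that maps to buildx's `--no-cache-filter`):

```yaml
# Hypothetical: assumes the Dockerfile defines a stage for the package-install
# step, e.g. "FROM ubuntu:22.04 AS install_deps" followed by apt-get install.
- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: osgeo/grass-gis:main-ubuntu  # placeholder tag
    # Rebuild only the package-install stage without cache (e.g. on scheduled runs)
    no-cache-filters: install_deps
```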

To implement the no-cache builds easily, but only sometimes, the Docker workflows should be able to run on workflow_dispatch (manual runs) or scheduled runs, where the no-cache option (available in the build-push-action) could be set. Allowing runs on workflow_dispatch would also make it possible to launch manual runs for a specific ref (branch), as long as it's in the same repo (which means unmerged PRs could only be launched from the fork, but at least they could run). But this is for another PR, if ever needed.
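
A minimal sketch of what those triggers could look like (the `no_cache` input name and the cron schedule are placeholders, not part of this PR):

```yaml
name: Docker (sketch)
on:
  workflow_dispatch:
    inputs:
      no_cache:
        description: "Build without using the Docker layer cache"
        type: boolean
        default: false
  schedule:
    - cron: "0 4 * * 1"  # placeholder: e.g. weekly

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          context: .
          # Skip the cache on scheduled runs, or when requested in a manual run
          no-cache: ${{ github.event_name == 'schedule' || inputs.no_cache == true }}
```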

The reason for choosing the GitHub Actions cache over the inline, local, or registry cache types (https://docs.docker.com/build/cache/backends/#backends) is that, at least with this one, we can manually delete caches when wanted or needed. My second choice would be the registry type with mode=max, but without the ability to intervene when a caching issue occurs.
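
For reference, the two candidates translate to roughly these build-push-action inputs (a sketch; the registry cache ref is a placeholder):

```yaml
- uses: docker/build-push-action@v6
  with:
    context: .
    # GitHub Actions cache backend (the option chosen here); note that type=gha
    # needs the docker-container driver set up by docker/setup-buildx-action.
    cache-from: type=gha
    cache-to: type=gha,mode=max
    # The registry cache alternative (hypothetical ref) would instead look like:
    # cache-from: type=registry,ref=osgeo/grass-gis:buildcache
    # cache-to: type=registry,ref=osgeo/grass-gis:buildcache,mode=max
```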

It is currently a draft, as I still need to validate the size of the cache that would be uploaded, to make sure we don't exceed the 10 GB limit and cause cache thrashing.

@echoix echoix marked this pull request as draft June 14, 2024 19:59
@echoix (Member, Author) commented Jun 14, 2024

@mmacata I'd like your thoughts on this, especially given your experience with multiple local builds using the same existing cache for the install step (layer). Does it invalidate often enough? I'm leaving this as a draft, but it would be ready anytime if the discussion here doesn't call for more changes.

@github-actions github-actions bot added the CI Continuous integration label Jun 14, 2024
@landam landam added this to the 8.5.0 milestone Jun 16, 2024

@mmacata (Contributor) commented Jan 28, 2025

@echoix thank you for being so patient with me 😇

The cache itself is a very good idea!

I have had very good experiences with the local build cache: it invalidates exactly when it should and is very useful for saving resources and build time.

I don't have experience with the Docker cache for GitHub Actions. I would expect the cache to be available to the workflow, although I'm not 100% sure whether it needs to be downloaded every time.

There is another, not so obvious, option for using a cache: a separate dependency Docker image. We use this for actinia. It is a separate Dockerfile with a scheduled build, containing everything that does not depend on the source code (i.e. everything except the build step and the steps after it). For GRASS GIS it could be scheduled e.g. once a week or every two weeks. The GRASS GIS compilation then takes place on top of this image, triggered as it is now. A new dependency image could be detected by renovate (this part is not working for actinia, but it is not so crucial there, as we manually update once in a while). A rough sketch is shown below.
The advantage is that it is always clear what is in the base image; it doesn't change and therefore cannot unintentionally break things. The dependency image has a scheduled build, the commit builds run on demand with "cache", and they update when the base image is updated. Also, the 10 GB limit is nothing to think about.
The disadvantage would be that multiple of these dependency images are needed, which increases the complexity even more, plus additional merge requests from renovate that need to be maintained.
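
A rough sketch of what such a scheduled dependency-image build could look like (the Dockerfile path, image tag, and secret names are hypothetical, not something that exists in this repo):

```yaml
# Hypothetical scheduled workflow building a dependency-only image.
name: Build dependency image
on:
  schedule:
    - cron: "0 3 * * 1"  # placeholder: e.g. once a week
  workflow_dispatch:

jobs:
  deps-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          # Assumed secret names for the Docker Hub account
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: docker/ubuntu/dependencies.dockerfile  # hypothetical file
          push: true
          tags: osgeo/grass-gis-deps:ubuntu-latest     # hypothetical tag
```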

I'm not pushing for this as the better solution, I just wanted to suggest it as it came to mind.

I am also fine with the Docker GitHub Actions cache, I just cannot say much about it.

@echoix (Member, Author) commented Jan 28, 2025

Dependency images as a base image could make sense if multiple different variations could use it, or if it were used by different projects, at the "cost" that a single user wanting to build their own image would not have that image available. In the case where different OSes and dependency versions, or even different C libraries, are used/linked, I don't think it's as useful.

Using properly defined build stages (in a single Dockerfile), with proper caching, would allow rebuilding only what is needed, while letting local builds work without any extra dependencies (and in the future it could be done with past revisions too, except for installing repo packages).

Overall, the GitHub Actions cache might be a bit "worse", but it would allow anyone with permissions on the repo (like me) to solve issues, rather than relying on access to the Docker account and thus having a bus factor of 1. We have recently been, and continue to be, impacted by relying on a part of the GRASS project maintained by only one member, and by having to wait until that member is available. It's open source: we don't have deadlines, and we can't pressure others either :)

@echoix echoix marked this pull request as ready for review January 30, 2025 00:28
@mmacata (Contributor) commented Feb 5, 2025

The separate dependency Docker images would of course be open as well (in this repository) and pushed via a workflow to Docker Hub, so there wouldn't be more dependencies/restrictions than now? A base image is needed, yes, but that is already the case now.
But as I wrote above, I have no strong opinion about either solution, so from my point of view, please go ahead (:

@echoix (Member, Author) commented Feb 14, 2025

Can someone approve this PR so we can move forward?

I think we could give this PR a try (it dates back to the community meeting in Prague in 2024), and it still looks correct. I plan to take a look at the docker workflow this weekend, and I can adjust based on the results after a couple of runs.

@nilason (Contributor) left a comment

Let’s try it out

@echoix echoix merged commit 1062c2f into OSGeo:main Feb 14, 2025
28 checks passed
@echoix echoix deleted the docker-build-cache branch February 14, 2025 22:19
@echoix (Member, Author) commented Feb 15, 2025

OK, I'm clearly not satisfied with this: it doesn't do much, and we pay the overhead of downloading and uploading the cache, so it's worse than before.

@wenzeslaus (Member) commented

Doesn't that actually fail the builds on main or is that the other PR?

@echoix (Member, Author) commented Feb 15, 2025

No, it doesn't make them fail, see #5117 (comment)
