Skip to content

End-to-end testing with Docker and GitHub Actions #240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Jun 26, 2025

Conversation

lukpueh
Copy link
Contributor

@lukpueh lukpueh commented Jun 13, 2025

This PR adds:

  • a build pipeline to run lind-wasm end-to-end tests locally and on GitHub Actions in a Docker container, writing test results do the Docker build log.
  • basic optimizations to speed-up build time, most notably caching and rust compiler optimization.
  • a skip-list for flakey tests to make build results actually meaningful in a given PR, e.g. green check mark vs. red cross mark on Github no longer need to be ignored.

Why use GitHub Action (GHA) instead of Google Cloud Builds (GCB)?
Given the added build time optimization there is no more reason to use GCB (see #235 for details). That said, the GHA integration is only a small part of this PR (see change details below). If preferred, we can easily port the rest of this PR to GCB.

Why Docker?
lind-wasm requires x86-64 as build environment. This means that local builds on non-x86 dev machines (e.g. macs) need to use Docker or something similar. Using the same Dockerfile locally and on CI allows devs to more easily reproduce and troubleshoot CI failures locally. And more importantly, Docker provides caching support out-of-the box locally and on CI.

Why not Bazel?
Bazel was originally chosen for its caching support. So far nobody has managed to enabled it, which seems especially tricky in ephemeral CI runners. With Docker caching enabled, the original motivation becomes less relevant, and I haven't seen any other obvious benefits from using Bazel either. What remains are a steep learning curve and runtime overhead, which puts additional strain on an already heavy build pipeline.

Change Details (by file)

  • e2e.yml
    Minimal GHA workflow, which triggers the Docker build on PR, using docker/build-push-action. To support caching a non-default docker driver is required and provided by docker/setup-buildx-action. (Note: e2e.yml is the GHA counterpart to cloudbuild.yml for GCB)

  • Dockerfile.e2e
    Simplified copy of .devcontainer/Dockerfile to optimize layer caching. Most notably, it

    • introduces multi-stage builds, to optimize caching,
    • copies sources from the build context more strategically, to optimize cache invalidation,
    • side-steps Bazel for a leaner build.
  • make_glibc_and_sysroot.sh
    Merges several existing scripts in order to setup clang for cross-compiling lind programs (see its doc header for details).

  • skip_test_cases.txt
    Skip list for tests, which have failed at least once in a series of (~10) test runs.

Note to reviewers: The commit history in this PR includes a few detours. I suggest to review file by file using above change details for orientation.

Caveat/TODO

  • With caching now enabled builds are significantly faster, and will likely remain within free GitHub build minutes. However, layer cache sizes are massive and will quickly fill up free GitHub cache space. The good news is, there's plenty of room for optimization.

  • For reasons mentioned above, some existing build tools and scripts are ignored (Bazel, Google Cloud Build) or replicated (.devcontainer/Dockerfile and scripts related to building glibc).

  • Using the existing wasmtestreport.py requires some hacks to workaround hard-coded paths and a lack of meaningful exit codes (see ln -s and ! grep in Dockerfile.e2e).

  • Majority of unit tests are skipped

Let's fix all of these in follow-up PRs.

@ci-response-bot
Copy link

Commit ff6aaf4: Build Failed

View Log

@lukpueh
Copy link
Contributor Author

lukpueh commented Jun 13, 2025

ff6aaf4 adds a GHA workflow to build the dockerfile on GitHub. Resulting .dockerbuild archives can be imported and inspected with Docker Desktop.

With cold caches the build takes ~1h. Surprisingly, the glibc build including sysroot generation takes only ~5 minutes, but running wasmtestreport.py takes more than 50 minutes. Locally on my mac, each takes around 20 minutes.

This means that caching (clang, glibc, sysroot) won't help us much, if the tests are the bottleneck. Will look into improving test times ...

@ci-response-bot
Copy link

Commit 8255a7c: Build Failed

View Log

@lukpueh
Copy link
Contributor Author

lukpueh commented Jun 16, 2025

8255a7c adds a substantial speed gain only by compiling wasmtime with --release flag.

wasmtestreport.py run time just dropped from 50m to 10m. 🚀

@ci-response-bot
Copy link

Commit 46a805a: Build Failed

View Log

@ci-response-bot
Copy link

Commit 2876d61: Build Failed

View Log

@lukpueh
Copy link
Contributor Author

lukpueh commented Jun 18, 2025

2876d61 enables caching. This works well out of the box, see e.g. build times for

The cache layer sizes are still massive (~4GB total; quota is 10GB), so we should definitely look into trimming those. But maybe we can do that in a follow-up PR. What do others think? cc @Lind-Project/lind-team

(NOTE: Before I mark his for review, I need to refactor gen_sysroot.sh so that it does not break existing tooling)

@ci-response-bot
Copy link

Commit 6853d25: Build Failed

View Log

@ci-response-bot
Copy link

Commit ae7a587: Build Failed

View Log

@ci-response-bot
Copy link

Commit 8a0fed0: Build Failed

View Log

@ci-response-bot
Copy link

Commit 663bd14: Build Failed

View Log

@ci-response-bot
Copy link

Commit 07515dc: Build Failed

View Log

@ci-response-bot
Copy link

Commit fe4b7f9: Build Failed

View Log

@lukpueh
Copy link
Contributor Author

lukpueh commented Jun 24, 2025

Linter failure is unrelated: #246

End-to-end build passes: https://github.com/Lind-Project/lind-wasm/actions/runs/15845516172/job/44666650047?pr=240

... but tests fail. Apparently, the wasmtime symlink workaround from 3b444d0 did not work.

@ci-response-bot
Copy link

Commit ea142a1: Build Failed

View Log

lukpueh added 12 commits June 25, 2025 10:45
The new Dockerfile is based on .devcontainer/Dockerfile, and simplified
to allow optimzation for layer caching. Most notably, it
- introduces multi-stage builds, and
- copies sources from the build context, when they are used to
  invalidate caches, only when relevant files are updated

The goal is to run the Docker build in a GitHub Action (e.g.
https://github.com/docker/build-push-action), which requires strategic
caching to not exceed allowed build minutes and cache sizes.

**Caveat**
This is an experimental/explorative patch and does not fit in with the
rest of the toolchain. It:
* ignores bazel for a lighter toolchain and to focus on one
  caching mechanism
* moves all glibc-related build instructions into gen_sysroot.sh to
  reduce complexity

Depending on how well this works, we may replace or revert back to
existing tools.
Building wasmtime with `--release` flag makes `wasmtime compile`, which
is the bottleneck in tests, substantially faster (4s instead of 10s
locally).

Signed-off-by: Lukas Puehringer <[email protected]>
TODO: add issue, to enable them one-by-one

Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
Goal is to keep the home directory clean.

Signed-off-by: Lukas Puehringer <[email protected]>
The new gen_sysroot.sh also includes glibc + extra build instructions. This
patch renames it accordingly and moves it to the build script directory,
which seems like a good fit.

This allows us to restore the original src/glibc/gen_sysroot.sh to
unbreak other build tooling, e.g. lindtool.sh.

Signed-off-by: Lukas Puehringer <[email protected]>
lukpueh added 6 commits June 25, 2025 10:45
Signed-off-by: Lukas Puehringer <[email protected]>
This is a quick fix to make the data available on CI. Alternatives:
- upload file e.g. to build artifacts
- modify test runner to give more verbose feedback at runtime

Signed-off-by: Lukas Puehringer <[email protected]>
Make sure debug dir exists before adding a symlink.

Signed-off-by: Lukas Puehringer <[email protected]>
Exempt all tests that failed at least once in a series of test runs.
This should leave us with a set of tests, which are likely to pass.

Signed-off-by: Lukas Puehringer <[email protected]>
CI build should fail, if any of the tests fail.  Given that
wasmtestreport.py always returns 0 (pass) regardless of test failures,
we grep for `number_of_failures: <num>` lines in the test results doc,
and return 1 (fail), if we find a line, where <num> is not zero.

Signed-off-by: Lukas Puehringer <[email protected]>
Signed-off-by: Lukas Puehringer <[email protected]>
@ci-response-bot
Copy link

Commit 046140c: Build Failed

View Log

@lukpueh lukpueh changed the title Add experimental Dockerfile for end-to-end testing (WIP) End-to-end testing with Docker and GitHub Actions Jun 25, 2025
@lukpueh lukpueh marked this pull request as ready for review June 25, 2025 09:10
@lukpueh
Copy link
Contributor Author

lukpueh commented Jun 25, 2025

This PR is ready for review! See

  • updated PR description, and
  • passing CI (relevant build logs can be found in the "Build e2e" step).

cc @Yaxuan-w, @rennergade, @m-hemmings

(google cloud build failure is unrelated, see #209)

Yaxuan-w
Yaxuan-w previously approved these changes Jun 25, 2025
Copy link
Member

@Yaxuan-w Yaxuan-w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall logic lgtm. Only one typo suggestions

@Yaxuan-w Yaxuan-w mentioned this pull request Jun 25, 2025
9 tasks
Clang seems to be tolerant in parsing this

Co-authored-by: Alice Wen <[email protected]>
@ci-response-bot
Copy link

Commit 82e1910: Build Failed

View Log

FROM scratch AS clang
ADD https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.4/clang+llvm-16.0.4-x86_64-linux-gnu-ubuntu-22.04.tar.xz /clang.tar.xz

FROM ubuntu:latest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is using ubuntu:latest a good idea? It feels like this could be brittle and could be better done by pinning a specific build here like ubuntu:22.04 or ubuntu:24.04

# TODO: only install required hard requirements, see best practices
# https://docs.docker.com/build/building/best-practices/#apt-get
RUN apt-get update && \
apt-get install -y -qq \
Copy link
Member

@m-hemmings m-hemmings Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to not do a --no-install-recommends here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a branch with --no-install-recommends and with quite a few deps removed: https://github.com/lukpueh/lind-wasm/blob/273d8ba44f274ae3f5c633f3b7c0849c60ca6d5f/Dockerfile#L13-L32

But I didn't have time yet to confirm that it doesn't break anything.

unzip \
vim \
wget \
zip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could save a little bit of room here if we clean up after the updates with rm -rf /var/lib/apt/lists/*

@@ -0,0 +1,150 @@
#!/bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this file replaces other scripts functionality should those scripts be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some planning. lindtool.sh depends on those scripts, and we depend on lindtool.sh via wasmtestreport.py. I will make a proposal to untangle these local build tools.

Copy link
Contributor

@rennergade rennergade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really love everything you've done here, thank you very much for your efforts.

I still don't really understand the details of how the caching works and would like to, plus it may be better to document this all even more fully now. Would it be possible to also add something in docs/ going into some detail?

@lukpueh
Copy link
Contributor Author

lukpueh commented Jun 26, 2025

Would it be possible to also add something in docs/ going into some detail?

Absolutely! Let me add a ticket.

@lukpueh lukpueh merged commit 85d7711 into Lind-Project:main Jun 26, 2025
2 of 3 checks passed
lukpueh added a commit to lukpueh/lind-wasm that referenced this pull request Jun 27, 2025
lind-wasm builds were moved from Google Cloud to GitHub Action
in Lind-Project#240. This patch removes now obsolete files, related to Google Cloud
Builds.

It also removes a smart Rust crate to run clippy over a subset of sources
only, based on git status. This turned out to be premature optimization
and is also no longer needed (see Lind-Project#220)

Closes Lind-Project#209, Lind-Project#220

Signed-off-by: Lukas Puehringer <[email protected]>
rennergade pushed a commit that referenced this pull request Jul 1, 2025
lind-wasm builds were moved from Google Cloud to GitHub Action
in #240. This patch removes now obsolete files, related to Google Cloud
Builds.

It also removes a smart Rust crate to run clippy over a subset of sources
only, based on git status. This turned out to be premature optimization
and is also no longer needed (see #220)

Closes #209, #220

Signed-off-by: Lukas Puehringer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants