End-to-end testing with Docker and GitHub Actions #240

lukpueh · 2025-06-13T09:15:13Z

This PR adds:

- a build pipeline to run lind-wasm end-to-end tests locally and on GitHub Actions in a Docker container, writing test results to the Docker build log.
- basic optimizations to speed up build time, most notably caching and Rust compiler optimization.
- a skip-list for flakey tests to make build results actually meaningful in a given PR, i.e. the green check mark vs. red cross mark on GitHub no longer needs to be ignored.

Why use GitHub Actions (GHA) instead of Google Cloud Build (GCB)?
Given the added build time optimizations, there is no longer a reason to use GCB (see #235 for details). That said, the GHA integration is only a small part of this PR (see change details below). If preferred, we can easily port the rest of this PR to GCB.

Why Docker?
lind-wasm requires x86-64 as its build environment. This means that local builds on non-x86 dev machines (e.g. Macs) need to use Docker or something similar. Using the same Dockerfile locally and on CI allows devs to more easily reproduce and troubleshoot CI failures locally. More importantly, Docker provides caching support out of the box, both locally and on CI.
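
For local runs, the build can be started with a plain docker build. A minimal sketch, assuming the repository root is the build context and the Dockerfile lives at scripts/Dockerfile.e2e; the wrapper name run_e2e_build is hypothetical, and the options used in e2e.yml may differ:

```shell
#!/bin/bash
# Hypothetical wrapper around the local end-to-end build.
# $1 is the repository root (defaults to the current directory).
run_e2e_build() {
    local context="${1:-.}"
    # --platform requests an x86-64 build environment on non-x86 hosts.
    # --progress plain keeps the full output of every build step in the log.
    docker build \
        --platform linux/amd64 \
        --progress plain \
        --file "$context/scripts/Dockerfile.e2e" \
        "$context"
}
```

Calling `run_e2e_build` from the repository root should then run the same build as CI. With `--progress plain`, the output of each step, including the test results, stays visible in the build log.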

Why not Bazel?
Bazel was originally chosen for its caching support. So far nobody has managed to enable it, which seems especially tricky in ephemeral CI runners. With Docker caching enabled, the original motivation becomes less relevant, and I haven't seen any other obvious benefits from using Bazel either. What remains are a steep learning curve and runtime overhead, which put additional strain on an already heavy build pipeline.

Change Details (by file)

e2e.yml
Minimal GHA workflow, which triggers the Docker build on PR, using docker/build-push-action. To support caching, a non-default Docker driver is required, which is provided by docker/setup-buildx-action. (Note: e2e.yml is the GHA counterpart to cloudbuild.yml for GCB.)
Dockerfile.e2e
Simplified copy of .devcontainer/Dockerfile to optimize layer caching. Most notably, it
- introduces multi-stage builds, to optimize caching,
- copies sources from the build context more strategically, to optimize cache invalidation,
- side-steps Bazel for a leaner build.
make_glibc_and_sysroot.sh
Merges several existing scripts in order to set up clang for cross-compiling lind programs (see its doc header for details).
skip_test_cases.txt
Skip list for tests that have failed at least once in a series of (~10) test runs.
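
How wasmtestreport.py consumes the skip list is not shown in this thread. As a rough sketch of the idea, assuming skip_test_cases.txt holds one test path per line, a list of candidate tests could be filtered like this (the function name and the example paths are hypothetical):

```shell
#!/bin/bash
# Hypothetical helper: read candidate test paths from stdin and print
# those that are not listed in the skip list file given as $1.
filter_skipped() {
    local skip_list="$1"
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the patterns from the skip list file.
    # grep exits 1 when every candidate is skipped; that is not an error.
    grep -Fxv -f "$skip_list" || [ $? -eq 1 ]
}
```

For example, `printf '%s\n' tests/a.c tests/b.c | filter_skipped skip_test_cases.txt` prints only the paths that do not appear in the skip list.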

Note to reviewers: The commit history in this PR includes a few detours. I suggest reviewing file by file, using the change details above for orientation.

Caveat/TODO

- With caching now enabled, builds are significantly faster and will likely remain within free GitHub build minutes. However, layer cache sizes are massive and will quickly fill up free GitHub cache space. The good news is, there's plenty of room for optimization.
- For reasons mentioned above, some existing build tools and scripts are ignored (Bazel, Google Cloud Build) or replicated (.devcontainer/Dockerfile and scripts related to building glibc).
- Using the existing wasmtestreport.py requires some hacks to work around hard-coded paths and a lack of meaningful exit codes (see ln -s and ! grep in Dockerfile.e2e).
- The majority of unit tests are skipped.

Let's fix all of these in follow-up PRs.

ci-response-bot · 2025-06-13T09:20:16Z

Commit ff6aaf4: Build Failed

View Log

lukpueh · 2025-06-13T10:01:25Z

ff6aaf4 adds a GHA workflow to build the Dockerfile on GitHub. The resulting .dockerbuild archives can be imported and inspected with Docker Desktop.

With cold caches the build takes ~1h. Surprisingly, the glibc build including sysroot generation takes only ~5 minutes, but running wasmtestreport.py takes more than 50 minutes. Locally on my Mac, each takes around 20 minutes.

This means that caching (clang, glibc, sysroot) won't help us much if the tests are the bottleneck. Will look into improving test times ...

ci-response-bot · 2025-06-16T11:32:50Z

Commit 8255a7c: Build Failed

View Log

lukpueh · 2025-06-16T12:14:57Z

8255a7c achieves a substantial speed gain just by compiling wasmtime with the --release flag.

wasmtestreport.py run time just dropped from 50m to 10m. 🚀

ci-response-bot · 2025-06-16T19:12:03Z

Commit 46a805a: Build Failed

View Log

ci-response-bot · 2025-06-18T08:50:57Z

Commit 2876d61: Build Failed

View Log

lukpueh · 2025-06-18T09:02:02Z

2876d61 enables caching. This works well out of the box, see e.g. build times for

- no build-relevant change: 29s
- "unit test" change: 5m 25s

The cache layer sizes are still massive (~4GB total; quota is 10GB), so we should definitely look into trimming those. But maybe we can do that in a follow-up PR. What do others think? cc @Lind-Project/lind-team

(NOTE: Before I mark this for review, I need to refactor gen_sysroot.sh so that it does not break existing tooling.)

ci-response-bot · 2025-06-23T11:29:13Z

Commit 6853d25: Build Failed

View Log

ci-response-bot · 2025-06-23T11:59:40Z

Commit ae7a587: Build Failed

View Log

ci-response-bot · 2025-06-23T15:54:59Z

Commit 8a0fed0: Build Failed

View Log

ci-response-bot · 2025-06-23T16:03:08Z

Commit 663bd14: Build Failed

View Log

ci-response-bot · 2025-06-23T18:06:13Z

Commit 07515dc: Build Failed

View Log

ci-response-bot · 2025-06-24T08:40:16Z

Commit fe4b7f9: Build Failed

View Log

lukpueh · 2025-06-24T12:38:12Z

Linter failure is unrelated: #246

End-to-end build passes: https://github.com/Lind-Project/lind-wasm/actions/runs/15845516172/job/44666650047?pr=240

... but tests fail. Apparently, the wasmtime symlink workaround from 3b444d0 did not work.
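
For context, the workaround is meant to let tooling with a hard-coded debug path find the release binary; a later commit in this PR fixes it by making sure the debug directory exists before adding the symlink. A minimal sketch of that idea, assuming a cargo-style target/ layout (the paths and the function name are assumptions, not copied from Dockerfile.e2e):

```shell
#!/bin/bash
# Hypothetical helper: expose a release-built wasmtime binary under the
# debug path that existing tooling expects. $1 is the cargo target directory.
link_release_as_debug() {
    local target_dir="$1"
    # A release-only build does not create target/debug, and ln -s fails
    # when the parent directory of the link does not exist.
    mkdir -p "$target_dir/debug"
    # The link target is relative and resolved from inside target/debug.
    ln -sfn ../release/wasmtime "$target_dir/debug/wasmtime"
}
```

After `link_release_as_debug path/to/target`, anything that opens target/debug/wasmtime gets the release binary.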

ci-response-bot · 2025-06-25T08:24:10Z

Commit ea142a1: Build Failed

View Log

The new Dockerfile is based on .devcontainer/Dockerfile, and simplified to allow optimization for layer caching. Most notably, it
- introduces multi-stage builds, and
- copies sources from the build context when they are used, to invalidate caches only when relevant files are updated.

The goal is to run the Docker build in a GitHub Action (e.g. https://github.com/docker/build-push-action), which requires strategic caching to not exceed allowed build minutes and cache sizes.

**Caveat**

This is an experimental/explorative patch and does not fit in with the rest of the toolchain. It:
* ignores bazel for a lighter toolchain and to focus on one caching mechanism
* moves all glibc-related build instructions into gen_sysroot.sh to reduce complexity

Depending on how well this works, we may replace or revert back to existing tools.

Signed-off-by: Lukas Puehringer <[email protected]>

The new gen_sysroot.sh also includes glibc + extra build instructions. This patch renames it accordingly and moves it to the build script directory, which seems like a good fit. This allows us to restore the original src/glibc/gen_sysroot.sh to unbreak other build tooling, e.g. lindtool.sh. Signed-off-by: Lukas Puehringer <[email protected]>

CI build should fail, if any of the tests fail. Given that wasmtestreport.py always returns 0 (pass) regardless of test failures, we grep for `number_of_failures: <num>` lines in the test results doc, and return 1 (fail), if we find a line, where <num> is not zero. Signed-off-by: Lukas Puehringer <[email protected]>
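
A minimal sketch of such a check, assuming the results document contains lines of the form `number_of_failures: <num>` (optionally with JSON-style quotes around the key). The function name check_results is hypothetical, and the actual negated grep in Dockerfile.e2e may differ in detail:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if no "number_of_failures: <num>" line
# in the results file given as $1 reports a non-zero <num>.
check_results() {
    local results_file="$1"
    # A missing results file must not be mistaken for a passing run.
    [ -r "$results_file" ] || return 1
    # grep exits 0 when it finds a match, so negate it: a count that
    # starts with 1-9 means at least one test failed.
    ! grep -Eq 'number_of_failures"?:[[:space:]]*[1-9]' "$results_file"
}
```

Used as a build step, e.g. `check_results results.json`, a non-zero exit status then fails the Docker build and with it the CI run.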

Signed-off-by: Lukas Puehringer <[email protected]>

ci-response-bot · 2025-06-25T08:52:08Z

Commit 046140c: Build Failed

View Log

lukpueh · 2025-06-25T11:02:56Z

This PR is ready for review! See

- the updated PR description, and
- passing CI (relevant build logs can be found in the "Build e2e" step).

cc @Yaxuan-w, @rennergade, @m-hemmings

(The Google Cloud Build failure is unrelated, see #209.)

Yaxuan-w

Overall logic lgtm. Only one typo suggestion.

scripts/make_glibc_and_sysroot.sh

Clang seems to be tolerant in parsing this Co-authored-by: Alice Wen <[email protected]>

ci-response-bot · 2025-06-25T14:52:38Z

Commit 82e1910: Build Failed

View Log

m-hemmings · 2025-06-25T18:25:48Z

scripts/Dockerfile.e2e

+FROM scratch AS clang
+ADD https://github.com/llvm/llvm-project/releases/download/llvmorg-16.0.4/clang+llvm-16.0.4-x86_64-linux-gnu-ubuntu-22.04.tar.xz /clang.tar.xz
+
+FROM ubuntu:latest


Is using ubuntu:latest a good idea? It feels like this could be brittle; it might be better to pin a specific release here, like ubuntu:22.04 or ubuntu:24.04.

m-hemmings · 2025-06-25T18:27:55Z

scripts/Dockerfile.e2e

+# TODO: only install required hard requirements, see best practices
+# https://docs.docker.com/build/building/best-practices/#apt-get
+RUN apt-get update && \
+    apt-get install -y -qq \


Is there a reason to not do a --no-install-recommends here?

lukpueh · 2025-06-25

I have a branch with --no-install-recommends and with quite a few deps removed: https://github.com/lukpueh/lind-wasm/blob/273d8ba44f274ae3f5c633f3b7c0849c60ca6d5f/Dockerfile#L13-L32

But I didn't have time yet to confirm that it doesn't break anything.

m-hemmings · 2025-06-25T18:28:48Z

scripts/Dockerfile.e2e

+        unzip \
+        vim \
+        wget \
+        zip


We could save a little bit of room here if we clean up after the updates with rm -rf /var/lib/apt/lists/*

rennergade · 2025-06-26T01:33:39Z

scripts/make_glibc_and_sysroot.sh

@@ -0,0 +1,150 @@
+#!/bin/bash


If this file replaces other scripts' functionality, should those scripts be removed?

lukpueh · 2025-06-26

This needs some planning. lindtool.sh depends on those scripts, and we depend on lindtool.sh via wasmtestreport.py. I will make a proposal to untangle these local build tools.

rennergade

I really love everything you've done here, thank you very much for your efforts.

I still don't really understand the details of how the caching works and would like to, plus it may be better to document this all even more fully now. Would it be possible to also add something in docs/ going into some detail?

lukpueh · 2025-06-26T07:42:15Z

Would it be possible to also add something in docs/ going into some detail?

Absolutely! Let me add a ticket.

lind-wasm builds were moved from Google Cloud to GitHub Actions in Lind-Project#240. This patch removes now-obsolete files related to Google Cloud Builds. It also removes a smart Rust crate to run clippy over a subset of sources only, based on git status. This turned out to be premature optimization and is also no longer needed (see Lind-Project#220). Closes Lind-Project#209, Lind-Project#220 Signed-off-by: Lukas Puehringer <[email protected]>

lukpueh mentioned this pull request Jun 16, 2025

CI fails in "Handle Build Status" step #209

Closed

lukpueh force-pushed the docker-build branch from ae7a587 to 8a0fed0 Compare June 23, 2025 15:49

lukpueh force-pushed the docker-build branch from 8a0fed0 to 663bd14 Compare June 23, 2025 15:58

lukpueh added 12 commits June 25, 2025 10:45

Add basic GHA for end-to-end testing with Docker

f800ead

Signed-off-by: Lukas Puehringer <[email protected]>

Disable unnecessary workflow permissions

145aa58

Signed-off-by: Lukas Puehringer <[email protected]>

Speed up tests with wasmtime release build

e75362b

Building wasmtime with `--release` flag makes `wasmtime compile`, which is the bottleneck in tests, substantially faster (4s instead of 10s locally). Signed-off-by: Lukas Puehringer <[email protected]>

Skip failing tests (WIP)

c73f6a3

TODO: add issue, to enable them one-by-one Signed-off-by: Lukas Puehringer <[email protected]>

Remove lind user from CI

715d446

Signed-off-by: Lukas Puehringer <[email protected]>

Use available cores x 2 to make glibc

05c0886

Signed-off-by: Lukas Puehringer <[email protected]>

Minor Dockerfile comment rewords

ca43cae

Signed-off-by: Lukas Puehringer <[email protected]>

Setup basic gha caching for Docker

b8a6124

Signed-off-by: Lukas Puehringer <[email protected]>

Rename Dockerfile for end-to-end testing

f76f014

Goal is to keep the home directory clean. Signed-off-by: Lukas Puehringer <[email protected]>

Add wasmtime workaround and restore lind_config.sh

58cfcad

Signed-off-by: Lukas Puehringer <[email protected]>

lukpueh added 6 commits June 25, 2025 10:45

Trigger e2e GHA on pr

1723238

Signed-off-by: Lukas Puehringer <[email protected]>

Print results.json to e2e docker build log

f562b7f

This is a quick fix to make the data available on CI. Alternatives: - upload file e.g. to build artifacts - modify test runner to give more verbose feedback at runtime Signed-off-by: Lukas Puehringer <[email protected]>

Fix debug->release wasmtime symlink workaround

dc83858

Make sure debug dir exists before adding a symlink. Signed-off-by: Lukas Puehringer <[email protected]>

Skip more flakey tests

a4b2f1f

Exempt all tests that failed at least once in a series of test runs. This should leave us with a set of tests, which are likely to pass. Signed-off-by: Lukas Puehringer <[email protected]>

Skip failing dup3 test

046140c

Signed-off-by: Lukas Puehringer <[email protected]>

lukpueh force-pushed the docker-build branch from ea142a1 to 046140c Compare June 25, 2025 08:46

lukpueh changed the title ~~Add experimental Dockerfile for end-to-end testing (WIP)~~ End-to-end testing with Docker and GitHub Actions Jun 25, 2025

lukpueh marked this pull request as ready for review June 25, 2025 09:10

Yaxuan-w previously approved these changes Jun 25, 2025

Yaxuan-w mentioned this pull request Jun 25, 2025

Clang 18 #202

Open

9 tasks

Fix typo in target triple

82e1910

Clang seems to be tolerant in parsing this Co-authored-by: Alice Wen <[email protected]>

lukpueh dismissed Yaxuan-w’s stale review via 82e1910 June 25, 2025 14:47

This was referenced Jun 25, 2025

ci: fix and unskip tests #249

Open

Review CI platforms and build tools #235

Closed

m-hemmings reviewed Jun 25, 2025

rennergade reviewed Jun 26, 2025

lukpueh merged commit 85d7711 into Lind-Project:main Jun 26, 2025
2 of 3 checks passed

This was referenced Jun 26, 2025

docs: describe CI and local end-to-end testing in contributor docs #250

Open

build: optimize Docker caching #251

Closed

Enable weekly builds with pushes to Dockerhub #252

Closed

This was referenced Jun 27, 2025

Remove Google Cloud Build scripts #254

Merged

GHA cache not shared between PRs #256

Closed
