
Conversation

@jackluo923
Member

@jackluo923 jackluo923 commented Nov 11, 2025

Description

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Summary by CodeRabbit

  • Chores
    • Updated CI workflows to run on self-hosted runners to standardize execution across builds.
    • Added a "Test NPM Availability" workflow to check Node/NPM presence and likely installation locations.
    • Added a "Test Runner Pickup" diagnostic workflow to display runtime host information (hostname, date, user).

jackluo923 and others added 7 commits November 10, 2025 19:18
- Update maven-checks job to run on baker runner group
- Update presto-coordinator-image job to run on baker runner group
- Enables self-hosted CI infrastructure for maven builds and Docker image builds
- Specify ubuntu:22.04 container for maven-checks job
- Specify ubuntu:22.04 container for presto-coordinator-image job
- Ensures consistent environment on self-hosted baker runners
- Running containers in ARC requires Kubernetes mode, which needs additional configuration
- Jobs will now run directly on the runner pods (Ubuntu-based)
- This simplifies the runner setup while maintaining functionality
- Update prestissimo-worker-images-build to use baker runner group
- Update all 4 jobs in prestocpp-linux-build-and-unit-test to use baker runner group:
  - prestocpp-linux-build-for-test
  - prestocpp-linux-presto-e2e-tests
  - prestocpp-linux-presto-native-tests
  - prestocpp-linux-presto-sidecar-tests
- All jobs retain their container specifications for consistent build environments
Update all workflows to use runs-on: [self-hosted] instead of runner groups.
This allows jobs to run on any available self-hosted runner across baker1,
baker2, baker4, and baker7 clusters.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- prestocpp-format-and-header-check.yml: Use [self-hosted] instead of ubuntu-latest
- tests.yml: Use [self-hosted] for both 'changes' and 'test' jobs

This enables all workflows to run on the y-scope ARC infrastructure
across baker1, baker2, baker4, and baker7.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@coderabbitai

coderabbitai bot commented Nov 11, 2025

Walkthrough

Multiple GitHub Actions workflows were updated to use self-hosted runners instead of GitHub-hosted runners (ubuntu-latest, ubuntu-22.04). Two new diagnostic workflows were added: .github/workflows/test-npm.yml and .github/workflows/test-runner-pickup.yml. No other job steps or control flow were changed.

Changes

Runner specification updates
`.github/workflows/maven-checks.yml`, `.github/workflows/prestissimo-worker-images-build.yml`, `.github/workflows/prestocpp-format-and-header-check.yml`, `.github/workflows/prestocpp-linux-build-and-unit-test.yml`, `.github/workflows/tests.yml`, `.github/workflows/docs.yml`
Changed job `runs-on` values from GitHub-hosted runners (`ubuntu-latest`, `ubuntu-22.04`) to `self-hosted` for the listed jobs. No other steps, conditions, or logic modified.

New workflow — NPM diagnostics
`.github/workflows/test-npm.yml`
Added a "Test NPM Availability" workflow (trigger: `workflow_dispatch`) with a single job running on `self-hosted` that prints system info, checks node/npm availability and versions, and lists candidate npm locations and the PATH.

New workflow — Runner pickup test
`.github/workflows/test-runner-pickup.yml`
Added a "Test Runner Pickup" workflow (triggers: `workflow_dispatch`, push to `ci-test`) with a single `simple-test` job on `self-hosted` that echoes runtime info (hostname, date, user).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Changes are mostly repetitive runs-on replacements and two small diagnostic workflows.
  • Spot-checks:
    • Confirm the CI environment provides the expected self-hosted runner labels and available tools.
    • Verify any downstream assumptions about GitHub-hosted tool availability (e.g., npm, node, system paths) are handled where needed.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

  • Description check (⚠️ Warning): The PR description is almost entirely empty, with template placeholders but no actual content filled in; no description, motivation, impact, test plan, or validation details are provided. Resolution: complete all required template sections with substantive content: describe what changed (migrating workflows to self-hosted runners), explain the motivation, document the test plan, and check the contributor checklist items.
  • Title check (❓ Inconclusive): The title 'Ci test' is vague and generic, providing no meaningful information about the substantial workflow runner configuration changes in the changeset. Resolution: update the title to be more descriptive, such as 'Migrate GitHub Actions workflows to self-hosted runners' or 'Configure CI workflows to use self-hosted runner infrastructure'.

✅ Passed checks (1 passed)

  • Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


jackluo923 and others added 4 commits November 11, 2025 15:57
This minimal workflow will help determine if self-hosted runners are
properly registered and picking up jobs. It will run on push to ci-test
branch and can also be triggered manually via workflow_dispatch.
…osted label

The baker runners are registered in the 'Default' runner group, but
specifying 'runs-on: [self-hosted]' doesn't match runners in named
groups. Changing to 'runs-on: Default' allows GitHub Actions to
properly route jobs to the available runners in the Default group.

This fixes the issue where jobs were stuck in pending state because
they couldn't find matching runners.

Updated workflows:
- tests.yml: changes and test jobs
- maven-checks.yml: maven-checks and presto-coordinator-image jobs
- prestocpp-linux-build-and-unit-test.yml: all jobs
- prestocpp-format-and-header-check.yml
- prestissimo-worker-images-build.yml
- test-npm.yml
- test-runner-pickup.yml
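
The group-based routing described above can also be expressed with the object form of `runs-on`, which targets a runner group and labels together. A minimal sketch, assuming the group is named `Default` as in the commit message (the job name is illustrative):

```yaml
jobs:
  maven-checks:
    # Route to the "Default" runner group; labels further narrow the match.
    runs-on:
      group: Default
      labels: [self-hosted, linux]
```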
Replace 'Default' and 'ubuntu-latest' with 'self-hosted' runner label
across all GitHub Actions workflows to use self-hosted runners instead
of GitHub-hosted runners.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
.github/workflows/docs.yml (1)

41-43: Verify LaTeX dependencies pre-installed on self-hosted runners.

The job attempts to install LaTeX dependencies via sudo apt-get. Ensure that:

  1. Self-hosted runners have sudo access configured.
  2. If not pre-installed, these packages are installed during every run (significant performance overhead).
  3. The runner's operating system is Debian/Ubuntu-based.

To improve performance, consider pre-installing LaTeX dependencies on runner images instead of installing at run time. Alternatively, document runner requirements in your repository's contributing guide to ensure all runners have these dependencies pre-configured.
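
One way to act on this without paying the apt-get cost every run is a fail-fast preflight step. A minimal sketch, assuming the runner image is expected to ship `pdflatex` and `latexmk` (the tool names are illustrative for a LaTeX docs build):

```yaml
- name: Verify LaTeX toolchain is pre-installed
  run: |
    # Fail early with a clear message instead of installing at run time
    for tool in pdflatex latexmk; do
      command -v "$tool" >/dev/null 2>&1 || {
        echo "ERROR: $tool not found; pre-install it on the runner image"
        exit 1
      }
    done
```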

.github/workflows/maven-checks.yml (1)

31-34: Verify sudo access and disk space management on self-hosted runners.

The maven-checks job runs sudo apt-get clean to free disk space. Ensure self-hosted runners are configured with:

  1. Sudo access without password prompts (configured in sudoers)
  2. Regular cleanup strategy to prevent disk exhaustion during builds

If runners lack sudo access, consider implementing a cleanup step that doesn't require elevated privileges, or pre-configure runners to auto-clean temporary directories.
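
A cleanup step that avoids elevated privileges could look like the following sketch; the paths are illustrative and would need to match what the jobs actually write:

```yaml
- name: Clean job-local caches (no sudo required)
  run: |
    # Only touch paths owned by the runner user
    rm -rf "${{ github.workspace }}/target" "$HOME/.cache/tmp-build" || true
    # Surface remaining disk pressure in the job log
    df -h "${{ github.workspace }}"
```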

.github/workflows/prestocpp-linux-build-and-unit-test.yml (2)

48-81: Document ccache persistence strategy for non-ephemeral runners.

The build job uses ccache (C/C++ compiler cache) stored in ${{ github.workspace }}/ccache (lines 48-81). On non-ephemeral self-hosted runners, this cache persists across jobs, which can:

  • Benefit: Accelerate rebuilds with cache hits
  • Risk: Accumulate disk space (ccache can grow to GBs), risk cache corruption, cause stale artifacts

Implement a cleanup strategy:

```yaml
- name: Limit ccache size
  run: ccache --max-size=5G

- name: Cleanup old cache entries
  run: ccache --cleanup
```

Also monitor disk usage on self-hosted runners to prevent No space left on device errors during builds.


72-72: Review hardcoded build parallelism for self-hosted runners.

Line 72 specifies -DMAX_LINK_JOBS=4, which limits linker parallelism regardless of runner CPU cores. This:

  • May underutilize runners with >4 cores (slow builds)
  • May cause OOM errors on runners with <4 cores or limited RAM

Consider making this dynamic based on available CPU cores and available memory:

```diff
-            -DMAX_LINK_JOBS=4
+            -DMAX_LINK_JOBS=$(( $(nproc) / 2 ))
```

Or document the assumption that runners have at least 4 cores, 8GB RAM, and add resource requirement labels to ensure appropriate runner selection.
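
The dynamic approach can take memory into account as well as cores. A sketch, assuming each link job needs roughly 4 GB of RAM (an illustrative figure, not measured for this codebase):

```shell
#!/bin/sh
# Derive linker parallelism from host resources instead of hardcoding 4.
cores=$(nproc)
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
mem_gb=$((mem_kb / 1024 / 1024))
# Assumption: ~4 GB of RAM per concurrent link job
jobs_by_mem=$((mem_gb / 4))
[ "$jobs_by_mem" -lt 1 ] && jobs_by_mem=1
# Take the smaller of the CPU and memory limits
if [ "$cores" -lt "$jobs_by_mem" ]; then
  max_link_jobs=$cores
else
  max_link_jobs=$jobs_by_mem
fi
echo "MAX_LINK_JOBS=$max_link_jobs"
```

The result could then be passed to CMake as `-DMAX_LINK_JOBS=$max_link_jobs`.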

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bcd79dd and 5c54c6f.

📒 Files selected for processing (8)
  • .github/workflows/docs.yml (1 hunks)
  • .github/workflows/maven-checks.yml (2 hunks)
  • .github/workflows/prestissimo-worker-images-build.yml (1 hunks)
  • .github/workflows/prestocpp-format-and-header-check.yml (1 hunks)
  • .github/workflows/prestocpp-linux-build-and-unit-test.yml (4 hunks)
  • .github/workflows/test-npm.yml (1 hunks)
  • .github/workflows/test-runner-pickup.yml (1 hunks)
  • .github/workflows/tests.yml (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-04T14:50:43.554Z
Learnt from: anlowee
Repo: y-scope/presto PR: 7
File: .github/workflows/hive-tests.yml:33-35
Timestamp: 2025-06-04T14:50:43.554Z
Learning: In the y-scope/presto repository, GitHub workflows are extended from OSS Presto and should maintain consistent formatting with the upstream patterns, including YAML indentation styles for workflow steps.

Applied to files:

  • .github/workflows/prestocpp-format-and-header-check.yml
  • .github/workflows/maven-checks.yml
  • .github/workflows/prestissimo-worker-images-build.yml
  • .github/workflows/prestocpp-linux-build-and-unit-test.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: prestocpp-linux-build-for-test

```diff
   name: "presto-coordinator-image"
   needs: "maven-checks"
-  runs-on: "ubuntu-22.04"
+  runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Verify self-hosted runner has Docker daemon and registry credentials.

The presto-coordinator-image job builds and pushes Docker images to ghcr.io. Ensure self-hosted runners have:

  1. Docker daemon running and accessible
  2. Sufficient disk space for image builds and temporary layers
  3. Network connectivity to ghcr.io

No Docker-specific runner labels are specified, which could lead to job failures if assigned to runners without Docker support.

Add explicit Docker support labels to the runner specification:

```diff
   presto-coordinator-image:
     name: "presto-coordinator-image"
     needs: "maven-checks"
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, docker, x64]
```
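
Alternatively (or additionally), a preflight step can verify the Docker daemon before any build work starts. A sketch, with the disk-usage line purely informational:

```yaml
- name: Verify Docker daemon is reachable
  run: |
    docker info >/dev/null 2>&1 || {
      echo "ERROR: Docker daemon not reachable on this runner"
      exit 1
    }
    # Informational: show image/layer disk usage on this host
    docker system df || true
```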

```diff
 prestissimo-worker-images-build:
   name: "prestissimo-worker-images-build"
-  runs-on: "ubuntu-22.04"
+  runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Verify self-hosted runner OS version and architecture matching.

The workflow previously specified ubuntu-22.04 but now uses generic self-hosted. This could cause build inconsistencies if:

  1. Runners are running different Ubuntu versions (20.04 vs 22.04 vs 24.04)
  2. Build dependencies vary across OS versions
  3. Toolchain versions differ (CMAKE, GCC, LLVM, etc.)

To ensure reproducibility and catch OS-related regressions early, either:

  1. Explicitly label runners with their OS version:

```diff
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, ubuntu-22.04, x64, docker]
```

  2. Or add a step to verify the runtime OS and fail if unexpected:

```yaml
- name: Verify OS version
  run: |
    expected_version="22.04"
    actual_version=$(lsb_release -rs)
    if [ "$actual_version" != "$expected_version" ]; then
      echo "ERROR: Expected Ubuntu $expected_version, got $actual_version"
      exit 1
    fi
```
🤖 Prompt for AI Agents
In .github/workflows/prestissimo-worker-images-build.yml around line 10 the
runner is changed from an explicit ubuntu-22.04 to a generic self-hosted which
can introduce OS/architecture variability; update the workflow to enforce a
consistent environment by either (A) adding explicit runner labels that include
OS version and architecture (for example add labels like "self-hosted",
"ubuntu-22.04", "x86_64" to the runs-on selector) or (B) add an early
verification step that checks the runtime OS version and architecture and fails
the job if they do not match the expected values (e.g., check lsb_release -rs
and uname -m and exit non-zero on mismatch); implement one of these approaches
and ensure the expected OS/version/arch values are documented in the workflow
file.

```diff
 jobs:
   prestocpp-linux-build-for-test:
-    runs-on: ubuntu-22.04
+    runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Verify all self-hosted runners have Docker support and pre-configured containers.

All four jobs in this workflow use a specific container image: prestodb/presto-native-dependency:0.293-20250522140509-484b00e. Ensure that:

  1. All self-hosted runners have Docker daemon running and accessible
  2. The container image is available (or will be pulled on first use)
  3. Container startup overhead is acceptable for your use case

Add explicit Docker runner labels to prevent job assignment to incompatible runners:

```diff
   prestocpp-linux-build-for-test:
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, docker, x64]
```

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
.github/workflows/prestocpp-linux-build-and-unit-test.yml around line 23: the
workflow currently uses "runs-on: self-hosted" but does not guarantee Docker
support or image availability; update the runs-on to a more specific label (for
example "runs-on: [self-hosted, docker]" or your org's Docker-capable runner
label) to prevent scheduling on incompatible hosts, document in the
workflow/comments that runners must have the Docker daemon accessible and that
the container image
prestodb/presto-native-dependency:0.293-20250522140509-484b00e should be either
pre-pulled or allowed to be pulled at runtime, and if startup latency is a
concern add a note to pre-cache the image on runners or use a warm-up job; also
ensure job-level timeout/visibility and any required permissions for using
Docker are declared.

```diff
 prestocpp-linux-presto-e2e-tests:
   needs: prestocpp-linux-build-for-test
-  runs-on: ubuntu-22.04
+  runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Verify e2e test job has adequate runner resources.

The prestocpp-linux-presto-e2e-tests job runs presto server with test suite, requiring:

  • Docker support (container job)
  • Adequate memory (Presto server + Maven tests)
  • Network access for remote function tests (commented TODO at line 159-161)

Ensure runners are labeled with resource specifications to prevent out-of-memory kills or job timeouts.

Add runner labels:

```diff
   prestocpp-linux-presto-e2e-tests:
     needs: prestocpp-linux-build-for-test
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, docker, x64, presto-e2e-runner]
```

Then configure runners with the presto-e2e-runner label to have adequate resources (recommended: ≥8GB RAM, ≥4 CPU cores).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```diff
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, docker, x64, presto-e2e-runner]
```
🤖 Prompt for AI Agents
.github/workflows/prestocpp-linux-build-and-unit-test.yml around line 100: the
prestocpp-linux-presto-e2e-tests job uses a generic self-hosted runner which may
not provide Docker, sufficient memory, CPU or network access; update the job to
use specific runner labels (e.g., add "presto-e2e-runner" and an OS label)
instead of only "self-hosted", and ensure your self-hosted runners are
provisioned/configured with Docker and adequate resources (recommended ≥8GB RAM
and ≥4 CPU cores) and required network access for remote function tests.

```diff
 prestocpp-linux-presto-sidecar-tests:
   needs: prestocpp-linux-build-for-test
-  runs-on: ubuntu-22.04
+  runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Add runner labels to prevent job misassignment for sidecar tests.

The prestocpp-linux-presto-sidecar-tests job requires Docker container support and adequate resources (line 251). No runner labels are specified, which could lead to assignment to incompatible or under-provisioned runners.

Add explicit runner labels:

```diff
   prestocpp-linux-presto-sidecar-tests:
     needs: prestocpp-linux-build-for-test
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, docker, x64, presto-sidecar-tests-runner]
```
📝 Committable suggestion


Suggested change

```diff
 prestocpp-linux-presto-sidecar-tests:
   needs: prestocpp-linux-build-for-test
-  runs-on: self-hosted
+  runs-on: [self-hosted, linux, docker, x64, presto-sidecar-tests-runner]
```
🤖 Prompt for AI Agents
In .github/workflows/prestocpp-linux-build-and-unit-test.yml around line 251,
the prestocpp-linux-presto-sidecar-tests job uses runs-on: self-hosted which can
be picked up by incompatible or under-provisioned runners; update runs-on to
specify an explicit set of runner labels (e.g., include self-hosted plus linux,
docker, and a resource-sizing label such as xlarge or high-memory) so the job
will only be scheduled on hosts that provide Docker and sufficient resources.

Comment on lines +1 to +30
```yaml
name: Test NPM Availability

on:
  workflow_dispatch:

jobs:
  test-npm:
    runs-on: self-hosted
    steps:
      - name: Check system info
        run: |
          echo "=== Runner Information ==="
          echo "Hostname: $(hostname)"
          echo "OS: $(uname -a)"
          echo "User: $(whoami)"
      - name: Check npm and node versions
        run: |
          echo "=== NPM and Node Versions ==="
          which npm || echo "npm not found in PATH"
          npm --version || echo "npm command failed"
          which node || echo "node not found in PATH"
          node --version || echo "node command failed"
      - name: List installed packages
        run: |
          echo "=== Checking for npm in common locations ==="
          ls -la /usr/bin/npm 2>/dev/null || echo "/usr/bin/npm not found"
          ls -la /usr/local/bin/npm 2>/dev/null || echo "/usr/local/bin/npm not found"
          which -a npm 2>/dev/null || echo "npm not in PATH"
```

🧹 Nitpick | 🔵 Trivial

Clarify purpose and long-term intent of NPM availability diagnostic workflow.

The test-npm.yml workflow provides useful diagnostics for checking npm/node availability and paths on self-hosted runners. However, before merging, clarify:

  1. Is this workflow intended to remain in the main branch as a permanent troubleshooting tool?
  2. Should it be run only on specific branches or moved to a separate documentation/testing repository?
  3. How will results be used? (Should they auto-fail if npm is not found?)
  4. Are there automated actions triggered by workflow failures?

If this is a temporary diagnostic workflow for self-hosted runner setup validation, add documentation:

```yaml
# TODO: Review this diagnostic workflow after self-hosted runner setup is complete
# Created: 2025-11-11, Purpose: Verify npm/node availability on self-hosted runners
name: Test NPM Availability  # Diagnostic workflow - may be removed after setup validation
```

Additionally, consider making the workflow fail if npm is not available, to surface configuration issues:

```diff
       - name: Check npm and node versions
         run: |
           echo "=== NPM and Node Versions ==="
+          set -e  # Exit on first error
           which npm || (echo "ERROR: npm not found in PATH" && exit 1)
           npm --version
           which node || (echo "ERROR: node not found in PATH" && exit 1)
           node --version
```
🤖 Prompt for AI Agents
.github/workflows/test-npm.yml lines 1-30: clarify intent and make diagnostics
actionable by (1) adding a top-of-file TODO header comment with creation date,
purpose and note that it may be removed after self-hosted runner validation, (2)
deciding and encoding scope/retention — either restrict triggers to specific
branches or move the workflow to a docs/tests repo and update the name to
indicate “diagnostic”, and (3) make the job fail when npm/node are missing by
replacing the permissive checks with explicit exit codes (e.g., run commands
that exit non-zero when npm/node are not found) or add an explicit conditional
step that checks presence and uses "exit 1" if absent so the workflow surfaces
configuration issues.

Comment on lines +1 to +18
```yaml
name: Test Runner Pickup

on:
  workflow_dispatch:
  push:
    branches:
      - ci-test

jobs:
  simple-test:
    runs-on: self-hosted
    steps:
      - name: Echo test
        run: |
          echo "Runner picked up the job!"
          echo "Hostname: $(hostname)"
          echo "Date: $(date)"
          echo "User: $(whoami)"
```

🧹 Nitpick | 🔵 Trivial

Clarify purpose and long-term intent of diagnostic workflow.

The test-runner-pickup.yml workflow is a simple diagnostic tool for testing self-hosted runner availability and responsiveness. Before merging, clarify:

  1. Is this workflow intended to remain in the main branch?
  2. Should it be run only on specific branches or moved to a separate test repository?
  3. Should it be added to .github/workflows/ or a separate experimental directory?
  4. Who will monitor these results, and what action triggers on failure?

If this is a temporary diagnostic workflow for validating the CI migration, consider adding a comment documenting its purpose and planned removal date:

```yaml
# TODO: Remove this diagnostic workflow after validating self-hosted runner setup
# Created: 2025-11-11, Target removal: 2025-12-11
name: Test Runner Pickup  # Diagnostic: Self-hosted runner availability test
```
🤖 Prompt for AI Agents
.github/workflows/test-runner-pickup.yml lines 1-18: this diagnostic workflow
needs explicit documentation and decision guidance before merging — add a
top-of-file comment that states its purpose (self-hosted runner availability
test), intended lifespan (creation date and planned removal date or retained
indefinitely), recommended branch policy (e.g., only run on a specific branch or
move to a separate experimental repo/dir), and who owns/monitors the workflow
and what actions to take on failures; optionally restrict triggers to a specific
branch or remove from main workflows directory if meant temporary.

```diff
 jobs:
   changes:
-    runs-on: ubuntu-latest
+    runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Add runner labels to prevent job starvation with matrix strategy.

The changes job and subsequent test job (line 44) use a matrix strategy that can spawn up to 20 concurrent jobs (2 Java versions × 10 test modules). Without explicit runner labels or a dedicated runner group, this could cause significant job queuing if runners are shared across other workflows.

Consider adding specific runner labels to ensure adequate capacity:

```diff
   changes:
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, x64, test-runner-pool]
```

Then configure a runner group or add a custom label test-runner-pool to the appropriate subset of self-hosted runners to ensure tests have dedicated resources.

📝 Committable suggestion


Suggested change

```diff
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, x64, test-runner-pool]
```
🤖 Prompt for AI Agents
.github/workflows/tests.yml around line 25 (and the test job at line 44): the
workflow currently uses runs-on: self-hosted which can lead to job starvation
with the 20-job matrix; update the jobs that run the matrix (the changes job and
the test job at line 44) to use a specific runner label such as runs-on:
[self-hosted, test-runner-pool] (or another dedicated label name you choose),
and then ensure your self-hosted runners are grouped/configured to include that
custom label (or create a runner group) so the matrix jobs target only those
dedicated runners and avoid contention with other workflows.

```diff
   test:
-    runs-on: ubuntu-latest
+    runs-on: self-hosted
```

🧹 Nitpick | 🔵 Trivial

Verify self-hosted runner disk space for Maven cache accumulation.

The test job runs Maven builds with a 80-minute timeout and matrix strategy spawning multiple jobs. Over time, Maven caches and build artifacts may accumulate on persistent self-hosted runners, potentially exhausting disk space.

Implement a cache cleanup strategy, such as:

  • Configure Maven cache limits or implement periodic cleanup jobs
  • Use ephemeral runners with automatic cleanup between jobs (recommended per GitHub documentation)
  • Monitor disk usage on self-hosted runners and alert if thresholds are exceeded
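
The periodic-cleanup idea can be sketched as a size-capped prune of the local repository. The default `~/.m2/repository` path, the 10 GB cap, and the 30-day age threshold below are all illustrative assumptions:

```shell
#!/bin/sh
# Sketch: size-capped pruning for a persistent Maven local repository
# on a self-hosted runner.
prune_m2_cache() {
  repo="$1"; cap_kb="$2"; max_age_days="$3"
  [ -d "$repo" ] || return 0
  used_kb=$(du -sk "$repo" | awk '{print $1}')
  if [ "$used_kb" -gt "$cap_kb" ]; then
    # Drop artifacts not modified recently, then remove empty directories
    find "$repo" -type f -mtime +"$max_age_days" -delete
    find "$repo" -mindepth 1 -type d -empty -delete
  fi
}

# Example: cap the default local repo at ~10 GB, pruning files older than 30 days
prune_m2_cache "$HOME/.m2/repository" $((10 * 1024 * 1024)) 30
```

A step like this could run at the end of each job, or in a scheduled maintenance workflow as the bullet list suggests.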
🤖 Prompt for AI Agents
.github/workflows/tests.yml around line 44: the workflow uses self-hosted
runners ("runs-on: self-hosted") which can accumulate Maven caches/artifacts
across matrix jobs and eventually exhaust disk space; update CI to either switch
to ephemeral hosted runners or add cleanup and monitoring: prefer using
GitHub-hosted runners if possible; otherwise add steps to the job (or a periodic
maintenance workflow) to prune Maven local repository (~/.m2/repository) and
GitHub Actions runner work directories after builds, enforce maven cache size
limits or use mvn dependency:purge-local-repository where appropriate, schedule
a weekly/cron workflow to run disk-cleanup on self-hosted runners, and add
runner-side disk-usage monitoring/alerts so administrators are notified when
free space drops below a threshold.
