-
Notifications
You must be signed in to change notification settings - Fork 4
Ci test #102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-0.293-clp-connector
Are you sure you want to change the base?
Ci test #102
Conversation
- Update maven-checks job to run on baker runner group - Update presto-coordinator-image job to run on baker runner group - Enables self-hosted CI infrastructure for maven builds and Docker image builds
- Specify ubuntu:22.04 container for maven-checks job - Specify ubuntu:22.04 container for presto-coordinator-image job - Ensures consistent environment on self-hosted baker runners
- Running containers in ARC requires Kubernetes mode which needs additional configuration - Jobs will now run directly on the runner pods (Ubuntu-based) - This simplifies the runner setup while maintaining functionality
- Update prestissimo-worker-images-build to use baker runner group - Update all 4 jobs in prestocpp-linux-build-and-unit-test to use baker runner group: - prestocpp-linux-build-for-test - prestocpp-linux-presto-e2e-tests - prestocpp-linux-presto-native-tests - prestocpp-linux-presto-sidecar-tests - All jobs retain their container specifications for consistent build environments
Update all workflows to use runs-on: [self-hosted] instead of runner groups. This allows jobs to run on any available self-hosted runner across baker1, baker2, baker4, and baker7 clusters. Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- prestocpp-format-and-header-check.yml: Use [self-hosted] instead of ubuntu-latest - tests.yml: Use [self-hosted] for both 'changes' and 'test' jobs This enables all workflows to run on the y-scope ARC infrastructure across baker1, baker2, baker4, and baker7. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
WalkthroughMultiple GitHub Actions workflows were updated to use Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This minimal workflow will help determine if self-hosted runners are properly registered and picking up jobs. It will run on push to ci-test branch and can also be triggered manually via workflow_dispatch.
…osted label The baker runners are registered in the 'Default' runner group, but specifying 'runs-on: [self-hosted]' doesn't match runners in named groups. Changing to 'runs-on: Default' allows GitHub Actions to properly route jobs to the available runners in the Default group. This fixes the issue where jobs were stuck in pending state because they couldn't find matching runners. Updated workflows: - tests.yml: changes and test jobs - maven-checks.yml: maven-checks and presto-coordinator-image jobs - prestocpp-linux-build-and-unit-test.yml: all jobs - prestocpp-format-and-header-check.yml - prestissimo-worker-images-build.yml - test-npm.yml - test-runner-pickup.yml
Replace 'Default' and 'ubuntu-latest' with 'self-hosted' runner label across all GitHub Actions workflows to use self-hosted runners instead of GitHub-hosted runners.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
.github/workflows/docs.yml (1)
41-43: Verify LaTeX dependencies pre-installed on self-hosted runners.The job attempts to install LaTeX dependencies via
sudo apt-get. Ensure that:
- Self-hosted runners have sudo access configured.
- If not pre-installed, these packages are installed during every run (significant performance overhead).
- The runner's operating system is Debian/Ubuntu-based.
To improve performance, consider pre-installing LaTeX dependencies on runner images instead of installing at run time. Alternatively, document runner requirements in your repository's contributing guide to ensure all runners have these dependencies pre-configured.
.github/workflows/maven-checks.yml (1)
31-34: Verify sudo access and disk space management on self-hosted runners.The
maven-checksjob runssudo apt-get cleanto free disk space. Ensure self-hosted runners are configured with:
- Sudo access without password prompts (configured in sudoers)
- Regular cleanup strategy to prevent disk exhaustion during builds
If runners lack sudo access, consider implementing a cleanup step that doesn't require elevated privileges, or pre-configure runners to auto-clean temporary directories.
.github/workflows/prestocpp-linux-build-and-unit-test.yml (2)
48-81: Document ccache persistence strategy for non-ephemeral runners.The build job uses ccache (C/C++ compiler cache) stored in
${{ github.workspace }}/ccache(lines 48-81). On non-ephemeral self-hosted runners, this cache persists across jobs, which can:
- Benefit: Accelerate rebuilds with cache hits
- Risk: Accumulate disk space (ccache can grow to GBs), risk cache corruption, cause stale artifacts
Implement a cleanup strategy:
- name: Limit ccache size run: ccache --max-size=5G - name: Cleanup old cache entries run: ccache --cleanupAlso monitor disk usage on self-hosted runners to prevent
No space left on deviceerrors during builds.
72-72: Review hardcoded build parallelism for self-hosted runners.Line 72 specifies
-DMAX_LINK_JOBS=4, which limits linker parallelism regardless of runner CPU cores. This:
- May underutilize runners with >4 cores (slow builds)
- May cause OOM errors on runners with <4 cores or limited RAM
Consider making this dynamic based on available CPU cores and available memory:
- -DMAX_LINK_JOBS=4 + -DMAX_LINK_JOBS=$(( $(nproc) / 2 ))Or document the assumption that runners have at least 4 cores, 8GB RAM, and add resource requirement labels to ensure appropriate runner selection.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/docs.yml(1 hunks).github/workflows/maven-checks.yml(2 hunks).github/workflows/prestissimo-worker-images-build.yml(1 hunks).github/workflows/prestocpp-format-and-header-check.yml(1 hunks).github/workflows/prestocpp-linux-build-and-unit-test.yml(4 hunks).github/workflows/test-npm.yml(1 hunks).github/workflows/test-runner-pickup.yml(1 hunks).github/workflows/tests.yml(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-04T14:50:43.554Z
Learnt from: anlowee
Repo: y-scope/presto PR: 7
File: .github/workflows/hive-tests.yml:33-35
Timestamp: 2025-06-04T14:50:43.554Z
Learning: In the y-scope/presto repository, GitHub workflows are extended from OSS Presto and should maintain consistent formatting with the upstream patterns, including YAML indentation styles for workflow steps.
Applied to files:
.github/workflows/prestocpp-format-and-header-check.yml.github/workflows/maven-checks.yml.github/workflows/prestissimo-worker-images-build.yml.github/workflows/prestocpp-linux-build-and-unit-test.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: prestocpp-linux-build-for-test
| name: "presto-coordinator-image" | ||
| needs: "maven-checks" | ||
| runs-on: "ubuntu-22.04" | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Verify self-hosted runner has Docker daemon and registry credentials.
The presto-coordinator-image job builds and pushes Docker images to ghcr.io. Ensure self-hosted runners have:
- Docker daemon running and accessible
- Sufficient disk space for image builds and temporary layers
- Network connectivity to
ghcr.io
No Docker-specific runner labels are specified, which could lead to job failures if assigned to runners without Docker support.
Add explicit Docker support labels to the runner specification:
presto-coordinator-image:
name: "presto-coordinator-image"
needs: "maven-checks"
- runs-on: self-hosted
+ runs-on: [self-hosted, linux, docker, x64]| prestissimo-worker-images-build: | ||
| name: "prestissimo-worker-images-build" | ||
| runs-on: "ubuntu-22.04" | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Verify self-hosted runner OS version and architecture matching.
The workflow previously specified ubuntu-22.04 but now uses generic self-hosted. This could cause build inconsistencies if:
- Runners are running different Ubuntu versions (20.04 vs 22.04 vs 24.04)
- Build dependencies vary across OS versions
- Toolchain versions differ (CMAKE, GCC, LLVM, etc.)
To ensure reproducibility and catch OS-related regressions early, either:
- Explicitly label runners with their OS version:
- runs-on: self-hosted + runs-on: [self-hosted, linux, ubuntu-22.04, x64, docker]
- Or add a step to verify the runtime OS and fail if unexpected:
- name: Verify OS version run: | expected_version="22.04" actual_version=$(lsb_release -rs) if [ "$actual_version" != "$expected_version" ]; then echo "ERROR: Expected Ubuntu $expected_version, got $actual_version" exit 1 fi
🤖 Prompt for AI Agents
In .github/workflows/prestissimo-worker-images-build.yml around line 10 the
runner is changed from an explicit ubuntu-22.04 to a generic self-hosted which
can introduce OS/architecture variability; update the workflow to enforce a
consistent environment by either (A) adding explicit runner labels that include
OS version and architecture (for example add labels like "self-hosted",
"ubuntu-22.04", "x86_64" to the runs-on selector) or (B) add an early
verification step that checks the runtime OS version and architecture and fails
the job if they do not match the expected values (e.g., check lsb_release -rs
and uname -m and exit non-zero on mismatch); implement one of these approaches
and ensure the expected OS/version/arch values are documented in the workflow
file.
| jobs: | ||
| prestocpp-linux-build-for-test: | ||
| runs-on: ubuntu-22.04 | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Verify all self-hosted runners have Docker support and pre-configured containers.
All four jobs in this workflow use a specific container image: prestodb/presto-native-dependency:0.293-20250522140509-484b00e. Ensure that:
- All self-hosted runners have Docker daemon running and accessible
- The container image is available (or will be pulled on first use)
- Container startup overhead is acceptable for your use case
Add explicit Docker runner labels to prevent job assignment to incompatible runners:
prestocpp-linux-build-for-test:
- runs-on: self-hosted
+ runs-on: [self-hosted, linux, docker, x64]Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
.github/workflows/prestocpp-linux-build-and-unit-test.yml around line 23: the
workflow currently uses "runs-on: self-hosted" but does not guarantee Docker
support or image availability; update the runs-on to a more specific label (for
example "runs-on: [self-hosted, docker]" or your org's Docker-capable runner
label) to prevent scheduling on incompatible hosts, document in the
workflow/comments that runners must have the Docker daemon accessible and that
the container image
prestodb/presto-native-dependency:0.293-20250522140509-484b00e should be either
pre-pulled or allowed to be pulled at runtime, and if startup latency is a
concern add a note to pre-cache the image on runners or use a warm-up job; also
ensure job-level timeout/visibility and any required permissions for using
Docker are declared.
| prestocpp-linux-presto-e2e-tests: | ||
| needs: prestocpp-linux-build-for-test | ||
| runs-on: ubuntu-22.04 | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Verify e2e test job has adequate runner resources.
The prestocpp-linux-presto-e2e-tests job runs presto server with test suite, requiring:
- Docker support (container job)
- Adequate memory (Presto server + Maven tests)
- Network access for remote function tests (commented TODO at line 159-161)
Ensure runners are labeled with resource specifications to prevent out-of-memory kills or job timeouts.
Add runner labels:
prestocpp-linux-presto-e2e-tests:
needs: prestocpp-linux-build-for-test
- runs-on: self-hosted
+ runs-on: [self-hosted, linux, docker, x64, presto-e2e-runner]Then configure runners with the presto-e2e-runner label to have adequate resources (recommended: ≥8GB RAM, ≥4 CPU cores).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| runs-on: self-hosted | |
| runs-on: [self-hosted, linux, docker, x64, presto-e2e-runner] |
🤖 Prompt for AI Agents
.github/workflows/prestocpp-linux-build-and-unit-test.yml around line 100: the
prestocpp-linux-presto-e2e-tests job uses a generic self-hosted runner which may
not provide Docker, sufficient memory, CPU or network access; update the job to
use specific runner labels (e.g., add "presto-e2e-runner" and an OS label)
instead of only "self-hosted", and ensure your self-hosted runners are
provisioned/configured with Docker and adequate resources (recommended ≥8GB RAM
and ≥4 CPU cores) and required network access for remote function tests.
| prestocpp-linux-presto-sidecar-tests: | ||
| needs: prestocpp-linux-build-for-test | ||
| runs-on: ubuntu-22.04 | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Add runner labels to prevent job misassignment for sidecar tests.
The prestocpp-linux-presto-sidecar-tests job requires Docker container support and adequate resources (line 251). No runner labels are specified, which could lead to assignment to incompatible or under-provisioned runners.
Add explicit runner labels:
prestocpp-linux-presto-sidecar-tests:
needs: prestocpp-linux-build-for-test
- runs-on: self-hosted
+ runs-on: [self-hosted, linux, docker, x64, presto-sidecar-tests-runner]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| runs-on: self-hosted | |
| prestocpp-linux-presto-sidecar-tests: | |
| needs: prestocpp-linux-build-for-test | |
| runs-on: [self-hosted, linux, docker, x64, presto-sidecar-tests-runner] |
🤖 Prompt for AI Agents
In .github/workflows/prestocpp-linux-build-and-unit-test.yml around line 251,
the prestocpp-linux-presto-sidecar-tests job uses runs-on: self-hosted which can
be picked up by incompatible or under-provisioned runners; update runs-on to
specify an explicit set of runner labels (e.g., include self-hosted plus linux,
docker, and a resource-sizing label such as xlarge or high-memory) so the job
will only be scheduled on hosts that provide Docker and sufficient resources.
| name: Test NPM Availability | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
|
|
||
| jobs: | ||
| test-npm: | ||
| runs-on: self-hosted | ||
| steps: | ||
| - name: Check system info | ||
| run: | | ||
| echo "=== Runner Information ===" | ||
| echo "Hostname: $(hostname)" | ||
| echo "OS: $(uname -a)" | ||
| echo "User: $(whoami)" | ||
| - name: Check npm and node versions | ||
| run: | | ||
| echo "=== NPM and Node Versions ===" | ||
| which npm || echo "npm not found in PATH" | ||
| npm --version || echo "npm command failed" | ||
| which node || echo "node not found in PATH" | ||
| node --version || echo "node command failed" | ||
| - name: List installed packages | ||
| run: | | ||
| echo "=== Checking for npm in common locations ===" | ||
| ls -la /usr/bin/npm 2>/dev/null || echo "/usr/bin/npm not found" | ||
| ls -la /usr/local/bin/npm 2>/dev/null || echo "/usr/local/bin/npm not found" | ||
| which -a npm 2>/dev/null || echo "npm not in PATH" No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Clarify purpose and long-term intent of NPM availability diagnostic workflow.
The test-npm.yml workflow provides useful diagnostics for checking npm/node availability and paths on self-hosted runners. However, before merging, clarify:
- Is this workflow intended to remain in the main branch as a permanent troubleshooting tool?
- Should it be run only on specific branches or moved to a separate documentation/testing repository?
- How will results be used? (Should they auto-fail if npm is not found?)
- Are there automated actions triggered by workflow failures?
If this is a temporary diagnostic workflow for self-hosted runner setup validation, add documentation:
# TODO: Review this diagnostic workflow after self-hosted runner setup is complete
# Created: 2025-11-11, Purpose: Verify npm/node availability on self-hosted runners
name: Test NPM Availability # Diagnostic workflow - may be removed after setup validationAdditionally, consider making the workflow fail if npm is not available, to surface configuration issues:
- name: Check npm and node versions
run: |
echo "=== NPM and Node Versions ==="
+ set -e # Exit on first error
which npm || (echo "ERROR: npm not found in PATH" && exit 1)
npm --version
which node || (echo "ERROR: node not found in PATH" && exit 1)
node --version🤖 Prompt for AI Agents
.github/workflows/test-npm.yml lines 1-30: clarify intent and make diagnostics
actionable by (1) adding a top-of-file TODO header comment with creation date,
purpose and note that it may be removed after self-hosted runner validation, (2)
deciding and encoding scope/retention — either restrict triggers to specific
branches or move the workflow to a docs/tests repo and update the name to
indicate “diagnostic”, and (3) make the job fail when npm/node are missing by
replacing the permissive checks with explicit exit codes (e.g., run commands
that exit non-zero when npm/node are not found) or add an explicit conditional
step that checks presence and uses "exit 1" if absent so the workflow surfaces
configuration issues.
| name: Test Runner Pickup | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| push: | ||
| branches: | ||
| - ci-test | ||
|
|
||
| jobs: | ||
| simple-test: | ||
| runs-on: self-hosted | ||
| steps: | ||
| - name: Echo test | ||
| run: | | ||
| echo "Runner picked up the job!" | ||
| echo "Hostname: $(hostname)" | ||
| echo "Date: $(date)" | ||
| echo "User: $(whoami)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Clarify purpose and long-term intent of diagnostic workflow.
The test-runner-pickup.yml workflow is a simple diagnostic tool for testing self-hosted runner availability and responsiveness. Before merging, clarify:
- Is this workflow intended to remain in the main branch?
- Should it be run only on specific branches or moved to a separate test repository?
- Should it be added to
.github/workflows/or a separate experimental directory? - Who will monitor these results, and what action triggers on failure?
If this is a temporary diagnostic workflow for validating the CI migration, consider adding a comment documenting its purpose and planned removal date:
# TODO: Remove this diagnostic workflow after validating self-hosted runner setup
# Created: 2025-11-11, Target removal: 2025-12-11
name: Test Runner Pickup # Diagnostic: Self-hosted runner availability test🤖 Prompt for AI Agents
.github/workflows/test-runner-pickup.yml lines 1-18: this diagnostic workflow
needs explicit documentation and decision guidance before merging — add a
top-of-file comment that states its purpose (self-hosted runner availability
test), intended lifespan (creation date and planned removal date or retained
indefinitely), recommended branch policy (e.g., only run on a specific branch or
move to a separate experimental repo/dir), and who owns/monitors the workflow
and what actions to take on failures; optionally restrict triggers to a specific
branch or remove from main workflows directory if meant temporary.
| jobs: | ||
| changes: | ||
| runs-on: ubuntu-latest | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Add runner labels to prevent job starvation with matrix strategy.
The changes job and subsequent test job (line 44) use a matrix strategy that can spawn up to 20 concurrent jobs (2 Java versions × 10 test modules). Without explicit runner labels or a dedicated runner group, this could cause significant job queuing if runners are shared across other workflows.
Consider adding specific runner labels to ensure adequate capacity:
changes:
- runs-on: self-hosted
+ runs-on: [self-hosted, linux, x64, test-runner-pool]Then configure a runner group or add a custom label test-runner-pool to the appropriate subset of self-hosted runners to ensure tests have dedicated resources.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| runs-on: self-hosted | |
| runs-on: [self-hosted, linux, x64, test-runner-pool] |
🤖 Prompt for AI Agents
.github/workflows/tests.yml around line 25 (and the test job at line 44): the
workflow currently uses runs-on: self-hosted which can lead to job starvation
with the 20-job matrix; update the jobs that run the matrix (the changes job and
the test job at line 44) to use a specific runner label such as runs-on:
[self-hosted, test-runner-pool] (or another dedicated label name you choose),
and then ensure your self-hosted runners are grouped/configured to include that
custom label (or create a runner group) so the matrix jobs target only those
dedicated runners and avoid contention with other workflows.
| test: | ||
| runs-on: ubuntu-latest | ||
| runs-on: self-hosted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Verify self-hosted runner disk space for Maven cache accumulation.
The test job runs Maven builds with a 80-minute timeout and matrix strategy spawning multiple jobs. Over time, Maven caches and build artifacts may accumulate on persistent self-hosted runners, potentially exhausting disk space.
Implement a cache cleanup strategy, such as:
- Configure Maven cache limits or implement periodic cleanup jobs
- Use ephemeral runners with automatic cleanup between jobs (recommended per GitHub documentation)
- Monitor disk usage on self-hosted runners and alert if thresholds are exceeded
🤖 Prompt for AI Agents
.github/workflows/tests.yml around line 44: the workflow uses self-hosted
runners ("runs-on: self-hosted") which can accumulate Maven caches/artifacts
across matrix jobs and eventually exhaust disk space; update CI to either switch
to ephemeral hosted runners or add cleanup and monitoring: prefer using
GitHub-hosted runners if possible; otherwise add steps to the job (or a periodic
maintenance workflow) to prune Maven local repository (~/.m2/repository) and
GitHub Actions runner work directories after builds, enforce maven cache size
limits or use mvn dependency:purge-local-repository where appropriate, schedule
a weekly/cron workflow to run disk-cleanup on self-hosted runners, and add
runner-side disk-usage monitoring/alerts so administrators are notified when
free space drops below a threshold.
Description
Checklist
breaking change.
Validation performed
Summary by CodeRabbit