ci: test gpu on self-hosted runners #108

Open: wants to merge 3 commits into base: master
37 changes: 29 additions & 8 deletions .github/workflows/ci.yml
@@ -1,8 +1,12 @@
 name: CI

-on: [pull_request, push]
+on:
+  pull_request:
+  push:
+    branches:
+      - master

-# Cancel a job if there's a new on on the same branch started.
+# Cancel a job if there's a new one on the same branch started.
 # Based on https://stackoverflow.com/questions/58895283/stop-already-running-workflow-job-in-github-actions/67223051#67223051
 concurrency:
   group: ${{ github.ref }}
@@ -14,8 +18,7 @@ env:
   # Faster crates.io index checkout.
   CARGO_REGISTRIES_CRATES_IO_PROTOCOL: sparse
   RUST_LOG: debug
-  # Build the kernel only for the single architecture . This should reduce
-  # the overall compile-time significantly.
+  # Build the kernel only for the single architecture. This should reduce the overall compile-time significantly.
   EC_GPU_CUDA_NVCC_ARGS: --fatbin --gpu-architecture=sm_75 --generate-code=arch=compute_75,code=sm_75
   BELLMAN_CUDA_NVCC_ARGS: --fatbin --gpu-architecture=sm_75 --generate-code=arch=compute_75,code=sm_75
   NEPTUNE_CUDA_NVCC_ARGS: --fatbin --gpu-architecture=sm_75 --generate-code=arch=compute_75,code=sm_75
@@ -27,7 +30,9 @@ jobs:
     steps:
       - uses: actions/checkout@v4
       - name: Install required packages
-        run: sudo apt install --no-install-recommends --yes libhwloc-dev nvidia-cuda-toolkit ocl-icd-opencl-dev
+        run: |
+          sudo apt-get update
+          sudo apt-get install --no-install-recommends --yes libhwloc-dev nvidia-cuda-toolkit ocl-icd-opencl-dev
       - name: Install cargo clippy
         run: rustup component add clippy
       - name: Run cargo clippy
@@ -44,13 +49,29 @@ jobs:
         run: cargo fmt --all -- --check

   test:
-    runs-on: ubuntu-24.04
+    runs-on: ['self-hosted', 'linux', 'x64', '2xlarge+gpu']
     name: Test
     steps:
       - uses: actions/checkout@v4
+      # TODO: Move the driver installation to the AMI.
+      # https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html
+      # https://www.nvidia.com/en-us/drivers/
+      - name: Install CUDA drivers
+        run: |
+          curl -L -o nvidia-driver-local-repo-ubuntu2404-570.148.08_1.0-1_amd64.deb https://us.download.nvidia.com/tesla/570.148.08/nvidia-driver-local-repo-ubuntu2404-570.148.08_1.0-1_amd64.deb
+          sudo dpkg -i nvidia-driver-local-repo-ubuntu2404-570.148.08_1.0-1_amd64.deb
+          sudo cp /var/nvidia-driver-local-repo-ubuntu2404-570.148.08/nvidia-driver-local-*-keyring.gpg /usr/share/keyrings/
+          sudo apt-get update
+          sudo apt-get install --no-install-recommends --yes cuda-drivers
+          rm nvidia-driver-local-repo-ubuntu2404-570.148.08_1.0-1_amd64.deb
       - name: Install required packages
-        run: sudo apt install --no-install-recommends --yes libhwloc-dev nvidia-cuda-toolkit ocl-icd-opencl-dev
-      # In case no GPUs are available, it's using the CPU fallback.
+        run: |
+          sudo apt-get update
+          sudo apt-get install --no-install-recommends --yes libhwloc-dev nvidia-cuda-toolkit ocl-icd-opencl-dev
+      # TODO: Remove this and other rust installation directives from jobs running
+      - uses: dtolnay/rust-toolchain@21dc36fb71dd22e3317045c0c31a3f4249868b17
+        with:
+          toolchain: 1.83
       - name: Test
         run: cargo test --verbose
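
The "Install CUDA drivers" step in this workflow spells out the driver package name four times, so a version bump has to touch each occurrence. A minimal sketch of the same commands with the version factored out follows. The variable names, the `install_cuda_drivers` function, and the `RUN_INSTALL` switch are hypothetical and not part of this PR; the derived strings match the ones in the step.

```shell
#!/bin/sh
set -eu

# Values copied from the "Install CUDA drivers" step in this diff.
DRIVER_VERSION="570.148.08"
UBUNTU_RELEASE="ubuntu2404"

# Derived names (hypothetical variables; the resulting strings match the step).
DRIVER_REPO="nvidia-driver-local-repo-${UBUNTU_RELEASE}-${DRIVER_VERSION}"
DRIVER_DEB="${DRIVER_REPO}_1.0-1_amd64.deb"
DRIVER_URL="https://us.download.nvidia.com/tesla/${DRIVER_VERSION}/${DRIVER_DEB}"
DRIVER_REPO_DIR="/var/${DRIVER_REPO}"

# Same command sequence as the workflow step, using the derived names.
install_cuda_drivers() {
  curl -L -o "${DRIVER_DEB}" "${DRIVER_URL}"
  sudo dpkg -i "${DRIVER_DEB}"
  # The glob stays unquoted so the shell expands the keyring file name.
  sudo cp "${DRIVER_REPO_DIR}"/nvidia-driver-local-*-keyring.gpg /usr/share/keyrings/
  sudo apt-get update
  sudo apt-get install --no-install-recommends --yes cuda-drivers
  rm "${DRIVER_DEB}"
}

# Only install when explicitly asked (assumed switch); otherwise print
# the derived URL so the values can be checked without touching the system.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
  install_cuda_drivers
else
  echo "${DRIVER_URL}"
fi
```

Run without `RUN_INSTALL=1`, the script only prints the download URL, which makes it easy to confirm the derived name before using it on a runner.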

8 changes: 7 additions & 1 deletion CHANGELOG.md
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://book.async.rs/overview

 ## [Unreleased]

+## [19.0.0] - 2025-07-07
+
+- Fix remove clear_layer_data call [#95](https://github.com/filecoin-project/rust-filecoin-proofs-api/pull/95)
+- Fix update cache clearing calls [#106](https://github.com/filecoin-project/rust-filecoin-proofs-api/pull/106)
+
 ## [18.1.0] - 2024-06-18

 - Update API doc comments [#101](https://github.com/filecoin-project/rust-filecoin-proofs-api/pull/101)
@@ -168,7 +173,8 @@ and this project adheres to [Semantic Versioning](https://book.async.rs/overview

 - Initial stable release

-[Unreleased]: https://github.com/filecoin-project/rust-filecoin-proofs-api/compare/v18.1.0...HEAD
+[Unreleased]: https://github.com/filecoin-project/rust-filecoin-proofs-api/compare/v19.0.0...HEAD
+[19.0.0]: https://github.com/filecoin-project/rust-filecoin-proofs-api/tree/v19.0.0
 [18.1.0]: https://github.com/filecoin-project/rust-filecoin-proofs-api/tree/v18.1.0
 [18.0.1]: https://github.com/filecoin-project/rust-filecoin-proofs-api/tree/v18.0.1
 [18.0.0]: https://github.com/filecoin-project/rust-filecoin-proofs-api/tree/v18.0.0
8 changes: 4 additions & 4 deletions Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "filecoin-proofs-api"
-version = "18.1.0"
+version = "19.0.0"
 description = "API to interact with the proofs system in Filecoin"
 authors = ["dignifiedquire <[email protected]>"]
 edition = "2018"
@@ -14,9 +14,9 @@ bincode = "1.1.2"
 blstrs = "0.7"
 lazy_static = "1.2"
 serde = "1.0.104"
-filecoin-proofs-v1 = { package = "filecoin-proofs", version = "~18.1.0", default-features = false }
-fr32 = { version = "~11.1.0", default-features = false }
-storage-proofs-core = { version = "~18.1.0", default-features = false }
+filecoin-proofs-v1 = { package = "filecoin-proofs", version = "~19.0.0", default-features = false }
+fr32 = { version = "~12.0.0", default-features = false }
+storage-proofs-core = { version = "~19.0.0", default-features = false }

 [features]
 default = ["opencl", "cuda"]