[RELEASE] cugraph-gnn v25.02 #139
Open
raydouglass wants to merge 37 commits into main from branch-25.02
Forward-merge branch-24.12 into branch-25.02
Forward-merge branch-24.12 into branch-25.02
Forward-merge branch-24.12 into branch-25.02
Forward-merge branch-24.12 into branch-25.02
Adds a workflow that triggers a second workflow, which sends a notification to a designated Slack channel for every PR labelled `breaking` whenever any of the following events occurs on the PR: closed, reopened, labeled, unlabeled. Depends on rapidsai/shared-workflows#257
By default, CI runs on draft PRs. This leads to many CI runs that may be unnecessary. With this PR's change to `.github/copy-pr-bot.yaml`, an `/ok to test` comment from a trusted user is required to trigger CI on draft PRs. Non-draft PRs will run CI by default, assuming that all commits are signed by trusted users. Otherwise an `/ok to test` is required (as before) -- see the `copy-pr-bot` docs at https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/ for more information. Part of rapidsai/build-planning#123.
Forward-merge branch-24.12 into branch-25.02
All of this repo's `conda-python-tests` jobs have conditions in them like "skip on ARM": https://github.com/rapidsai/cugraph-gnn/blob/2dd300122dfd6fdea70c9d20c276a3c5946b7613/ci/test_python.sh#L100 https://github.com/rapidsai/cugraph-gnn/blob/2dd300122dfd6fdea70c9d20c276a3c5946b7613/ci/test_python.sh#L141 https://github.com/rapidsai/cugraph-gnn/blob/2dd300122dfd6fdea70c9d20c276a3c5946b7613/ci/test_python.sh#L183 As a result, right now the arm64 `conda-python-tests` jobs are just wasting CI resources... they're spending ~40+~ 5-10 minutes occupying a GPU runner just to download some datasets and then exit ([example build link](https://github.com/rapidsai/cugraph-gnn/actions/runs/11858773988/job/33056063652?pr=69)). This proposes never even starting those jobs, to make CI here less expensive. ## Notes for Reviewers ### But why are we skipping arm at all? Lack of pytorch packages. See #61 (comment) Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Alex Barghi (https://github.com/alexbarghi-nv) - Jake Awe (https://github.com/AyodeAwe) URL: #70
Follow-up to these PRs: * rapidsai/devcontainers#417 * #68 Proposes adding devcontainers and a devcontainers CI job to the repo. ## Notes for Reviewers ### Benefits of these changes * faster and easier local development * reduced risk of changes here breaking the RAPIDS unified devcontainers maintained in https://github.com/rapidsai/devcontainers Similar to rapidsai/nx-cugraph#25 ### How I made these changes Copied the `.devcontainer/` directory from https://github.com/rapidsai/cugraph, then just changed `cugraph` references to `cugraph-gnn`. ### How I tested this Tested the `update-version.sh` changes like this: ```shell ./ci/release/update-version.sh '25.04.00' git grep -E '25\.[0-9]+' ``` Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #79
Forward-merge branch-24.12 into branch-25.02
merge branch-24.12 into branch-25.02
Update version references in breaking-change trigger workflow Authors: - Jake Awe (https://github.com/AyodeAwe) - James Lamb (https://github.com/jameslamb) Approvers: - James Lamb (https://github.com/jameslamb) URL: #93
Fixes #94 Uploads API docs for `libwholegraph`, to be used by rapidsai/cugraph-docs#46 Also removes `sphinx` dependencies... this repo only needs to produce Doxygen docs for `libwholegraph`, all the other Sphinx stuff will be done in https://github.com/rapidsai/cugraph-docs. Authors: - James Lamb (https://github.com/jameslamb) - Don Acosta (https://github.com/acostadon) Approvers: - Don Acosta (https://github.com/acostadon) - Bradley Dice (https://github.com/bdice) URL: #96
The branch build triggered by merging #96 failed immediately. > The workflow is not valid. .github/workflows/build.yaml (Line: 47, Col: 12): Job 'docs-build' depends on unknown job 'conda-cpp-build'. ([build link](https://github.com/rapidsai/cugraph-gnn/actions/runs/12379736454)) This fixes that. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Don Acosta (https://github.com/acostadon) - Bradley Dice (https://github.com/bdice) URL: #97
Proposes miscellaneous small changes: * removes unused dependency groups in `dependencies.yaml` * removes commented-out CMake code * fixes lingering references to things from the `cugraph` repo Authors: - James Lamb (https://github.com/jameslamb) Approvers: - https://github.com/linhu-nv - Ray Douglass (https://github.com/raydouglass) URL: #98
…102) Proposes some miscellaneous packaging cleanup: * sets minimum CMake version to 3.26.4 everywhere, to match the rest of RAPIDS * removes commented-out CMake code * removes unnecessary variables throughout CMake code - *including consolidating version references to use `RAPIDS_VERSION` from https://github.com/rapidsai/cugraph-gnn/blob/af22a1271251dc6b02d91cd593ac32b504356b8d/rapids_config.cmake#L20* * updates some `pre-commit` hooks to their latest versions Authors: - James Lamb (https://github.com/jameslamb) Approvers: - https://github.com/linhu-nv - Bradley Dice (https://github.com/bdice) URL: #102
`wholegraph`'s CMake has some configuration to run `flake8`, `clang-tidy`, and `clang-format` via CMake. This proposes removing it. ## Notes for Reviewers ### Risks to doing this? I don't think any. `flake8` and `clang-format` configs are unnecessary, as those are already run via `pre-commit` here: https://github.com/rapidsai/cugraph-gnn/blob/71675d868589ff9f904197f729985de1555cb914/.pre-commit-config.yaml#L22 https://github.com/rapidsai/cugraph-gnn/blob/71675d868589ff9f904197f729985de1555cb914/.pre-commit-config.yaml#L37 The `clang-tidy` support must not actually be used today... it refers to a script that doesn't exist in this repo: https://github.com/rapidsai/cugraph-gnn/blob/71675d868589ff9f904197f729985de1555cb914/cpp/cmake/CodeChecker.cmake#L42-L43 This code had been in the `wholegraph` repo for a while... it was added in June 2023 (rapidsai/wholegraph#24) and then never modified again. ### Benefits of doing this? Similar to #102, I'm putting up PRs like this because I'm planning to attempt to add `libwholegraph` wheels, and want to simplify the `wholegraph` / `pylibwholegraph` CMake as much as possible before doing that, to reduce the implementation and reviewing effort. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - https://github.com/linhu-nv URL: #103
The nightly tests have been failing despite all runs succeeding because the workflow's logic for filtering out notebook runs is invalid. Examples: https://github.com/rapidsai/cugraph-gnn/actions/runs/12649473784, https://github.com/rapidsai/cugraph-gnn/actions/runs/12609514484. Hopefully this change is sufficient to get the nightly suite passing. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - James Lamb (https://github.com/jameslamb) URL: #105
The pull-request input is simply wrong, while the other inputs are necessary to pull the correct artifacts. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #106
Contributes to rapidsai/build-planning#127 This PR cannot be merged unless nightly CI has passed within the past 7 days, so if it remains unmerged that will itself be an indication that nightly CI needs fixing. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - James Lamb (https://github.com/jameslamb) URL: #100
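The "nightly CI has passed within the past 7 days" rule above can be illustrated with a small check. This is only a sketch of the idea, not the actual RAPIDS check: the function name `nightly_passed_recently`, the `(finished_at, conclusion)` tuple shape, and the `"success"` string are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple

# Hypothetical record shape: (time the nightly run finished, its conclusion).
NightlyRun = Tuple[datetime, str]


def nightly_passed_recently(
    runs: Iterable[NightlyRun],
    now: datetime,
    max_age_days: int = 7,
) -> bool:
    """Return True if any run succeeded within the last `max_age_days` days."""
    cutoff = now - timedelta(days=max_age_days)
    return any(
        conclusion == "success" and finished_at >= cutoff
        for finished_at, conclusion in runs
    )


if __name__ == "__main__":
    now = datetime(2025, 1, 15, tzinfo=timezone.utc)
    runs = [
        (datetime(2025, 1, 1, tzinfo=timezone.utc), "success"),   # too old
        (datetime(2025, 1, 14, tzinfo=timezone.utc), "failure"),  # recent, but failed
    ]
    # No recent success, so a PR gated on this check would stay blocked.
    print(nightly_passed_recently(runs, now))
```

With a gate like this, an old success does not count and a recent failure does not count, so a PR that stays blocked is itself a signal that nightly CI needs attention.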
Removes the build directory from `cugraph-pyg` which should not have been committed. Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - James Lamb (https://github.com/jameslamb) - Tingyu Wang (https://github.com/tingyu66) URL: #107
Allows sampling of heterogeneous graphs. Removes unbuffered sampling from the PyG examples and completely disables it in DGL. A future PR will completely drop PyG support for unbuffered sampling, and a future `cugraph` PR will drop support for unbuffered sampling in the distributed sampler. Merge after rapidsai/cugraph#4795 Closes rapidsai/cugraph#4402 Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - Tingyu Wang (https://github.com/tingyu66) - James Lamb (https://github.com/jameslamb) URL: #82
Proposes some simplifications for `wholegraph` CMake: * removes code adding timing information via `RULE_LAUNCH_COMPILE` and `RULE_LAUNCH_LINK` - *these are internal to `ctest`, per https://cmake.org/cmake/help/latest/prop_dir/RULE_LAUNCH_LINK.html* * removes `find_package(Python)` and related code in `pylibwholegraph` - *this is already handled by `rapids_cython_init()`: https://github.com/rapidsai/cugraph-gnn/blob/87455cfedcc6721f24c783ba555af14a9a180624/python/pylibwholegraph/CMakeLists.txt#L119-L120* ## Notes for Reviewers ### Benefits of doing this? Similar to #102 and #103, I'm putting up PRs like this because I'm planning to attempt to add libwholegraph wheels, and want to simplify the wholegraph / pylibwholegraph CMake as much as possible before doing that, to reduce the implementation and reviewing effort. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - https://github.com/linhu-nv URL: #109
Address #81 Authors: - Tingyu Wang (https://github.com/tingyu66) - James Lamb (https://github.com/jameslamb) - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) - Bradley Dice (https://github.com/bdice) - Alex Barghi (https://github.com/alexbarghi-nv) URL: #99
conda-forge is using GCC 13 for CUDA 12 builds. This PR updates CUDA 12 conda builds to use GCC 13, for alignment. These PRs should be merged in a specific order, see rapidsai/build-planning#129 for details. Authors: - Bradley Dice (https://github.com/bdice) - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #108
#111) Part of rapidsai/build-planning#136, which tracks some building/packaging simplifications and conventions we'd like to standardize across RAPIDS. This proposes the following: * using `cmake-format` to autoformat CMake code * using `cmake-lint` to enforce style preferences for CMake code * removing unnecessary use of `-DDETECT_CONDA_ENV` for wheel builds * explicitly passing package type to GitHub Actions / `gha-tools` things handling wheels ## Notes for Reviewers The `cmake-format` / `cmake-lint` approach was copied directly from RAFT: * https://github.com/rapidsai/raft/blob/596d4b7338e62a92652503cd76feaeaa187ad740/.pre-commit-config.yaml#L52 * https://github.com/rapidsai/raft/blob/596d4b7338e62a92652503cd76feaeaa187ad740/cpp/cmake/config.json * https://github.com/rapidsai/raft/blob/596d4b7338e62a92652503cd76feaeaa187ad740/cpp/scripts/run-cmake-format.sh Other RAPIDS projects ([like cuDF](https://github.com/rapidsai/cudf/blob/1f0f51f96b79edd820e81343ca521c684b1f4918/.pre-commit-config.yaml#L97)) do this the same way. All formatting-only changes to CMake in this PR were made automatically by `cmake-format`. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Gil Forsyth (https://github.com/gforsyth) - https://github.com/linhu-nv URL: #111
Contributes to rapidsai/build-planning#138 Updates to using UCX 1.18 in pip devcontainers here. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - https://github.com/jakirkham URL: #112
Adds support for PyG 2.6 in cuGraph-PyG. The primary change is updating the examples so they fully specify all tensors, since partial specification is no longer allowed. Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - Jake Awe (https://github.com/AyodeAwe) - Tingyu Wang (https://github.com/tingyu66) URL: #114
This PR uses CUDA 12.8.0 to build and test. xref: rapidsai/build-planning#139 Authors: - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) URL: #115
This PR points the shared workflow branches back to the default 25.02 branches. xref: rapidsai/build-planning#139
Adds a heterogeneous link prediction example for cuGraph-PyG that uses the Taobao dataset. Loosely based on the Taobao example from the PyG repository. Adds ability to specify fanout as a dictionary to better align with PyG API. Fixes a bug where the number of negative samples was calculated incorrectly, causing additional unwanted negative samples to be generated. Updates the negative sampling call to match the new behavior added in rapidsai/cugraph#4885 Merge after rapidsai/cugraph#4898 Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - Tingyu Wang (https://github.com/tingyu66) - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #104
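To illustrate what "fanout as a dictionary" means for a heterogeneous graph, here is a minimal sketch that expands either form into a per-edge-type mapping. It is not the cuGraph-PyG implementation: `normalize_fanout` and the example edge types are hypothetical names chosen for illustration. It assumes the PyG convention that a single list applies to every edge type, while a dictionary gives one list per `(source, relation, destination)` edge type.

```python
from typing import Dict, List, Sequence, Tuple, Union

# PyG-style edge type: (source node type, relation, destination node type).
EdgeType = Tuple[str, str, str]
Fanout = Union[Sequence[int], Dict[EdgeType, Sequence[int]]]


def normalize_fanout(
    fanout: Fanout, edge_types: Sequence[EdgeType]
) -> Dict[EdgeType, List[int]]:
    """Expand a list or dict fanout into one per-hop list for each edge type."""
    if isinstance(fanout, dict):
        missing = [et for et in edge_types if et not in fanout]
        if missing:
            raise KeyError(f"fanout is missing edge types: {missing}")
        # Every edge type must describe the same number of hops.
        hop_counts = {len(fanout[et]) for et in edge_types}
        if len(hop_counts) > 1:
            raise ValueError("all edge types must specify the same number of hops")
        return {et: list(fanout[et]) for et in edge_types}
    # A plain list is broadcast to every edge type.
    return {et: list(fanout) for et in edge_types}


if __name__ == "__main__":
    # Illustrative edge types only; not taken from the example in this PR.
    edge_types = [("user", "to", "item"), ("item", "rev_to", "user")]
    print(normalize_fanout([8, 4], edge_types))
    print(normalize_fanout({edge_types[0]: [8, 4], edge_types[1]: [2, 2]}, edge_types))
```

The dictionary form lets each relation use its own neighbor budget per hop, which a single shared list cannot express.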
Adds an example for MNMG PyTorch/NCCL renumbering. Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - Tingyu Wang (https://github.com/tingyu66) URL: #101
Now that all features supported by the Dask API are available in the new API, we are deprecating the Dask API. It will be removed in release 25.06. Merge after #104 Closes #86 Authors: - Alex Barghi (https://github.com/alexbarghi-nv) Approvers: - Tingyu Wang (https://github.com/tingyu66) URL: #118
Quick fix to the `create_node_classification` function: use the `data_and_label` dict as the parameter instead of `pickle_data_path`. Authors: - https://github.com/linhu-nv Approvers: - Alex Barghi (https://github.com/alexbarghi-nv) URL: #128
Uses a retry wrapper for `pip` commands to try to alleviate CI failures due to hash mismatches that result from network hiccups xref rapidsai/build-planning#148 This will retry failures that show up in CI like: ``` Collecting nvidia-cublas-cu12 (from libraft-cu12==25.2.*,>=0.0.0a0) Downloading https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.3.14-py3-none-manylinux_2_27_aarch64.whl (604.9 MB) ━━━━━━━━━━━━━━━━━━━━━ 350.2/604.9 MB 229.2 MB/s eta 0:00:02 ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them. nvidia-cublas-cu12 from https://pypi.nvidia.com/nvidia-cublas-cu12/nvidia_cublas_cu12-12.8.3.14-py3-none-manylinux_2_27_aarch64.whl#sha256=93a4e0e386cc7f6e56c822531396de8170ed17068a1e18f987574895044cd8c3 (from libraft-cu12==25.2.*,>=0.0.0a0): Expected sha256 93a4e0e386cc7f6e56c822531396de8170ed17068a1e18f987574895044cd8c3 Got 849c88d155cb4b4a3fdfebff9270fb367c58370b4243a2bdbcb1b9e7e940b7be ``` Authors: - Gil Forsyth (https://github.com/gforsyth) Approvers: - Mike Sarahan (https://github.com/msarahan) - Bradley Dice (https://github.com/bdice) URL: #133
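The general shape of a retry wrapper like the one described above can be sketched as follows. This is an illustration of the technique only, not the wrapper used in RAPIDS CI: `run_with_retries`, its parameters, and the fixed delay between attempts are all assumptions. The `runner` and `sleep` arguments are injectable so the logic can be exercised without actually invoking `pip`.

```python
import subprocess
import time
from typing import Callable, Sequence


def _default_runner(cmd: Sequence[str]) -> int:
    """Run the command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def run_with_retries(
    cmd: Sequence[str],
    max_attempts: int = 3,
    delay_seconds: float = 5.0,
    runner: Callable[[Sequence[str]], int] = _default_runner,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-run `cmd` until it exits with 0 or `max_attempts` is exhausted.

    Returns the exit code of the last attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    returncode = 1
    for attempt in range(1, max_attempts + 1):
        returncode = runner(cmd)
        if returncode == 0:
            return 0
        # Transient failures (e.g. a truncated download that fails the hash
        # check) may succeed on a later attempt, so pause and try again.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return returncode
```

A call such as `run_with_retries(["python", "-m", "pip", "install", "some-package"])` would then re-attempt the install after a failure. Note that this sketch retries on any non-zero exit code; a real wrapper might inspect the output and retry only on errors that look network-related.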
linhu-nv reviewed Feb 12, 2025
seems good to me. thx
linhu-nv approved these changes Feb 12, 2025
❄️ Code freeze for branch-25.02 and v25.02 release
What does this mean?
Only critical/hotfix level issues should be merged into branch-25.02 until release (merging of this PR).
What is the purpose of this PR?
Merge branch-25.02 into main for the release