Follow-up backlog for the CUDA target/runtime work introduced by PR #1.
The current PR adds the CUDA target, native CUDA runtime module support, CUDA-aware NTT runtime primitives, CUDA kernel tests, and a dedicated Linux CUDA CI job. The items below are intentionally tracked outside the PR so they do not block the initial CUDA support merge.
Backlog
Register and maintain a dedicated self-hosted CUDA runner pool so the CUDA job can run as a required PR check.
Expand CUDA CI from the focused kernel coverage toward the full CUDA kernel test class once runner capacity and runtime stability are sufficient.
Add matrix coverage for additional CUDA architectures or toolkit versions if compatibility across GPU generations becomes a release requirement.
Continue reducing duplicated CPU/CUDA codegen and runtime paths as the NTT target abstraction stabilizes.
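For the architecture/toolkit matrix item, a minimal sketch of how the job combinations could be generated, assuming the workflow consumes a JSON matrix through a job output and `fromJSON`. The key names `cuda_arch` and `cuda_toolkit` and the helper names are hypothetical; none of them come from the PR.

```python
import itertools
import json


def build_matrix(architectures, toolkits, exclude=()):
    """Build a GitHub Actions style matrix of (architecture, toolkit) pairs.

    `exclude` lists (architecture, toolkit) pairs to skip, for example a
    toolkit release that cannot target a given GPU generation.
    The key names below are hypothetical, not taken from the PR.
    """
    excluded = set(exclude)
    include = [
        {"cuda_arch": arch, "cuda_toolkit": toolkit}
        for arch, toolkit in itertools.product(architectures, toolkits)
        if (arch, toolkit) not in excluded
    ]
    return {"include": include}


def to_actions_output(matrix):
    """Format the matrix as a `name=value` line for the $GITHUB_OUTPUT file."""
    return "matrix=" + json.dumps(matrix, separators=(",", ":"))
```

A setup job would append the result of `to_actions_output(...)` to the file named by `GITHUB_OUTPUT`, and the CUDA job would read it with `fromJSON`. Keeping the list in one place lets the matrix stay small until cross-generation compatibility actually becomes a release requirement.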
Notes
Full UnitTestCUDAKernels execution requires a Linux runner with an NVIDIA GPU, CUDA toolkit, nvcc, clang/clang++, and the labels self-hosted, linux, x64, cuda.
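The runner requirements above could be verified with a small preflight check before the CUDA job starts. This is a sketch under stated assumptions: the tool and label lists are copied from the note, while the function names are hypothetical and GPU or driver presence is not checked here.

```python
import shutil

# Executables and runner labels named in the note above.
REQUIRED_TOOLS = ("nvcc", "clang", "clang++")
REQUIRED_LABELS = frozenset({"self-hosted", "linux", "x64", "cuda"})


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH, in the given order.

    `which` is injectable so the check can be exercised without a CUDA host.
    """
    return [tool for tool in tools if which(tool) is None]


def missing_labels(runner_labels, required=REQUIRED_LABELS):
    """Return the required labels the runner does not carry, sorted."""
    return sorted(required - set(runner_labels))
```

Calling `missing_tools()` on a candidate runner returns an empty list when `nvcc`, `clang`, and `clang++` are all on `PATH`. A non-empty result from either function is a reason to fail the job early rather than partway through the kernel tests.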
[...] 120, matching the local validation environment used for PR #1 (Add CUDA Target, Runtime, and Kernel CI Support).