Remove CUDA 11.0/11.1 support and upgrade to CCCL 2.3.2+ #1155

Open
wants to merge 1 commit into base: master

Conversation

ptheywood
Member

ptheywood commented Nov 28, 2023

Remove support for CUDA 11.0 and 11.1 and switch from Thrust to CCCL >= 2.3.2

  • Does not remove all code/workarounds/checks related to 11.0 and 11.1, in case the previously encountered compiler issues re-emerge
  • Fixes some typos in nearby code
  • CCCL >= 2.3.2 is required for CMake and MSVC fixes which are not present in earlier CCCL releases
  • Removes known issues for CUDA <= 11.1 from the readme

Closes #1021


CUDA 12.5 is the first CUDA release documented as shipping with CCCL >= 2.3.2, so this must be tested with 11.x, <= 12.4 and >= 12.5 (12.3 may also need checking, as 12.4.1 on linux includes 2.3.2).

  • Run tests on Linux CUDA 11.2
  • Run tests on Linux CUDA 12.3
  • Run tests on Linux CUDA 12.4.1 (ships with 2.3.2, but documented as being 2.3.1)
  • Run tests on Linux CUDA 12.5
  • Run tests on Linux CUDA 12.8
  • Run tests on Windows CUDA 11.x (VS 2019, or older 2022 release)
  • Run tests on Windows CUDA 12.4.0 (VS 2022, ships with 2.3.1)
  • Run tests on Windows CUDA 12.5+ (VS 2022)

(Edited to reflect current status of the PR)


@ptheywood
Member Author

using CCCL with 11.0 is DOA:

For instance, CCCL requires a minimum supported version of 11.1 from the 11.x series due to an unavoidable compiler issue present in CTK 11.0.

To be honest, I don't strictly mind dropping 11.0 support (or 11.1, so we can just rely on 11.2+'s stability promises).

I could make 11.0 work by using old cub/thrust from their respective locations, but eventually that will break. Will discuss this before I do anything else.

@Robadob
Member

Robadob commented Nov 28, 2023

so we can just rely on 11.2 +'s stability promises

This seems like the best plan, assuming you're not aware of any HPC with only super old CUDA available.

@ptheywood
Member Author

so we can just rely on 11.2 +'s stability promises

This seems like the best plan, assuming you're not aware of any HPC with only super old CUDA available.

Bessemer's central install is 11.0 or 11.1 iirc, but the driver is 12.2/3 compatible, so we can always grab the toolkit from conda (assuming cmake agrees...) (or open a ticket).
Stanage has at least one 12.x release, as does Bede. Unsure on JADE, but I'd be surprised/worried if the driver wasn't 12.x compatible.

Google colab is 11.2+ as well (the reason we were producing 11.0 wheels previously).

11.2 was released December 2020, 11.1 September 2020, and 11.0 June 2020.


Turns out 11.0 on linux compiles fine too; it's just windows where it doesn't, so even if we don't "support" it, it currently works.

So it could be "11.2+ is supported, 11.0 & 11.1 may work under linux, but are not supported".

@Robadob
Member

Robadob commented Nov 28, 2023

Your choice tbh, you manage it. 11.2+ only feels simpler.

@ptheywood
Member Author

During the meeting we concluded that dropping CUDA 11.0 support in favour of future CCCL support is worthwhile, and that CUDA 11.2 is old enough to be a minimum.

I'll adjust CI to reflect this and test windows on 11.1+ at some point.

We don't need to drop 11.1, but it's probably simpler to just say 11.2+, as then our python wheels will be consistent with what we support.

@ptheywood changed the title from "Switch to CCCL 2.2.0+ from Thrust/Cub 1.x" to "Remove CUDA 11.0/11.1 support and upgrade to CCCL 2.2.0+" on Dec 1, 2023
@ptheywood force-pushed the cccl branch 3 times, most recently from df8f583 to e947497 on December 1, 2023 15:16
@ptheywood
Member Author

Looks like the previous windows cuda 11.0 CI errors were actually a github actions windows-2019 vs windows-2022 difference.

i.e. visual studio 2019 issue?

2023-12-01T16:10:13.7088440Z ##[error]     1>D:\a\FLAMEGPU2\FLAMEGPU2\build\_deps\cccl-src\libcudacxx\include\cuda\std\detail\libcxx\include\__type_traits/is_constant_evaluated.h(27): error : identifier "__builtin_is_constant_evaluated" is undefined [D:\a\FLAMEGPU2\FLAMEGPU2\build\FLAMEGPU\flamegpu.vcxproj]

@Robadob
Member

Robadob commented Dec 1, 2023

i.e. visual studio 2019 issue?

I have that on my home desktop, can try fighting it at some point. Or we just sack off VS2019, given it doesn't build the current jitify2-preprocess branch either.

@ptheywood
Member Author

It's still supported by CUDA, and libcu++/CCCL claim to support the same platforms as CUDA, so afaik it should work / be supported.
I had a quick look into the error message but didn't find anyone else with this particular error.

I think I've got vs 2019 installed too, just need to spend the time in windows at some point.


Looking at the full draft-release CI log, there is a windows-2019 CUDA 11.8 job which did pass, so it's an older CUDA + vs2019 issue when doing things with libcu++'s type_traits (cub should be our only inclusion of libcu++ headers currently).

Might be worth trying cccl on its own to see if we can repro it on vs2019 as well when testing locally (if the main issue can be repro'd locally, otherwise it'll be CI fun).

@ptheywood
Member Author

I do have visual studio 2019 installed, but only cuda 11.7 and 12.0 on windows currently.

Can select the cuda toolkit via -T from the command line, e.g. using 11.7 with vs 2019:

$ cmake .. -A x64 -G "Visual Studio 16 2019" -T cuda=11.7 -DCMAKE_CUDA_ARCHITECTURES=86 -DFLAMEGPU_BUILD_TESTS=ON
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.22621.
... 
-- The CXX compiler identification is MSVC 19.29.30139.0
... 
-- Looking for a CUDA compiler - C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.7/bin/nvcc.exe
...

And building with:

$ cmake --build . --target flamegpu tests -j 8 --config Release

This correctly configured and built the current state of the CCCL branch (though an env var may be needed for RTC if the cuda on the path is older than the one selected).

Installing 11.2 to try and repro the issue, but I'm unlikely to actually dig into any failures this evening if it does repro the error.

@ptheywood
Member Author

11.2 + visual studio 2019 has reproduced the error locally.

A quick attempt to configure cccl standalone failed as I'm missing a test suite dependency.

@ptheywood
Member Author

Reproduced the error using the CCCL example locally with visual studio 2019.

git clone git@github.com:nvidia/CCCL
cd CCCL/examples/example_project
mkdir build
cd build
cmake .. -A x64 -G "Visual Studio 16 2019" -T cuda=11.2 -DCMAKE_CUDA_ARCHITECTURES=86
cmake --build .
$ cmake --build .
Microsoft (R) Build Engine version 16.11.2+f32259642 for .NET Framework
Copyright (C) Microsoft Corporation. All rights reserved.

  Compiling CUDA source file ..\example.cu...

  C:\Users\ptheywood\code\cccl\examples\example_project\build>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2
  \bin\nvcc.exe" -gencode=arch=compute_86,code=\"compute_86,compute_86\" -gencode=arch=compute_86,code=\"sm_86,compute_
  86\" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\
  bin\HostX64\x64" -x cu   -I"C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\thrust\thrust\
  cmake\..\.." -I"C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\libcudacxx\lib\cmake\libcu
  dacxx\..\..\..\include" -I"C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\cub\cub\cmake\.
  .\.." -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\include"     --keep-dir x64\Debug  -maxrregcount=0
   --machine 64 --compile -cudart static -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -DTHRUST_HOST_SYSTEM=THRUST_HOST_SY
  STEM_CPP -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\
  "" -Xcompiler "/EHsc /W3 /nologo /Od /Fdexample_project.dir\Debug\vc142.pdb /FS /Zi /RTC1 /MDd /GR" -o example_projec
  t.dir\Debug\example.obj "C:\Users\ptheywood\code\cccl\examples\example_project\example.cu"
C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\libcudacxx\include\cuda\std\detail\libcxx\in
clude\__type_traits/is_constant_evaluated.h(31): error : identifier "__builtin_is_constant_evaluated" is undefined [C:\
Users\ptheywood\code\cccl\examples\example_project\build\example_project.vcxproj]

C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\libcudacxx\include\cuda\std\detail\libcxx\in
clude\__type_traits/is_constant_evaluated.h(36): error : identifier "__builtin_is_constant_evaluated" is undefined [C:\
Users\ptheywood\code\cccl\examples\example_project\build\example_project.vcxproj]

  2 errors detected in the compilation of "C:/Users/ptheywood/code/cccl/examples/example_project/example.cu".
  example.cu
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 11.2.t
argets(785,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin\nvcc.exe" -gen
code=arch=compute_86,code=\"compute_86,compute_86\" -gencode=arch=compute_86,code=\"sm_86,compute_86\" --use-local-env
-ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64" -x cu
  -I"C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\thrust\thrust\cmake\..\.." -I"C:\Users\
ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\libcudacxx\lib\cmake\libcudacxx\..\..\..\include" -I"
C:\Users\ptheywood\code\cccl\examples\example_project\build\_deps\cccl-src\cub\cub\cmake\..\.." -I"C:\Program Files\NVI
DIA GPU Computing Toolkit\CUDA\v11.2\include"     --keep-dir x64\Debug  -maxrregcount=0  --machine 64 --compile -cudart
 static -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTHRUST_DEVICE_SYSTEM=T
HRUST_DEVICE_SYSTEM_CUDA -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W3 /nologo /O
d /Fdexample_project.dir\Debug\vc142.pdb /FS /Zi /RTC1 /MDd /GR" -o example_project.dir\Debug\example.obj "C:\Users\pth
eywood\code\cccl\examples\example_project\example.cu"" exited with code 1. [C:\Users\ptheywood\code\cccl\examples\examp
le_project\build\example_project.vcxproj]

I'll set up a quick reproducer CI repo to pin down the affected CUDA versions and report it upstream.

@ptheywood
Member Author

CUDA 12.3 + vs 2022 tests pass.

A CI sweep is pinning down the vs2019 + CUDA 11.x versions which exhibit the libcu++ compilation error; I'll report upstream tomorrow once the versions are known. 11.7 works, 11.2 doesn't.

CCCL/libcu++ includes some MSVC version conditions around using the offending symbol for older vs 2019 versions + other combos, so a more recent vs 2019 sub-update might be relevant (locally I have 1929; 1924 was the previous version with a workaround).

https://github.com/ptheywood/cccl-is-constant-evaluated-mwe/actions/runs/7105395494/job/19342497821

@ptheywood
Member Author

CUDA 11.3+ is fine with visual studio 2019, so it's just 11.2 (and 11.1) which breaks for us. This would prevent us from producing 11.2 wheels.

We can't drop 11.2 yet, as it's the version installed on Google Colab iirc (and I'd rather not either).

I've reported this upstream: NVIDIA/cccl#1179

@ptheywood
Member Author

Upstream has a PR in to fix this.

The simplest way to incorporate this into the cmake logic would be to just make our minimum the next release post merge, but I'm not sure when that would be.
If we want it sooner we could pin against the post-merge hash before a tag is available, and just adjust our min version to what it should be.
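
A rough sketch of the pin-against-a-hash option (the hash shown is a placeholder, not the real post-merge commit, and FLAMEGPU's actual CMake wraps its fetch logic differently):

# Sketch only: fetch CCCL pinned to a specific commit rather than a release tag.
include(FetchContent)
FetchContent_Declare(
    cccl
    GIT_REPOSITORY https://github.com/NVIDIA/cccl.git
    GIT_TAG        <post-merge-hash> # placeholder commit hash
)
FetchContent_MakeAvailable(cccl)
# Consumers then link against the imported CCCL::CCCL target.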

Additionally, if we depend on something newer than 2.2.0, CCCL treats its own headers as system headers even when not using isystem, which will be good for our relatively strong warning levels.

@ptheywood
Member Author

CCCL 2.3.0 has been released on github: https://github.com/NVIDIA/cccl/releases/tag/v2.3.0

This should include the fixes we require, so making our minimum CCCL 2.3.0 and fetching a newer version if a suitable one is not found should be OK, but it's worth checking that both required fixes made it into this release.
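
A minimal sketch of that find-or-fetch behaviour (the MIN_CCCL_VERSION variable name is illustrative and FLAMEGPU's real CMake module is more involved):

# Sketch: prefer a new-enough CCCL found via its CMake config package, otherwise fetch it.
set(MIN_CCCL_VERSION "2.3.0")
find_package(CCCL ${MIN_CCCL_VERSION} CONFIG QUIET)
if(NOT CCCL_FOUND)
    include(FetchContent)
    FetchContent_Declare(
        cccl
        GIT_REPOSITORY https://github.com/NVIDIA/cccl.git
        GIT_TAG        v${MIN_CCCL_VERSION}
    )
    FetchContent_MakeAvailable(cccl)
endif()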

@ptheywood
Member Author

The v2.3.0 tagged commit does not include the cmake fix or msvc fixes, although they were backported to the branch/v2.3.x branch.

There's a v2.3.1 tagged commit which also does not include these fixes, so presumably we need to wait for 2.3.2 or 2.4.0.

We'll probably just need to keep the first find_package(COMPONENTS) workaround in place anyway, just to avoid any problems, as CUDA 12.3 includes CCCL 2.2, for instance.

@ptheywood
Member Author

CUDA 12.4 has been released, which includes CCCL 2.3.1 according to the release notes.
Checking /usr/local/cuda-12.4/lib64/cmake/cccl/cccl-config.cmake, this still does not contain the cmake fix we require (i.e. it lines up with the v2.3.1 tag on github), so we still need to wait for 2.3.2 or 2.4 as our minimum CCCL that will work on windows and not cause errors when re-finding in CMake.

    if (TARGET Thrust::Thrust AND NOT CCCL::Thrust)

@ptheywood
Member Author

ptheywood commented Mar 13, 2024

CCCL v2.3.2 has just been tagged / released, which does include the 2 fixes we need, so it should now be possible to switch to this / this shouldn't be blocked any more.

https://github.com/NVIDIA/cccl/releases/tag/v2.3.2

@ptheywood removed the blocked label on Mar 13, 2024
@ptheywood
Member Author

ptheywood commented Jun 24, 2024

CCCL 2.5.0 has been released on github, mostly fixes but also some potentially interesting additions (but not yet safe to use).

There shouldn't be a need to bump our minimum/fetched version to this in the PR though; 2.3.2 should still be fine (unless I've missed something).


This PR is more or less good to go; I just want to re-run windows testing with it requiring 2.3.2 just in case (though I believe it would be fine). And a rebase would probably be worthwhile.

Merging this prior to the next non-pre-release would be best, due to dropping CUDA 11.0/11.1 support.

@ptheywood
Member Author

ptheywood commented Mar 25, 2025

Rebased to include more recent changes / fix conflicts, although the history needs tidying and I should probably check the diff is as it should be.

I did make it a hard 11.2+ check in CMake rather than a flexible "11.0 and 11.1 might work but are no longer tested", as jitify2 on windows with 11.0 and 11.1 hits compiler errors.
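
For illustration, the hard check is conceptually along these lines (a sketch; the actual check and its error message live in FLAMEGPU's CMake modules):

# Sketch: fail configuration outright on CUDA toolkits older than 11.2.
if(CMAKE_CUDA_COMPILER_VERSION VERSION_LESS "11.2")
    message(FATAL_ERROR
        "CUDA >= 11.2 is required, found ${CMAKE_CUDA_COMPILER_VERSION}. "
        "CUDA 11.0 and 11.1 are no longer supported.")
endif()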

Needs re-testing in practice, but a separate CI run (cccl-rebase-2025) passed last week so it compiles after the changes at least.

Given the following CCCL versions which ship with the different CUDA toolkits, and our min CCCL of 2.3.2 (for the cmake/msvc fixes), we should check <= 12.4 and > 12.4.

CUDA CCCL/Thrust
11.8 1.15.1
12.0 2.0.1
12.1 2.0.1
12.2 2.0.1
12.3 2.2.0
12.4 2.3.1 (12.4.1 ships 2.3.2 on linux via apt at least)
12.5 2.4.0
12.6 2.5.0
12.8 2.7.0

@ptheywood changed the title from "Remove CUDA 11.0/11.1 support and upgrade to CCCL 2.2.0+" to "Remove CUDA 11.0/11.1 support and upgrade to CCCL 2.3.2+" on Mar 25, 2025
…>= 2.3.2

- Does not remove all code/workarounds/checks related to 11.0 and 11.1, in case the previously encountered compiler issues re-emerge
- Fixes some typos in nearby code
- CCCL >= 2.3.2 is required for CMake and MSVC fixes which are not present in earlier CCCL releases
- Removes known issues for CUDA <= 11.1 from the readme

Closes #1021
@ptheywood
Member Author

When configuring on windows with VS 2022 and CUDA 12.3-12.6 installed, and explicitly requesting CUDA 12.4 via:

$ cmake  -S . -B build-cu124-vs22 -A x64 -G "Visual Studio 17 2022" -T cuda=12.4 -DCMAKE_CUDA_ARCHITECTURES=86 -DFLAMEGPU_BUILD_TESTS=ON

This correctly finds CUDA 12.4, but is finding thrust/cub/cccl from the 12.6 install.

-- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.4/include (found version "12.4.99")
-- Found Thrust: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/cmake/thrust/thrust-config.cmake (found suitable exact version "2.5.0.0")
-- Found CUB: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/cmake/cub/cub-config.cmake (found suitable version "2.5.0.0", minimum required is "2.5.0.0")
-- Found CCCL: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.6/lib/cmake/cccl/cccl-config.cmake (found suitable version "2.5.0.0", minimum required is "2.3.2")

The same occurs for 12.5, implying it is not just when the minimum version is not found.

  • Unsetting the CUDA_HOME env var which was pointed at 12.6 did not help.
  • Removing 12.6 from my PATH (which was the first CUDA toolkit on my path) made it find the version distributed with 12.5 instead.
  • Specifying NO_DEFAULT_PATH prevents it from finding CCCL with the current hints

This suggests the HINTS values are incorrect on windows.

On windows, for CUDA 12.4, cccl-config.cmake is located at /c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.4/lib/cmake/cccl/cccl-config.cmake, which does not match the provided hints of:

C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.4/include
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.4/lib/x64/cmake

This is because CUDAToolkit_LIBRARY_DIR is C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.4/lib/x64 on windows (vs /usr/local/cuda-12.4/lib64 on linux, which does contain the cmake directory).

Adding ${CUDAToolkit_LIBRARY_DIR}/../cmake as a hint results in the correct behaviour for 12.5, but for 12.4 it still finds the version from the PATH (12.6) as the hinted version is insufficient.
This is probably OK, but might cause issues if a user has CUDA 12.x and 13.x installed (although how I'm specifying the version might actually force CUDA 13.x to fetch the specified version of CCCL anyway...)
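
For reference, a sketch of the resulting hint set with the extra Windows path included (the MIN_CCCL_VERSION variable name is illustrative; FLAMEGPU's actual hint list may differ):

# Sketch: hint find_package(CCCL) at both the linux and windows toolkit layouts.
find_package(CCCL ${MIN_CCCL_VERSION} CONFIG
    HINTS
        "${CUDAToolkit_INCLUDE_DIRS}"           # existing hint (toolkit include dir)
        "${CUDAToolkit_LIBRARY_DIR}/cmake"      # linux: .../lib64/cmake
        "${CUDAToolkit_LIBRARY_DIR}/../cmake"   # windows: lib/x64 -> lib/cmake
)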

@ptheywood marked this pull request as ready for review on March 25, 2025 20:01
@ptheywood
Member Author

I'm confident this is worth reviewing / ready to go now.

I cannot test CUDA < 12.4 on windows though, as my VS2019 install is very broken and I'm not willing to spend the time/effort to fix it (which worst case would require a format based on Rob's past experience with broken VS).

VS2019 CI shows the correct fetching behaviour and it compiles fine, so I'm confident enough to skip manually testing that. VS2022 with CUDA 12.4 and 12.5 behaves as intended and all tests pass.

@ptheywood requested review from Robadob and mondus on March 25, 2025 20:07