GPU compute API abstraction on top of WebGPU #3238
clang-tidy made some suggestions
There were too many comments to post at once. Showing the first 25 out of 83. Check the log or trigger a new build to see more.
clang-tidy made some suggestions
There were too many comments to post at once. Showing the first 25 out of 58. Check the log or trigger a new build to see more.
clang-tidy made some suggestions
There were too many comments to post at once. Showing the first 25 out of 33. Check the log or trigger a new build to see more.
This is a generic GPGPU compute abstraction built on top of WebGPU. To run operations on the GPU, shaders need to be written in the Slang programming language. See https://dawn.googlesource.com/dawn and https://shader-slang.org/
DOWNLOAD_EXTRACT_TIMESTAMP is only available on CMake >=3.24. This change makes CMake ignore the option on older versions.
We also set CMP0135 policy behaviour to NEW to fix warnings.
We no longer require this. If the user of the API wants to write arbitrary data into a GPU buffer, they construct a span object and use the other overload with `tcb::span`.
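To illustrate the call-site pattern described here, the following is a minimal sketch only. `ConstSpan`, `FakeBuffer`, `write_buffer`, and `Params` are hypothetical stand-ins invented for this example (the real code uses `tcb::span` and a GPU buffer type whose exact API is not shown in this PR text); host memory stands in for the GPU buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical minimal stand-in for tcb::span<const T>: a pointer plus an element count.
template <typename T>
struct ConstSpan {
    const T* ptr;
    std::size_t count;
    std::size_t size_bytes() const { return count * sizeof(T); }
};

// Hypothetical stand-in for a GPU buffer; plain host memory keeps the sketch testable.
struct FakeBuffer {
    std::vector<unsigned char> bytes;
};

// Hypothetical span-based write overload: copies the viewed elements into the buffer.
template <typename T>
void write_buffer(FakeBuffer& buffer, ConstSpan<T> data, std::size_t byteOffset = 0) {
    assert(byteOffset + data.size_bytes() <= buffer.bytes.size());
    std::memcpy(buffer.bytes.data() + byteOffset, data.ptr, data.size_bytes());
}

// Example of "arbitrary data": a trivially copyable parameter block (hypothetical layout).
struct Params {
    float scale;
    unsigned int width;
    unsigned int height;
    float padding;
};
```

A caller with a single `Params p` would wrap it as `ConstSpan<Params>{&p, 1}` and pass that to `write_buffer`, rather than relying on a dedicated raw-pointer overload.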
Include device limits in DeviceInfo and enforce 4-byte alignment for buffer write offsets and sizes. Validate uniform buffer offsets against minUniformBufferOffsetAlignment.
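The checks described in this commit could look roughly like the sketch below. It is not the code from this PR: `DeviceLimits`, `validate_buffer_write`, and `validate_uniform_offset` are hypothetical names, and only the field name `minUniformBufferOffsetAlignment` and the 4-byte rule come from the commit message and WebGPU itself (256 is the WebGPU default for that limit).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical subset of the device limits; the field name mirrors the WebGPU limit.
struct DeviceLimits {
    std::uint32_t minUniformBufferOffsetAlignment = 256; // WebGPU default value
};

// WebGPU requires buffer write offsets and sizes to be multiples of 4 bytes.
constexpr std::uint64_t kBufferWriteAlignment = 4;

constexpr bool is_aligned(std::uint64_t value, std::uint64_t alignment) {
    return alignment != 0 && value % alignment == 0;
}

// Throws if a buffer write would violate the 4-byte rule for offset or size.
inline void validate_buffer_write(std::uint64_t byteOffset, std::uint64_t byteSize) {
    if (!is_aligned(byteOffset, kBufferWriteAlignment))
        throw std::invalid_argument("buffer write offset " + std::to_string(byteOffset) +
                                    " is not a multiple of 4 bytes");
    if (!is_aligned(byteSize, kBufferWriteAlignment))
        throw std::invalid_argument("buffer write size " + std::to_string(byteSize) +
                                    " is not a multiple of 4 bytes");
}

// Throws if a uniform buffer binding offset does not respect the device limit.
inline void validate_uniform_offset(std::uint64_t byteOffset, const DeviceLimits& limits) {
    assert(limits.minUniformBufferOffsetAlignment != 0);
    if (!is_aligned(byteOffset, limits.minUniformBufferOffsetAlignment))
        throw std::invalid_argument("uniform buffer offset " + std::to_string(byteOffset) +
                                    " is not a multiple of " +
                                    std::to_string(limits.minUniformBufferOffsetAlignment));
}
```

Failing early on the host side gives a clearer error than waiting for a WebGPU validation error at submission time.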
Previously we looked up the requested entry point of a Slang shader by name, but then assumed entry point index 0 when:
- extracting WGSL and computing the shader-cache key
- reflecting bindings and compute workgroup size from ProgramLayout

With multiple entry points in the module, kernels could silently compile/cache WGSL for the wrong entry point or reflect bindings/workgroup size from the wrong entry point.

This fixes the problem by selecting the intended entry point from the linked slang::ProgramLayout by matching nameOverride/name, then using its index for WGSL extraction and hashing.

Also remove the TODOs regarding supporting multiple entry points. Our shader compilation requires that a given WebGPU kernel is tied to a single entry point, so supporting multiple entry points is not needed.
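The selection logic described above can be sketched independently of the Slang API. In the sketch below, `EntryPointInfo`, `find_entry_point_index`, and `make_cache_key` are hypothetical names; the vector stands in for the entry points reflected from the linked `slang::ProgramLayout`, and the key format is purely illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one reflected entry point of a linked program.
struct EntryPointInfo {
    std::string name;         // name as written in the shader source
    std::string nameOverride; // optional override; empty if none was set
};

// Returns the index of the entry point whose override or name matches the request,
// instead of assuming index 0.
inline std::optional<std::size_t> find_entry_point_index(
    const std::vector<EntryPointInfo>& entryPoints, const std::string& requested) {
    for (std::size_t i = 0; i < entryPoints.size(); ++i) {
        const EntryPointInfo& ep = entryPoints[i];
        const bool overrideMatches = !ep.nameOverride.empty() && ep.nameOverride == requested;
        if (overrideMatches || ep.name == requested)
            return i;
    }
    return std::nullopt;
}

// Illustrative cache key: mixing in the selected entry point keeps two kernels
// built from the same module from sharing a cache entry.
inline std::string make_cache_key(const std::string& sourceHash,
                                  const std::vector<EntryPointInfo>& entryPoints,
                                  std::size_t index) {
    assert(index < entryPoints.size());
    return sourceHash + ":" + entryPoints[index].name + ":" + std::to_string(index);
}
```

The resolved index would then be used consistently for WGSL extraction, hashing, and reflection, so all three refer to the same entry point.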
This reverts commit 0428640.
Co-authored-by: Robert Smith <[email protected]>
This solution doesn't seem to work in all cases. For example, it doesn't guarantee successfully creating the Vulkan-backed WebGPU instance in our CI tests. For now, we remove this and we'll investigate a more appropriate solution later.
I've now addressed all feedback in this PR. I've also removed all TODOs in the code by fixing the issues, except in the case of obtaining a Vulkan-backed WebGPU instance for MSYS2 environments (in which case I've just deleted the logic to manually copy the Vulkan DLL). As mentioned earlier, this is not blocking, and we can investigate later what can be done about it.
In the meeting today we discussed gating this PR (and future PRs related to GPU work) behind a new CMake configuration option
@MRtrix3/mrtrix3-devs Unless there are any further comments regarding this PR, I will merge this tomorrow.
This PR supersedes #3096.
It introduces a new GPU compute abstraction built on top of WebGPU. See #3096 for general motivation and design philosophy. A notable change from that PR is that shaders are now required to be written in Slang, a new programming language created by NVIDIA (now under the umbrella of the Khronos Group). This choice has been motivated by the fact that Slang provides many useful features, such as modules and generics, for better code reusability and modularisation.
Some issues still need to be resolved (in future PRs):
- `MR::Image` instances to/from GPU (currently the API only supports `MR::Image<float>`).

One additional thing that this PR introduces is the addition of a new `tcb::span` class (see #3219); however, unlike for other third-party dependencies, the class has been directly added to the codebase rather than fetched via CMake. The hope is that this class will become redundant once we have access to C++20's `std::span`.