@daljit46 daljit46 commented Nov 26, 2025

This PR supersedes #3096.

It introduces a new GPU compute abstraction built on top of WebGPU. See #3096 for the general motivation and design philosophy. A notable change from that PR is that shaders must now be written in Slang, a shading language created by NVIDIA (now hosted under the umbrella of the Khronos Group). This choice is motivated by the fact that Slang provides useful features, such as modules and generics, that improve code reuse and modularisation.

Some issues still need to be resolved (in future PRs):

  • Deal with the issue raised in #3108 (Add option for canonical direct I/O layout).
  • Add more tests, especially for the upload/download of MR::Image instances to/from GPU (currently the API only supports MR::Image<float>).
  • Create a real-world example that illustrates the use of the API.

This PR also adds a new tcb::span class (see #3219). Unlike other third-party dependencies, the class has been added directly to the codebase rather than fetched via CMake. The hope is that it will become redundant once we have access to C++20's std::span.

@daljit46 daljit46 self-assigned this Nov 26, 2025
@daljit46 daljit46 requested review from a team and removed request for a team November 26, 2025 13:02

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

There were too many comments to post at once. Showing the first 25 out of 83. Check the log or trigger a new build to see more.

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

There were too many comments to post at once. Showing the first 25 out of 58. Check the log or trigger a new build to see more.

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

There were too many comments to post at once. Showing the first 25 out of 33. Check the log or trigger a new build to see more.

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@daljit46 daljit46 force-pushed the webgpu branch 2 times, most recently from 4e205c0 to 24b7532 Compare November 27, 2025 14:25
This is a generic GPGPU compute abstraction built on top of WebGPU. To run
operations on the GPU, shaders need to be written using the Slang
programming language.

See https://dawn.googlesource.com/dawn
See https://shader-slang.org/
DOWNLOAD_EXTRACT_TIMESTAMP is only available on CMake >=3.24. This
change makes CMake ignore the option on older versions.
We also set CMP0135 policy behaviour to NEW to fix warnings.
We no longer require this. If the user of the API wants to write
arbitrary data into a GPU buffer, they construct a span object and use
the other overload with `tcb::span`.
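The span-based write described above could look roughly like the sketch below. Everything in it is an assumption for illustration: `Span` is a minimal stand-in for `tcb::span`, `FakeBuffer` is a host-side vector that plays the role of a GPU buffer, and `write_buffer` and `Params` are hypothetical names rather than the API added in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Minimal stand-in for tcb::span (assumption: only pointer + element count are needed here).
template <typename T> struct Span {
  const T *data;
  std::size_t size;
  std::size_t size_bytes() const { return size * sizeof(T); }
};

// Hypothetical host-side substitute for a GPU buffer, so the sketch runs without a device.
struct FakeBuffer {
  std::vector<std::uint8_t> bytes;
};

// Example of "arbitrary data": a small parameter block a kernel might consume.
struct Params {
  float scale;
  std::uint32_t count;
};

// Copies the raw bytes viewed by `source` into `buffer`, starting at `offset_bytes`.
template <typename T> void write_buffer(FakeBuffer &buffer, Span<T> source, std::size_t offset_bytes = 0) {
  static_assert(std::is_trivially_copyable<T>::value, "only trivially copyable types can be copied bytewise");
  if (offset_bytes > buffer.bytes.size() || source.size_bytes() > buffer.bytes.size() - offset_bytes)
    throw std::out_of_range("write would exceed the buffer size");
  if (source.size_bytes() > 0)
    std::memcpy(buffer.bytes.data() + offset_bytes, source.data, source.size_bytes());
}
```

A caller holding a single `Params` value would wrap it as `Span<Params>{&params, 1}` and pass that to `write_buffer`, so no dedicated raw-pointer overload is needed.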

daljit46 and others added 13 commits January 15, 2026 17:48
Include device limits in DeviceInfo and enforce 4-byte alignment
for buffer write offsets and sizes. Validate uniform buffer offsets
against minUniformBufferOffsetAlignment.
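As a rough illustration of the checks this commit describes, the sketch below validates write offsets and sizes against a 4-byte alignment and uniform buffer offsets against `minUniformBufferOffsetAlignment`. The `DeviceLimits` struct and the function names are hypothetical, since the actual `DeviceInfo` layout is not shown in this conversation; the value 256 is WebGPU's default for that limit and a real device may report a different one.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical subset of the limits carried by DeviceInfo (assumption, not the PR's type).
struct DeviceLimits {
  std::uint32_t minUniformBufferOffsetAlignment;
};

// WebGPU requires buffer write offsets and sizes to be multiples of 4 bytes.
constexpr std::uint64_t kBufferWriteAlignment = 4;

inline bool is_aligned(std::uint64_t value, std::uint64_t alignment) {
  return alignment != 0 && value % alignment == 0;
}

// Rejects writes whose offset or size is not a multiple of 4 bytes.
inline void validate_buffer_write(std::uint64_t offset_bytes, std::uint64_t size_bytes) {
  if (!is_aligned(offset_bytes, kBufferWriteAlignment))
    throw std::invalid_argument("write offset must be a multiple of 4 bytes, got " + std::to_string(offset_bytes));
  if (!is_aligned(size_bytes, kBufferWriteAlignment))
    throw std::invalid_argument("write size must be a multiple of 4 bytes, got " + std::to_string(size_bytes));
}

// Rejects uniform buffer offsets that violate the device-reported alignment.
inline void validate_uniform_offset(std::uint64_t offset_bytes, const DeviceLimits &limits) {
  if (!is_aligned(offset_bytes, limits.minUniformBufferOffsetAlignment))
    throw std::invalid_argument("uniform buffer offset must be a multiple of " +
                                std::to_string(limits.minUniformBufferOffsetAlignment) + " bytes");
}
```

Failing early on the host side gives a clearer error than a WebGPU validation error raised later at submission time.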
Previously we looked up the requested entry point of a Slang shader by
name, but then assumed entry point index 0 when:
- extracting WGSL and computing the shader-cache key
- reflecting bindings and compute workgroup size from ProgramLayout

With multiple entry points in the module, kernels could silently
compile/cache WGSL for the wrong entry point or reflect
bindings/workgroup size from the wrong entry point. This fixes the
problem by selecting the intended entry point from the linked
slang::ProgramLayout by matching nameOverride/name, then using its
index for WGSL extraction and hashing.

Also remove the TODOs regarding supporting multiple entry points. Our
shader compilation requires that a given WebGPU kernel be tied to a single
entry point, so supporting multiple entry points is not needed.
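The lookup described in this commit can be sketched independently of the Slang API. In the sketch below, `EntryPointInfo` is a hypothetical stand-in for the per-entry-point reflection data obtained from `slang::ProgramLayout`, and `find_entry_point_index` is an illustrative name; the real code queries Slang's reflection interface instead of a plain vector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for one reflected entry point (assumption, not a Slang type).
struct EntryPointInfo {
  std::string name;          // name as written in the shader source
  std::string name_override; // empty when no override was set
};

// Returns the index of the entry point whose override or name matches `requested`.
// Throws instead of falling back to index 0, so a mismatch cannot go unnoticed.
inline std::size_t find_entry_point_index(const std::vector<EntryPointInfo> &entry_points,
                                          const std::string &requested) {
  for (std::size_t i = 0; i < entry_points.size(); ++i) {
    const EntryPointInfo &ep = entry_points[i];
    const bool override_matches = !ep.name_override.empty() && ep.name_override == requested;
    if (override_matches || ep.name == requested)
      return i;
  }
  throw std::runtime_error("entry point not found in linked program: " + requested);
}
```

The returned index would then be used consistently for WGSL extraction, cache-key hashing, and reflection, so all three refer to the same entry point.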

This solution doesn't seem to work in all cases. For example, it doesn't
guarantee successfully creating the Vulkan-backed WebGPU instance in our
CI tests. For now, we remove this and we'll investigate a more appropriate
solution later.
@daljit46

I've now addressed all feedback in this PR. I've also removed all TODOs in the code by fixing the underlying issues, except for obtaining a Vulkan-backed WebGPU instance in MSYS2 environments (where I've simply deleted the logic that manually copied the Vulkan DLL). As mentioned earlier, this is not blocking, and we can investigate later what can be done about it.
All tests are passing, so this is ready for merging.

@daljit46

In today's meeting we discussed gating this PR (and future PRs related to GPU work) behind a new CMake configuration option, MRTRIX_ENABLE_GPU. This flag would enable/disable the fetching of the WebGPU and Slang dependencies (in addition to all the cpp files added here). The motivation is that we want to release a version 3.1.0 of MRtrix3 that will (may?) not include this work, and instead create a separate branch specifically targeting that release.
It's fairly straightforward to add this flag, but @jdtournier mentioned that it should be OFF by default. One downside is that this would disable CI tests for the GPU work (including future related PRs). So my proposal is instead to enable the flag by default and explicitly disable it in the new release branch. Alternatively, we could explicitly enable the flag in the CI workflows, but I think that's a bit unnecessary and more invasive.

@daljit46

daljit46 commented Feb 4, 2026

@MRtrix3/mrtrix3-devs Unless there are any further comments regarding this PR, I will merge this tomorrow.

@daljit46 daljit46 dismissed Lestropie’s stale review February 5, 2026 07:29

All feedback has been addressed

@daljit46 daljit46 merged commit 3035c90 into dev Feb 5, 2026
5 of 6 checks passed
@daljit46 daljit46 deleted the webgpu branch February 5, 2026 07:30