
The Intel Triton runtime enumerates the SYCL devices list, which may not be consistent with the Torch runtime #1916

Open
chengjunlu opened this issue Aug 19, 2024 · 2 comments

@chengjunlu
Contributor

In Driver.c, we enumerate the SYCL device list directly from the SYCL context and save it in an internal vector.

There may be an issue where IPEX uses a different indexing scheme than Triton to refer to a given SYCL device.
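
For illustration, a minimal sketch of that pattern, with hypothetical names rather than the actual Driver.c code:

```cpp
#include <sycl/sycl.hpp>
#include <cstddef>
#include <vector>

// Hypothetical sketch: devices are enumerated once from the SYCL context
// and cached in an internal vector, so every later lookup depends on the
// enumeration order.
static std::vector<sycl::device> g_devices;

void cacheDevices(const sycl::context &ctx) {
  g_devices = ctx.get_devices(); // the index order is frozen here
}

sycl::device deviceForIndex(std::size_t idx) {
  // If PyTorch/IPEX numbered its devices in a different order, idx can
  // refer to a different physical GPU than the framework intends.
  return g_devices.at(idx);
}
```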

@vlad-penkin added the "bug" (Something isn't working) and "research" labels on Aug 19, 2024
@chengjunlu
Contributor Author

chengjunlu commented Aug 23, 2024

Confirmed with the PyTorch team.

Upstream PyTorch uses the index of each device, in the order enumerated from the SYCL API, as the torch device identity for referring to the underlying SYCL device. Specifically:

  • No extra sorting of the enumeration results.
  • No extra tiling or sub-partitioning of the SYCL devices.
  • No extra filtering or reordering of iGPUs and dGPUs.

To JIT Triton kernels correctly with the PyTorch framework, Triton should enumerate the SYCL devices from the SYCL runtime following the same practice, so that each torch device identity maps to the same underlying SYCL device.
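
A minimal sketch of that practice, assuming plain SYCL API order is exactly what PyTorch uses (the function name is illustrative, not Triton or PyTorch code):

```cpp
#include <sycl/sycl.hpp>
#include <vector>

// Enumerate GPU devices in plain SYCL runtime order, with no sorting,
// tiling/sub-partitioning, or iGPU/dGPU filtering, so that index i here
// matches torch device id i under the behavior confirmed above.
std::vector<sycl::device> enumerateDevicesTorchOrder() {
  return sycl::device::get_devices(sycl::info::device_type::gpu);
}
```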

In the long term, we want to decouple this logic between PyTorch and Triton.
We propose that PyTorch supply a method that returns the SYCL device directly, without any assumption about how the SYCL devices are mapped.

@alexbaden
Contributor

In the NVIDIA backend, the active device is loaded directly from PyTorch: https://github.com/triton-lang/triton/blob/main/python/triton/backends/driver.py#L29
There is also a method for getting the current stream, akin to the sycl::queue.
If we can retrieve both of those objects from PyTorch at the appropriate time, it is not clear that we need the internal state we are storing in driver.c.
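
For the Intel backend, that direction might look roughly like the following sketch. Neither getter below exists in PyTorch today; the names and stub bodies are assumptions used only to show the intended shape:

```cpp
#include <sycl/sycl.hpp>

// Hypothetical getters standing in for the proposed PyTorch-side API.
// The stub bodies just pick the default device/queue so the sketch
// compiles; a real implementation would return the device and queue
// backing the current torch device.
sycl::device get_current_torch_sycl_device() {
  return sycl::device(sycl::default_selector_v); // assumed API, stubbed
}

sycl::queue get_current_torch_sycl_queue() {
  return sycl::queue(get_current_torch_sycl_device()); // assumed API, stubbed
}

void launch_triton_kernel(/* compiled kernel + args */) {
  // Ask the framework for the active queue at launch time, mirroring the
  // NVIDIA backend; no device vector needs to be cached in driver.c.
  sycl::queue q = get_current_torch_sycl_queue();
  // ... submit the compiled Triton kernel on q ...
  (void)q; // placeholder for the actual submission
}
```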
