Open
Labels
bug (Something isn't working), help wanted (We don't have ability to look into this at the moment, but contributions are welcome)
Description
Describe the bug
Running SYCL E2E tests that compile OpenCL kernels at runtime fails when the system has two distinct Level Zero (L0) capable GPUs (e.g. a Battlemage GPU and an iGPU).
This happens because SYCL invokes ocloc with different flags in this scenario. If a single device is used, SYCL just passes the -device flag to ocloc. However, when there are distinct devices, SYCL passes a list of extensions instead, which triggers the bug. This logic can be found in kernel_compiler_opencl.cpp#L257
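The branching described above can be sketched roughly as follows. This is only an illustration of the decision, not the code in kernel_compiler_opencl.cpp: the function name `selectOclocTargetArgs`, the device and extension strings, and the `kExtensionListOption` placeholder are all hypothetical, and the real option used to hand the extension list to ocloc is not shown in this report. Treating identical device identifiers as a single device is also an assumption.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical placeholder: the actual option SYCL uses to pass the
// extension list to ocloc is not stated in this issue.
const std::string kExtensionListOption = "<extension-list-option>";

// Returns the target-related arguments for an ocloc invocation.
// deviceIds: one identifier per device in the context (illustrative strings).
// extensions: extensions to advertise when no single device can be named.
std::vector<std::string> selectOclocTargetArgs(
    const std::vector<std::string>& deviceIds,
    const std::vector<std::string>& extensions) {
  assert(!deviceIds.empty() && "at least one target device is required");

  // Assumption: devices with the same identifier count as one target.
  const std::set<std::string> distinct(deviceIds.begin(), deviceIds.end());

  if (distinct.size() == 1) {
    // Single-device path: name the device directly.
    return {"-device", *distinct.begin()};
  }

  // Multi-device path: pass a comma-separated extension list instead.
  // This is the path that triggers the failure reported here.
  std::string joined;
  for (const auto& ext : extensions) {
    if (!joined.empty()) joined += ',';
    joined += ext;
  }
  return {kExtensionListOption, joined};
}
```

On the system in this report, the B580 and the UHD Graphics 770 would yield two distinct identifiers, so a sketch like this would take the second branch.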
Compilation error log: ocloc_compilation_error.log
To reproduce
1. Use a server that contains two distinct Intel GPUs. For example:
fabio@ed-dlpc-2e11:~/projects/dpcpp/llvm/cmake-build-l0-release-slurm-bmg/bin$ ./sycl-ls
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B580 Graphics 20.1.0 [1.6.32536]
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.6.32536]
2. Compile DPC++ with L0 support.
3. Run the E2E tests that compile OpenCL kernels using kernel bundles:
cd <build-dir>/tools/sycl/test-e2e
../../../bin/llvm-lit -sva RawKernelArg
Environment
- OS: Linux
- Target device and vendor: System with both an Intel iGPU and a Battlemage GPU.
- DPC++ version: 194ec74
- Dependencies version:
fabio@ed-dlpc-2e11:~/projects/dpcpp/llvm/cmake-build-l0-release-slurm-bmg/bin$ ./sycl-ls --verbose
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B580 Graphics 20.1.0 [1.6.32536]
[level_zero:gpu][level_zero:1] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics 770 12.2.0 [1.6.32536]
Platforms: 1
Platform [#1]:
Version : 1.6
Name : Intel(R) oneAPI Unified Runtime over Level-Zero
Vendor : Intel(R) Corporation
Devices : 2
Device [#0]:
Type : gpu
Version : 20.1.0
Name : Intel(R) Arc(TM) B580 Graphics
Vendor : Intel(R) Corporation
Driver : 1.6.32536
UUID : 13412811226000030000000
DeviceID : 57867
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_oneapi_cuda_async_barrier ext_intel_free_memory ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_oneapi_bindless_images ext_oneapi_bindless_images_1d_usm ext_oneapi_bindless_images_2d_usm ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_intel_matrix ext_oneapi_graph ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem ext_oneapi_virtual_functions
info::device::sub_group_sizes: 16 32
Architecture: intel_gpu_bmg_g21
Device [#1]:
Type : gpu
Version : 12.2.0
Name : Intel(R) UHD Graphics 770
Vendor : Intel(R) Corporation
Driver : 1.6.32536
UUID : 134128128167400002000000
DeviceID : 42880
Num SubDevices : 0
Num SubSubDevices : 0
Aspects : gpu fp16 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_pci_address ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_oneapi_cuda_async_barrier ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width ext_intel_legacy_image ext_intel_esimd ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca ext_oneapi_queue_profiling_tag ext_oneapi_virtual_mem ext_oneapi_virtual_functions
info::device::sub_group_sizes: 8 16 32
Architecture: intel_gpu_adl_s
default_selector() : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B580 Graphics 20.1.0 [1.6.32536]
accelerator_selector() : No device of requested type available.
cpu_selector() : No device of requested type available.
gpu_selector() : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B580 Graphics 20.1.0 [1.6.32536]
custom_selector(gpu) : gpu, Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Arc(TM) B580 Graphics 20.1.0 [1.6.32536]
custom_selector(cpu) : No device of requested type available.
custom_selector(acc) : No device of requested type available.
Additional context
No response