-
Notifications
You must be signed in to change notification settings - Fork 798
[SYCL] Support sycl_ext_oneapi_platform_device_index #20758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
|
@vinser52 @intel/llvm-reviewers-runtime could you take a look please. |
| size_t device::ext_oneapi_index_within_platform() const { | ||
| auto devices = get_platform().get_devices(); | ||
| auto it = std::find(devices.begin(), devices.end(), *this); | ||
| if (it == devices.end()) | ||
| throw sycl::exception(sycl::make_error_code(errc::invalid), | ||
| "this device is not a root device"); | ||
|
|
||
| size_t index = std::distance(devices.begin(), it); | ||
| return index; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of the extension is to get the device index in an efficient way. Do we really need to scan the vector returned by platform::get_devices() everytime? Can we cache the result of the scan on the first call to the ext_oneapi_index_within_platform and return the cached value on all subsequent calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looks like I've implemented what we wanted to avoid.
Anyways, with cache the first run is still expensive. I'd like to hear from @gmlueck - could you please tell, what was the idea regarding improving the efficiency? To use L0 and OCL calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyways, with cache the first run is still expensive.
Devices are only created once, together with the platform I believe. Maybe we could store the index inside the device_impl when we create the platforms and their devices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is what I had in mind. Store each device's index when the device_impl is first created. Then, we can retrieve the index in O(1) time.
For the reverse operation (index to device), I assume we already store a vector of devices for each platform? If so, converting from index to device should be a simple index operation into this vector.
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_platform_device_index.asciidoc