Conversation

@KornevNikita
Contributor

@KornevNikita KornevNikita requested a review from a team as a code owner November 25, 2025 17:11
@KornevNikita
Contributor Author

@vinser52 @intel/llvm-reviewers-runtime could you take a look please.

Comment on lines +346 to +355
size_t device::ext_oneapi_index_within_platform() const {
  auto devices = get_platform().get_devices();
  auto it = std::find(devices.begin(), devices.end(), *this);
  if (it == devices.end())
    throw sycl::exception(sycl::make_error_code(errc::invalid),
                          "this device is not a root device");

  size_t index = std::distance(devices.begin(), it);
  return index;
}
Contributor

The goal of the extension is to get the device index efficiently. Do we really need to scan the vector returned by platform::get_devices() every time? Can we cache the result of the scan on the first call to ext_oneapi_index_within_platform and return the cached value on all subsequent calls?
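
A minimal sketch of the caching idea, using hypothetical stand-in types rather than the real runtime classes: `LazyIndexDevice`, its integer ids, and `scan_count()` are all invented for illustration. The first call does the linear scan and stores the result; later calls return the stored value.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <iterator>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a device; NOT the real sycl::device or device_impl.
class LazyIndexDevice {
public:
  // platform_ids: ids of the platform's root devices, in platform order
  // (assumed to outlive this object).
  LazyIndexDevice(int id, const std::vector<int> *platform_ids)
      : id_(id), platform_ids_(platform_ids) {}

  std::size_t index_within_platform() const {
    // Fast path: the index was already computed by an earlier call.
    std::size_t index = cached_index_.load(std::memory_order_acquire);
    if (index != kUnset)
      return index;

    // Slow path: one linear scan, as in the code under review.
    ++scan_count_;
    auto it = std::find(platform_ids_->begin(), platform_ids_->end(), id_);
    if (it == platform_ids_->end())
      throw std::invalid_argument("this device is not a root device");

    index = static_cast<std::size_t>(std::distance(platform_ids_->begin(), it));
    cached_index_.store(index, std::memory_order_release);
    return index;
  }

  // Number of linear scans performed so far (illustration only).
  int scan_count() const { return scan_count_.load(); }

private:
  static constexpr std::size_t kUnset =
      std::numeric_limits<std::size_t>::max();

  int id_;
  const std::vector<int> *platform_ids_;
  mutable std::atomic<std::size_t> cached_index_{kUnset};
  mutable std::atomic<int> scan_count_{0};
};
```

If several threads make the first call concurrently, each may scan once, but they all store the same value, so the result stays consistent.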

Contributor Author

Yeah, looks like I've implemented what we wanted to avoid.

Anyways, with cache the first run is still expensive. I'd like to hear from @gmlueck: could you please tell us what the idea was for improving the efficiency? To use L0 and OCL calls?

Contributor

> Anyways, with cache the first run is still expensive.

Devices are only created once, together with the platform, I believe. Maybe we could store the index inside the device_impl when we create the platforms and their devices?

Contributor

Yes, this is what I had in mind. Store each device's index when the device_impl is first created. Then, we can retrieve the index in O(1) time.

For the reverse operation (index to device), I assume we already store a vector of devices for each platform? If so, converting from index to device should be a simple index operation into this vector.
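
A rough sketch of this design, assuming the platform builds its device list exactly once. `PlatformSketch` and `DeviceSketch` are hypothetical stand-ins, not the actual device_impl or platform classes, and the integer handle is a placeholder for whatever the backend returns.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for device_impl: the index is fixed at creation.
class DeviceSketch {
public:
  DeviceSketch(int native_handle, std::size_t index)
      : native_handle_(native_handle), index_(index) {}

  // Device -> index: O(1), no scan of the platform's device list.
  std::size_t index_within_platform() const { return index_; }

  int native_handle() const { return native_handle_; }

private:
  int native_handle_; // placeholder for a backend handle (assumption)
  std::size_t index_; // position in the owning platform's device vector
};

// Hypothetical stand-in for the platform implementation.
class PlatformSketch {
public:
  // Creates all root devices once and records each one's position.
  explicit PlatformSketch(const std::vector<int> &native_handles) {
    devices_.reserve(native_handles.size());
    for (std::size_t i = 0; i < native_handles.size(); ++i)
      devices_.push_back(std::make_shared<DeviceSketch>(native_handles[i], i));
  }

  // Index -> device: a bounds-checked lookup into the stored vector.
  const std::shared_ptr<DeviceSketch> &device_at(std::size_t index) const {
    if (index >= devices_.size())
      throw std::out_of_range("device index out of range");
    return devices_[index];
  }

  std::size_t device_count() const { return devices_.size(); }

private:
  std::vector<std::shared_ptr<DeviceSketch>> devices_;
};
```

Both directions are constant time here, and the two lookups round-trip: `platform.device_at(i)->index_within_platform()` returns `i` for every valid `i`.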
