[SYCL][DOC] Add spec and design for "if_device" #8917
Open: gmlueck wants to merge 6 commits into intel:sycl from gmlueck:gmlueck/if-device
Changes from 3 commits
Commits (6):

- 0f0c8ec [SYCL][DOC] Add spec and design for "if_device" (gmlueck)
- 9597c92 Allow function pointer to "if_host" (gmlueck)
- 169c6c8 Merge branch 'sycl' into gmlueck/if-device (gmlueck)
- 9cbc078 Use perfect forwarding for callable (gmlueck)
- f3e15a7 Use std::forward (gmlueck)
- f636482 Update overview and add example (gmlueck)
@@ -0,0 +1,179 @@
# Implementation design for "if_device"

This document describes the design for the DPC++ implementation of the
[sycl_ext_oneapi_if_device][1] extension.

[1]: <../extensions/proposed/sycl_ext_oneapi_if_device.asciidoc>


## Phased implementation

Although the main motivation for the "if_device" extension is to enable a
1-pass compiler, it can still be implemented in our existing multi-pass
compiler. This is useful because it allows us to gain experience using this
extension even before we implement the 1-pass compiler.

This document, therefore, describes two implementations. The first is a
trivial implementation that works in the current multi-pass compiler. The
other is the design that we will ultimately use in the 1-pass compiler.


## Multi-pass compiler implementation

This implementation requires changes only to the device headers. The
implementation is trivial: it relies on the existing `__SYCL_DEVICE_ONLY__`
macro, which is defined in the device compiler passes but not in the host
compiler pass.

```
namespace sycl::ext::oneapi::experimental {
namespace detail {

// Helper object used to implement "otherwise". The "MakeCall" template
// parameter tells whether the previous call to "if_device" or "if_host" called
// its "fn". When "MakeCall" is true, the previous call to "fn" did not
// happen, so the "otherwise" should call "fn".
template<bool MakeCall>
class if_device_or_host_helper {
 public:
  template<typename T>
  void otherwise(T fn) {
    if constexpr (MakeCall) {
      fn();
    }
  }
};

} // namespace detail

template<typename T>
static auto if_device(T fn) {
#ifdef __SYCL_DEVICE_ONLY__
  fn();
  return detail::if_device_or_host_helper<false>{};
#else
  return detail::if_device_or_host_helper<true>{};
#endif
}

template<typename T>
static auto if_host(T fn) {
#ifdef __SYCL_DEVICE_ONLY__
  return detail::if_device_or_host_helper<true>{};
#else
  fn();
  return detail::if_device_or_host_helper<false>{};
#endif
}

} // namespace sycl::ext::oneapi::experimental
```

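To illustrate how the `if_device(...).otherwise(...)` chain resolves in the multi-pass scheme, here is a minimal sketch. It copies the helper and `if_device` from the listing above into a hypothetical `sketch` namespace so that it builds with an ordinary host compiler and no SYCL headers; `which_side` is likewise an illustrative name, not part of the extension.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in namespace; mirrors the multi-pass listing so the
// sketch compiles without SYCL headers.
namespace sketch {
namespace detail {

template <bool MakeCall>
class if_device_or_host_helper {
 public:
  template <typename T>
  void otherwise(T fn) {
    if constexpr (MakeCall) {
      fn();
    }
  }
};

} // namespace detail

template <typename T>
auto if_device(T fn) {
#ifdef __SYCL_DEVICE_ONLY__
  fn();
  return detail::if_device_or_host_helper<false>{};
#else
  (void)fn; // Not called in the host pass.
  return detail::if_device_or_host_helper<true>{};
#endif
}

} // namespace sketch

// Illustrative helper: reports which branch of the chain actually ran.
inline std::string which_side() {
  std::string side;
  sketch::if_device([&] { side = "device"; })
      .otherwise([&] { side = "host"; });
  return side;
}
```

When compiled without `__SYCL_DEVICE_ONLY__` (that is, as a host pass), only the `otherwise` callable runs, so `which_side()` yields `"host"`.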

## Single-pass compiler implementation

This implementation requires changes to the device headers, some changes to
the error diagnostics in the front-end (CFE), and a new IR pass.

### Device headers

The device headers translate the API into calls to two functions that are
decorated with attributes named "sycl-call-if-on-device" and
"sycl-call-if-on-host".

```
namespace sycl::ext::oneapi::experimental {
namespace detail {

// Call the callable object "fn" only when this code runs on a device.
//
// IR passes recognize this function from the "sycl-call-if-on-device"
// attribute.
template<typename T>
[[clang::noinline]]
[[__sycl_detail__::add_ir_attributes_function("sycl-call-if-on-device", true)]]
void call_if_on_device(T fn) {
  fn();
}

// Call the callable object "fn" only when this code runs on the host.
//
// IR passes recognize this function from the "sycl-call-if-on-host" attribute.
template<typename T>
[[clang::noinline]]
[[__sycl_detail__::add_ir_attributes_function("sycl-call-if-on-host", true)]]
void call_if_on_host(T fn) {
  fn();
}

class call_if_on_device_helper {
 public:
  template<typename T>
  void otherwise(T fn) {
    call_if_on_device(fn);
  }
};

class call_if_on_host_helper {
 public:
  template<typename T>
  void otherwise(T fn) {
    call_if_on_host(fn);
  }
};

} // namespace detail

template<typename T>
static auto if_device(T fn) {
  detail::call_if_on_device(fn);
  return detail::call_if_on_host_helper{};
}

template<typename T>
static auto if_host(T fn) {
  detail::call_if_on_host(fn);
  return detail::call_if_on_device_helper{};
}

} // namespace sycl::ext::oneapi::experimental
```

Note the use of `[[clang::noinline]]`. It is important that the bodies of
these functions are not inlined until after the IR pass described below.

### Changes to the front-end (CFE)

The CFE currently diagnoses some errors that are specific to device code. To do
this, the CFE must first traverse the static call tree to determine which
functions are called from kernels. This pass of the CFE must recognize the
functions marked with the attribute "sycl-call-if-on-host" and skip the bodies
of these functions when building the static call tree of the kernels. As a
result, the CFE will not emit any diagnostics that are specific to device code
for the callable object that is passed to these functions.

In a 1-pass compiler, we expect that the CFE will emit a single stream of
LLVM IR for both host and device. This IR retains any calls to the functions
marked with "sycl-call-if-on-host" or "sycl-call-if-on-device" and retains the
full bodies of those functions. The filtering described above is used only to
determine the functions that are checked for device-specific errors.

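The call-tree filtering described in this section can be pictured with a toy model. The sketch below is not CFE code: `ToyFunction`, `ToyCallGraph`, `device_checked_functions`, and `example_graph` are hypothetical names, and the graph is a plain map of function names to callees. It only shows the idea that a traversal starting at a kernel does not descend into a function marked "sycl-call-if-on-host".

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy node in a static call graph.
struct ToyFunction {
  std::vector<std::string> callees;
  bool call_if_on_host = false; // Models the "sycl-call-if-on-host" attribute.
};

using ToyCallGraph = std::map<std::string, ToyFunction>;

// Collect the functions whose bodies would receive device-specific checks.
// Bodies of functions marked "sycl-call-if-on-host" are skipped, so nothing
// reachable only through them is collected.
inline std::set<std::string> device_checked_functions(
    const ToyCallGraph& graph, const std::string& kernel) {
  std::set<std::string> checked;
  std::vector<std::string> worklist{kernel};
  while (!worklist.empty()) {
    const std::string name = worklist.back();
    worklist.pop_back();
    const auto it = graph.find(name);
    if (it == graph.end()) continue;            // Unknown function: ignore.
    if (it->second.call_if_on_host) continue;   // Skip the marked body.
    if (!checked.insert(name).second) continue; // Already visited.
    for (const std::string& callee : it->second.callees) {
      worklist.push_back(callee);
    }
  }
  return checked;
}

// Illustrative graph: a kernel that uses both marked functions.
inline ToyCallGraph example_graph() {
  ToyCallGraph g;
  g["kernel"].callees = {"call_if_on_device", "call_if_on_host"};
  g["call_if_on_device"].callees = {"device_lambda"};
  g["call_if_on_host"].callees = {"host_lambda"};
  g["call_if_on_host"].call_if_on_host = true;
  g["device_lambda"];
  g["host_lambda"];
  return g;
}
```

With `example_graph()`, the traversal from `"kernel"` collects `device_lambda` but never reaches `host_lambda`, which matches the statement that the callable passed to the host-only function gets no device-specific diagnostics.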
### New IR pass

The 1-pass compiler will eventually split the LLVM IR into two parts: one that
contains the device code and one that contains the host code. We expect that
this pass will traverse the static call tree of the kernels to identify device
code. This pass also recognizes the functions marked with
"sycl-call-if-on-host" and "sycl-call-if-on-device". When generating the IR
for the device code, the bodies of functions marked "sycl-call-if-on-host" are
deleted, leaving empty functions. When generating the IR for the host code,
the bodies of functions marked "sycl-call-if-on-device" are deleted.

Alternatively, the IR pass could use metadata from the CFE to identify host vs.
device code, rather than repeating the static call tree traversal here. These
details will be resolved later as part of the 1-pass compiler design.

Up until this point, it was important to prevent inlining of the functions
marked "sycl-call-if-on-host" and "sycl-call-if-on-device". Once the IR is
split, inlining is permitted, so this IR pass also removes the LLVM IR
`noinline` attributes from these functions.

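As a rough illustration of the body deletion and `noinline` removal described above, the following sketch uses a toy representation rather than real LLVM IR. `ToyIRFunction`, `ToyTarget`, `specialize_for`, and `example_module` are hypothetical names, and the call-tree-based selection of device code is deliberately left out.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class ToyTarget { Device, Host };

// Hypothetical toy function: a name, string attributes, and a body modeled
// as a list of instruction strings.
struct ToyIRFunction {
  std::string name;
  std::set<std::string> attrs;
  std::vector<std::string> body;
};

// Produce the per-target copy of the module. For the marked functions, the
// body that must not run on this target is emptied, and "noinline" is
// dropped because inlining is allowed after the split.
inline std::vector<ToyIRFunction> specialize_for(
    std::vector<ToyIRFunction> fns, ToyTarget target) {
  const std::string drop = (target == ToyTarget::Device)
                               ? "sycl-call-if-on-host"
                               : "sycl-call-if-on-device";
  for (ToyIRFunction& f : fns) {
    const bool marked = f.attrs.count("sycl-call-if-on-host") != 0 ||
                        f.attrs.count("sycl-call-if-on-device") != 0;
    if (!marked) continue;
    if (f.attrs.count(drop) != 0) {
      f.body.clear(); // Leaves an empty function.
    }
    f.attrs.erase("noinline");
  }
  return fns;
}

// Illustrative module: index 0 is device-only, index 1 is host-only.
inline std::vector<ToyIRFunction> example_module() {
  ToyIRFunction on_device;
  on_device.name = "call_if_on_device";
  on_device.attrs = {"sycl-call-if-on-device", "noinline"};
  on_device.body = {"call fn"};

  ToyIRFunction on_host;
  on_host.name = "call_if_on_host";
  on_host.attrs = {"sycl-call-if-on-host", "noinline"};
  on_host.body = {"call fn"};

  return {on_device, on_host};
}
```

Calling `specialize_for(example_module(), ToyTarget::Device)` keeps the body of `call_if_on_device`, empties `call_if_on_host`, and strips `noinline` from both; the host variant does the reverse.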
@rolandschulz I think I do not need to use `std::forward` here, correct?

Without the `std::forward`, if an r-value-ref gets passed to `if_device`, then `call_if_on_device` gets passed an l-value-ref. If the callable only works with an r-value-ref, this breaks. Always use `forward` for any universal reference (if you don't use it anymore afterwards).

Thanks. Added in f3e15a7.

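The point made in this thread can be reproduced with a small sketch. The signatures below are assumptions chosen for illustration (forwarding references in a hypothetical `sketch` namespace), not the code from commit f3e15a7, and `RvalueOnly` and `run_rvalue_only` are made-up names. The callable has an rvalue-qualified call operator, so it can be invoked only if its value category survives both layers.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for the two layers discussed in the thread.
namespace sketch {

template <typename T>
void call_if_on_device(T&& fn) {
  // Invoke with the original value category.
  std::forward<T>(fn)();
}

template <typename T>
void if_device(T&& fn) {
  // Without std::forward here, "fn" would be passed on as an lvalue.
  call_if_on_device(std::forward<T>(fn));
}

} // namespace sketch

// A callable that can be invoked only as an rvalue.
struct RvalueOnly {
  int* counter;
  void operator()() && { ++*counter; }
};

// Illustrative driver: returns how many times the callable ran.
inline int run_rvalue_only() {
  int calls = 0;
  sketch::if_device(RvalueOnly{&calls});
  return calls;
}
```

If the `std::forward` in `sketch::if_device` were dropped, `call_if_on_device` would deduce an lvalue reference and the `&&`-qualified call operator would no longer be viable, so this example would fail to compile.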