-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][DOC] Update dot_acc extension spec #10113
base: sycl
Are you sure you want to change the base?
Conversation
Update the sycl_ext_oneapi_dot_accumulate extension spec to: * Use the latest specification template. * Document the "packed" APIs. These were previously shown in the "sample header" section, but there was no description. * Deprecate the old "packed" overloads and introduce new overloads with the more descriptive name `doc_acc_4x8packed`. This new name is consistent with the OpenCL C naming style for similar functions.
sycl/doc/extensions/supported/sycl_ext_oneapi_dot_accumulate.asciidoc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, this is a nice improvement.
No blockers but here are a few comments to consider.
int32_t dot_acc(vec<int8_t,4> a, vec<int8_t,4> b, int32_t c); | ||
int32_t dot_acc(vec<int8_t,4> a, vec<uint8_t,4> b, int32_t c); | ||
int32_t dot_acc(vec<uint8_t,4> a, vec<int8_t,4> b, int32_t c); | ||
int32_t dot_acc(vec<uint8_t,4> a, vec<uint8_t,4> b, int32_t c); | ||
|
||
=== Add to Section 4.13.6 - Geometric Functions | ||
int32_t doc_acc_4x8packed(int32_t a, int32_t b, int32_t c); | ||
int32_t doc_acc_4x8packed(int32_t a, uint32_t b, int32_t c); | ||
int32_t doc_acc_4x8packed(uint32_t a, int32_t b, int32_t c); | ||
int32_t doc_acc_4x8packed(uint32_t a, uint32_t b, int32_t c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a terribly strong opinion about either of these, but noting for completeness: If we want to align even closer to the related OpenCL extension we should consider the following changes:
- The OpenCL extension only has
dot_acc
functions that do the dot product and accumulate for the variants that saturate, otherwise the function is simplydot
, which does a dot product. - In addition to the return value difference, the type for the packed
a
andb
arguments is unconditionally an unsigned integer in the OpenCL extension. To disambiguate the different types the signedness of the argument is encoded in the function name (the return type is also), so for example if the packeda
is signed andb
is unsigned the full function isint dot_4x8packed_su_int(uint a, uint b)
.
Links for reference:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the suggestion to use uint32_t
always for the "packed" parameters, so I also adopted the "_ss" (etc.) convention. Done in 84f1ff8.
It didn't make sense to me to adopt the final "_int" / "_uint" signifying the return type for the packed APIs unless we also adopt that convention for the unpacked APIs. I decided to leave it off for both.
I don't know why this extension added dot_acc
rather than just dot
. Dropping the "accumulate" operation seemed like a bigger change, so I decided not to do that here.
|1 | ||
|Initial version of this extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to call this version one and the initial version? IMHO this loses history we might want to preserve. If we do call this version one would it be better to add it as a new document and move the existing document to a "deprecated" or "removed" directory so we don't lose track of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an API version, not a document version. Normally, I would bump the API version since this PR adds new APIs. However, we never implemented any API versioning prior to this PR, and I can't go back in time and add it now. The best I can do is to start versioning at "1" now.
Regarding the loss of history ... this PR doesn't remove any of the old APIs. They are still retained as "deprecated", and this extension document still describes them. (In fact, they are described better now than in the previous document.)
* Deprecate the overload taking two unsigned vectors and returning a signed result and replace it with a version returning an unsigned result. Addresses the open issue. * Change the names of the new "packed" APIs to include a suffix which tells how the `a` and `b` vectors are interpreted (signed vs. unsigned). * Change the new "packed" APIs so that `a` and `b` are always unsigned integers. The name of the function now tells how to interpret them.
@intel/dpcpp-specification-reviewers this is ready for re-review now. |
This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days. |
This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spotted a typo, otherwise LGTM.
int32_t doc_acc_4x8packed_ss(uint32_t a, uint32_t b, int32_t c); | ||
int32_t doc_acc_4x8packed_su(uint32_t a, uint32_t b, int32_t c); | ||
int32_t doc_acc_4x8packed_us(uint32_t a, uint32_t b, int32_t c); | ||
uint32_t doc_acc_4x8packed_uu(uint32_t a, uint32_t b, uint32_t c); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be:
int32_t doc_acc_4x8packed_ss(uint32_t a, uint32_t b, int32_t c); | |
int32_t doc_acc_4x8packed_su(uint32_t a, uint32_t b, int32_t c); | |
int32_t doc_acc_4x8packed_us(uint32_t a, uint32_t b, int32_t c); | |
uint32_t doc_acc_4x8packed_uu(uint32_t a, uint32_t b, uint32_t c); | |
int32_t dot_acc_4x8packed_ss(uint32_t a, uint32_t b, int32_t c); | |
int32_t dot_acc_4x8packed_su(uint32_t a, uint32_t b, int32_t c); | |
int32_t dot_acc_4x8packed_us(uint32_t a, uint32_t b, int32_t c); | |
uint32_t dot_acc_4x8packed_uu(uint32_t a, uint32_t b, uint32_t c); |
The doc
typo appears in a few other places below, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed in 995e783
* Update to single-year copyright format * Update base SYCL 2020 revision to latest
Update the sycl_ext_oneapi_dot_accumulate extension spec to:
doc_acc_4x8packed_su
. This new names are similar to the OpenCL C naming style for the equivalent functions.