-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SPV_INTEL_float_controls2 extension preview #1611
base: sycl
Are you sure you want to change the base?
Conversation
@bashbaug @AlexeySotkin @kvladimi @aus-intel could you please take a look at this MR? |
Co-authored-by: Alexey Sachkov <[email protected]>
Co-authored-by: Alexey Sotkin <[email protected]>
Extension is published at intel/llvm#1611
SPV_INTEL_float_controls2 and SPV_KHR_float_controls extensions Extensions are published at intel/llvm#1612 intel/llvm#1611
Extension is published at intel#1611
…float_controls2 and SPV_KHR_float_controls extensions Extensions are published at intel#1612 intel#1611
Extension is published at intel/llvm#1611 Change-Id: I8e67d12a776e4843f724ed1111c29022af2fed6a
SPV_INTEL_float_controls2 and SPV_KHR_float_controls extensions Extensions are published at intel/llvm#1612 intel/llvm#1611
Extension is published at intel/llvm#1611
SPV_INTEL_float_controls2 and SPV_KHR_float_controls extensions Extensions are published at intel/llvm#1612 intel/llvm#1611
Extension is published at intel/llvm#1611
… SPV_INTEL_float_controls2 and SPV_KHR_float_controls extensions Extensions are published at intel/llvm#1612 intel/llvm#1611
Added following decorations for function VCFloatControl attributes: DecorateFunctionRoundingModeINTEL DecorateFunctionDenormModeINTEL DecorateFunctionFloatingPointModeINTEL Specification can be found in intel/llvm#1611
Added following decorations for function VCFloatControl attributes: DecorateFunctionRoundingModeINTEL DecorateFunctionDenormModeINTEL DecorateFunctionFloatingPointModeINTEL Specification can be found in intel/llvm#1611
…truction scope decorations. Added validation rules.
@AlexeySachkov, @bader - when we can expect this extension to be reviewed/merged? Some of ESIMD features depend on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you given any thought how similar float controls could apply to bfloat16
or tf32
types, which would have the same "target width" as float
and half
types respectively?
| 0 | *Preserve* + | ||
Denormalized values must be preserved. | | ||
| 1 | *FlushToZero* + | ||
Denormalized values must be flushed to zero. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming: For the FlushToZero denorm mode denormalized values must be flushed to zero? Or is it possible that some implementations may still preserve denorms? I'm reminded of this text for the -cl-denorms-are-zero
build option (link, emphasis mine):
This option controls how single precision and double precision denormalized numbers are handled. If specified as a build option, the single precision denormalized numbers may be flushed to zero; double precision denormalized numbers may also be flushed to zero if the optional extension for double precision is supported. This is intended to be a performance hint and the OpenCL compiler can choose not to flush denorms to zero if the device supports single precision (or double precision) denormalized numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have corresponding execution modes whose description uses "is flushed to zero" or "is preserved". We could do the same here (instead of must ) and defer to Client API spec.
Note: Vulkan spec defines a list of instructions that must flush to zero or preserve. Other instructions may flush to zero or preserve.
|
||
Add the following rule: | ||
|
||
- Each OpEntryPoint must not set more than one of the *FloatingPointModeALTINTEL* or *FloatingPointModeALTINTEL* execution modes for any given _Target Width_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a typo here? FloatingPointModeALTINTEL is listed twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, thank you
Add the following rule to the description of *DenormPreserve* execution mode: | ||
|
||
If an instruction is inside of a function decorated with *FunctionDenormModeINTEL* with matching _Target Width_ operand, the denorm mode specified by *FunctionDenormModeINTEL* is applied and *DenormPreserve* is ignored. | ||
|
||
Add the following rule to the description of *DenormFlushToZero* execution mode: | ||
|
||
If an instruction is inside of a function decorated with *FunctionDenormModeINTEL* with matching _Target Width_ operand, the denorm mode specified by *FunctionDenormModeINTEL* is applied and *DenormFlushToZero* is ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine if we defer this to a client API environment spec, but so we're thinking about it: How should these execution modes and decorations interact with program build options like -cl-denorms-are-zero
? Does the decoration or the build option "win"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does -cl-denorms-are-zero
build option cause that the corresponding DenormPreserve Execution Mode is present in the shader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -cl-denorms-are-zero
build option is be specified when compiling SPIR-V, not when generating SPIR-V, so I think we'd need to work out the interactions between the build option and the execution mode. I don't think this has been an issue before now because most of the compute consumers haven't moved to SPIR-V 1.4 (and hence don't need to worry about DenormPreserve), but it's something we'll need to figure out eventually.
Do we have a recommendation how this should work?
The default rounding mode for floating-point arithmetic and conversions instructions must be round to positive infinity. | ||
If an instruction is decorated with *FPRoundingMode* or defines a rounding mode in its description, that rounding mode is applied and *RoundingModeRTPINTEL* is ignored. | ||
If an instruction is not decorated with *FPRoundingMode* and does not define a rounding mode in its description and is inside of a function decorated | ||
with *FunctionRoundingModeINTEL* with matching _Target Width_ operand, the rounding mode specified by *FunctionRoundingModeINTEL* is applied and *RoundingModeRTPINTEL* is ignored. + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to define "matching Target Width" a little more precisely, especially for operations like OpFConvert where there is a source width and a destination width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this sentence, "matching" was supposed to mean the same Target Width operands in the RoundingModeRTPINTEL execution mode and FunctionRoundingModeINTEL decoration.
Would the following text be better:
If an instruction is not decorated with FPRoundingMode and does not define a rounding mode in its description and is inside of a function decorated with FunctionRoundingModeINTEL whose Target Width operand is the same as RoundingModeRTPINTEL Target Width operand, the rounding mode specified by FunctionRoundingModeINTEL is applied and RoundingModeRTPINTEL is ignored.
Only affects arithmetic instructions operating on a floating-point type whose component width is Target Width and conversion instructions whose result type component width is Target Width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I think my previous comment was unclear.
For OpFConvert there is the width of the source operand and the width of the result. Which one of these is checked to determine if the Target Width is matched?
This text kind of makes it seem like it is the source operand that gets checked (due to "operating on"):
Only affects instructions operating on a floating-point type whose component width is Target Width.
That might be a little surprising though, so I'd recommend being a little more precise and explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This text kind of makes it seem like it is the source operand that gets checked (due to "operating on"):
Only affects instructions operating on a floating-point type whose component width is Target Width.
That might be a little surprising though, so I'd recommend being a little more precise and explicit.
In my previous comment I proposed an updated text:
Only affects arithmetic instructions operating on a floating-point type whose component width is Target Width and conversion instructions whose result type component width is Target Width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I missed this trying to parse the sentence about the rounding modes.
Do we need to differentiate between arithmetic and conversion instructions, or can we always look at the result type component width to determine if it matches the target width?
|
||
Modify the Entry Point rules: | ||
|
||
Add *RoundingModeRTPINTEL* and *RoundingModeRTNINTEL* to the list of execution modes in the third rule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I'd consider describing what "the third rule" is here, since the universal validation rules appear to have been changed in the latest SPIR-V 1.6 spec and I'm having a hard time finding the older SPIR-V 1.5 revision 3 spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I will quote the rule:
Add RoundingModeRTPINTEL and RoundingModeRTNINTEL to the list of execution modes in the following validation rule:
- Each OpEntryPoint must not set more than one of the RoundingModeRTE or RoundingModeRTZ execution modes for any given Target Width
yes, I thought we could replace "target width" with an enum whose values would be float = 32, half = 16, double = 64 and something new for |
Nice, this seems reasonable. Since the "target width" is an unsigned 32-bit integer maybe it makes sense to choose absurdly large values (e.g. with the high bit set) to represent alternate floating-point encodings like bfloat16 or tf32? |
Opaque pointers requires us to keep more careful track of the pointee types of the builtin methods. While these are discoverable via demangling, the existing interface of mutateCallInst* does not provide easy ways to maintain these types during remapping of OpenCL builtins to SPIR-V builtins. This patch adds a BuiltinCallMutator that keeps track not only of the pointer element types discovered by demangling, but also of the LLVM attributes the arguments originally had, even as arguments are inserted or deleted for OpenCL<->SPIR-V builtin conversions. Additionally, the interface is intended to be more forgiving of the varieties of lambdas needed to do the conversions: for example, there is no need for different entry points to indicate a type-aware mapping, or even to indicate a function whose return type changes. It is intended that all of the existing calls to mutateCallInst be replaced with this new interface, so as to be able to eliminate the getPointerElementType call in the existing implementation of that function. However, the actual migration of the calls will be done in future patches to keep the maximum size of patches relatively small.
Opaque pointers requires us to keep more careful track of the pointee types of the builtin methods. While these are discoverable via demangling, the existing interface of mutateCallInst* does not provide easy ways to maintain these types during remapping of OpenCL builtins to SPIR-V builtins. This patch adds a BuiltinCallMutator that keeps track not only of the pointer element types discovered by demangling, but also of the LLVM attributes the arguments originally had, even as arguments are inserted or deleted for OpenCL<->SPIR-V builtin conversions. Additionally, the interface is intended to be more forgiving of the varieties of lambdas needed to do the conversions: for example, there is no need for different entry points to indicate a type-aware mapping, or even to indicate a function whose return type changes. It is intended that all of the existing calls to mutateCallInst be replaced with this new interface, so as to be able to eliminate the getPointerElementType call in the existing implementation of that function. However, the actual migration of the calls will be done in future patches to keep the maximum size of patches relatively small. Original commit: KhronosGroup/SPIRV-LLVM-Translator@139b08b
Opaque pointers requires us to keep more careful track of the pointee types of the builtin methods. While these are discoverable via demangling, the existing interface of mutateCallInst* does not provide easy ways to maintain these types during remapping of OpenCL builtins to SPIR-V builtins. This patch adds a BuiltinCallMutator that keeps track not only of the pointer element types discovered by demangling, but also of the LLVM attributes the arguments originally had, even as arguments are inserted or deleted for OpenCL<->SPIR-V builtin conversions. Additionally, the interface is intended to be more forgiving of the varieties of lambdas needed to do the conversions: for example, there is no need for different entry points to indicate a type-aware mapping, or even to indicate a function whose return type changes. It is intended that all of the existing calls to mutateCallInst be replaced with this new interface, so as to be able to eliminate the getPointerElementType call in the existing implementation of that function. However, the actual migration of the calls will be done in future patches to keep the maximum size of patches relatively small. Original commit: KhronosGroup/SPIRV-LLVM-Translator@139b08b
This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days. |
This pull request was closed because it has been stalled for 30 days with no activity. |
Signed-off-by: Mariusz Merecki [email protected]