-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End to end support for bfp16 mac/mul intrinsics #279
Conversation
case Intrinsic::aie2p_BFP576_BFP1152_ACC2048_mul_conf: | ||
case Intrinsic::aie2p_BFP576_BFP1152_ACC2048_negmul_conf: | ||
MI = MIB.buildInstr(TII.getOpCode(I), {DstReg}, | ||
{Src1Reg, Src2Reg, /*Mode*/ I.getOperand(8).getReg()}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-tidy can help with those comments, like: /*Mode=*/I.getOperand(8).getReg()}
.addImm(AIE2P::sub_bfp576_lo) | ||
.addReg(Src2RegSub1) | ||
.addImm(AIE2P::sub_bfp576_hi); | ||
constrainOperandRegClass(*MF, TRI, MRI, TII, RBI, *RegSeq1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we have a lambda where we pass something like constrainRegSeq(RegSeq1, AIE2P::VEC512RegClass, 1);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
@@ -18,6 +18,8 @@ | |||
#include "AIE2PTargetMachine.h" | |||
#include "AIEBaseInstructionSelector.h" | |||
#include "MCTargetDesc/aie2p/AIE2PMCTargetDesc.h" | |||
#include "llvm/ADT/ArrayRef.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those includes are not necessary here.
879d252
to
94481ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add register bank select tests for these intrinsics? No need to test every single type, only each intrinsic would be enough, as all vector types follow the same code path.
94481ed
to
97bbf52
Compare
Done! |
@@ -38827,6 +38827,376 @@ addmsc_elem_64_conf(v64bfloat16 a, int sgn_x, v64bfloat16 b, int sgn_y, | |||
conf); | |||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: 2025 for this file?
@@ -301,6 +301,45 @@ def : PatInaccessibleMem<(int_aie2p_I512_I512_ACC512_bf_mac_conf VEC512:$s1, VEC | |||
def : PatInaccessibleMem<(int_aie2p_I512_I512_ACC512_bf_mul_conf VEC512:$s1, VEC512:$s2, eR:$acc), | |||
(EXTRACT_SUBREG (VMUL_f_vmul_bf_vmul_bf_core_X_X VEC512:$s1, VEC512:$s2, eR:$acc), sub_512_acc_lo)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: 2024-2025 in the header
ConstrainRegSeq(*RegSeq2, AIE2P::EXPVEC64RegClass, 3); | ||
ConstrainRegSeq(*RegSeq3, AIE2P::VEC512RegClass, 1); | ||
ConstrainRegSeq(*RegSeq3, AIE2P::EXPVEC64RegClass, 3); | ||
ConstrainRegSeq(*RegSeq4, AIE2P::mEXaRegClass, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: VEC576 instead of mEXa. You could add this if not already there.
// BFP16 MAC MUL | ||
class AIE2PMUL_BFP16V64BFP16 | ||
: Intrinsic<[llvm_v64i32_ty], [llvm_v64i8_ty, llvm_v8i8_ty, llvm_v64i8_ty, llvm_v8i8_ty, llvm_i32_ty], | ||
[IntrNoMem]>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These intrinsics read control registers. To model this correctly, we set the following attributes [IntrReadMem, IntrInaccessibleMemOnly]
Could you update?
@@ -2973,6 +2984,88 @@ AIE2PInstructionSelector::getCombinedOpcodeSRSUPS( | |||
return {}; | |||
} | |||
|
|||
bool AIE2PInstructionSelector ::selectVMAC_BFP(MachineInstr &I, | |||
MachineRegisterInfo &MRI) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can have patterns for all these intrinsics. I tried this one and it works after defining the 2 higher composed sub-register indices. See below:
def : PatInaccessibleMem<(int_aie2p_BFP576_BFP1152_ACC2048_mul_conf VEC512:$s1, EXPVEC64:$s2, VEC512:$s3, VEC512:$s4, EXPVEC64:$s5, EXPVEC64:$s6, eR:$acc),
(VMUL_f_vmul_bfp_vmul_bfp_core_EX_EY
(REG_SEQUENCE mEXv, VEC512:$s1, sub_bfp16_x, EXPVEC64:$s2, sub_bfp16_e),
(REG_SEQUENCE eEY,
VEC512:$s3, sub_bfp16_x, EXPVEC64:$s5, sub_bfp16_e, VEC512:$s4, sub_bfp576_hi_bfp16_x, EXPVEC64:$s6, sub_bfp576_hi_bfp16_e),
eR:$acc)>;
In AIE2PRegisterInfo.td
:
def sub_bfp576_hi_bfp16_e : ComposedSubRegIndex<sub_bfp576_hi, sub_bfp16_e>;
def sub_bfp576_hi_bfp16_x : ComposedSubRegIndex<sub_bfp576_hi, sub_bfp16_x>;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@khallouh This is really nice. I could not make it working with REG_SEQUENCE
. Thanks for spending time on it.
08814d5
to
050b95c
Compare
050b95c
to
535eba3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add the intrinsics to the isAccIntrinsic
in RegBankSelect? And to test this store the result of the MAC operations instead of copying it into an accumulator register.
The rest looks good, thanks for addressing the comments.
535eba3
to
e66e24e
Compare
Done. For the register bank tests instead of copying to accumulator register, returning the virtual register. This will do the trick. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
No description provided.