[AIE2P] Combine VST.PUSH.CONV #351
c96965c to 9effa8e
for (MachineInstr &MI : MBB) {
  if (AIEII->isFifoStoreConvOpcode(MI.getOpcode())) {
    const Register DstReg = MI.getOperand(0).getReg();
    Impl(DstReg);
Do we also have to check that there is only one user of DstReg, i.e. that nobody except VST_PUSH uses the intermediate results of the conversion?
Yes, we have to check that. I am checking it in AIE2PInstructionSelector.cpp:
if (!canCombineCONV(StoreI, *ConvOp) ||
    StoreI.getParent() != ConvOp->getParent() || !MRI.hasOneUse(ConvResult))
  return false;
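To make the guard above concrete, here is a minimal standalone sketch of the same two conditions (same parent block, exactly one use of the conversion result). It does not use LLVM: `ToyInstr`, `countUses` and `canFoldConvIntoStore` are hypothetical names invented for illustration, and the real check goes through `canCombineCONV` and `MRI.hasOneUse` as quoted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInstr {
  int Block;             // id of the parent basic block
  int Def;               // virtual register defined, or -1 if none
  std::vector<int> Uses; // virtual registers read by this instruction
};

// Counts every use operand of Reg across the whole toy program.
std::size_t countUses(const std::vector<ToyInstr> &Prog, int Reg) {
  std::size_t N = 0;
  for (const ToyInstr &I : Prog)
    for (int U : I.Uses)
      if (U == Reg)
        ++N;
  return N;
}

// Returns true only if the store reads the conversion result, both
// instructions sit in the same block, and the result has exactly one use,
// so no other instruction observes the intermediate value.
bool canFoldConvIntoStore(const std::vector<ToyInstr> &Prog,
                          std::size_t ConvIdx, std::size_t StoreIdx) {
  assert(ConvIdx < Prog.size() && StoreIdx < Prog.size());
  const ToyInstr &Conv = Prog[ConvIdx];
  const ToyInstr &Store = Prog[StoreIdx];
  if (Conv.Def < 0)
    return false;
  bool StoreReadsConv = false;
  for (int U : Store.Uses)
    if (U == Conv.Def)
      StoreReadsConv = true;
  return StoreReadsConv && Conv.Block == Store.Block &&
         countUses(Prog, Conv.Def) == 1;
}
```

With a second reader of the conversion result, or with the store in a different block, the function returns false and the two instructions stay separate.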
%2:ptrregbank(p0) = IMPLICIT_DEF
%3:fiforegbank(<32 x s32>) = IMPLICIT_DEF
%4:gprregbank(s32) = IMPLICIT_DEF
%5:vregbank(<64 x s8>), %6:gprregbank(<8 x s8>) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.aie2p.v64accfloat.to.v64bfp16ebs8), %1(<64 x s32>)
This seems to be testing the same intrinsic as the test below. I guess you meant to test aie2p.v64accfloat.to.v64bfp16ebs16 instead? Maybe adjust the test name to reflect the change as well.
The two tests use different stores, aie2p.fifo.st.push.576.bfp16 and aie2p.fifo.st.push.544.bfp16, with the same conversion.
%8:mstfifo, %7:mpfs, %9:mr26_fifo_st = VST_PUSH_544_CONV_bfp16ebs16_fp32 %3, %1, %2, %4, implicit-def $srf2bflags, implicit-def $srfifo_of, implicit $crf2bmask, implicit $crrnd
%10:em = IMPLICIT_DEF
%11:mstfifo, %12:mpfs, %13:mr26_fifo_st = VST_FLUSH_512_fifo_1d_flush %8, %7, %9, %10, implicit-def $srfifo_of
%14:mstfifo, %15:mpfs, %16:mr26_fifo_st = VST_FLUSH_512_normal_flush %11, %12, %13, implicit-def $srfifo_of
Do we need the VST_FLUSH_512_normal_flush in the 1d/2d/3d tests? I feel this only makes the test more complex.
There is a case in Conv_bfp where VST_FLUSH_512_normal_flush is followed by VST_FLUSH_512_2D.
llvm/test/CodeGen/AIE/aie2p/GlobalIsel/post-select-optimize-vst.flush.conv.mir
9effa8e to 19288c1
LGTM
Add VST.PUSH.CONV combine in AIE2PInstructionSelector
Modify VST.FLUSH to VST.FLUSH.CONV in AIEPostSelectOptimize
Corresponding MIR tests were added