[AIE2P] Combine VST.PUSH.CONV #351

Merged
abhinay-anubola merged 2 commits into aie-public from sanubola.combine.fifo.store.conv on Feb 19, 2025

Conversation

abhinay-anubola
Collaborator

Add VST.PUSH.CONV combine in AIE2PInstructionSelector
Modify VST.FLUSH to VST.FLUSH.CONV in AIEPostSelectOptimize
Add corresponding MIR tests

@abhinay-anubola abhinay-anubola force-pushed the sanubola.combine.fifo.store.conv branch 3 times, most recently from c96965c to 9effa8e Compare February 17, 2025 05:56
for (MachineInstr &MI : MBB) {
if (AIEII->isFifoStoreConvOpcode(MI.getOpcode())) {
const Register DstReg = MI.getOperand(0).getReg();
Impl(DstReg);
Collaborator

Do we also have to check that there is only one user of DstReg, i.e. that nobody except VST_PUSH uses the intermediate result of the conversion?

Collaborator Author

Yes, we have to check. I am checking that in AIE2PInstructionSelector.cpp:

if (!canCombineCONV(StoreI, *ConvOp) ||
      StoreI.getParent() != ConvOp->getParent() || !MRI.hasOneUse(ConvResult))
    return false;
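
Editor's sketch: the guard quoted above can be illustrated with a toy model that does not depend on LLVM. The `ToyInstr` struct and the `countUses` and `canFoldConvIntoStore` helpers are hypothetical names introduced only for this illustration; they are not part of the PR or of LLVM. The sketch assumes that a fold is legal only when the conversion and the store sit in the same block and the store is the sole reader of the conversion result. The opcode-specific `canCombineCONV` check from the PR is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct ToyInstr {
  int Block;             // id of the parent basic block
  int Def;               // virtual register defined, or -1 if none
  std::vector<int> Uses; // virtual registers read by this instruction
};

// Count every operand, across all instructions, that reads Reg.
std::size_t countUses(const std::vector<ToyInstr> &Instrs, int Reg) {
  std::size_t N = 0;
  for (const ToyInstr &I : Instrs)
    for (int U : I.Uses)
      if (U == Reg)
        ++N;
  return N;
}

// Same shape as the quoted guard: reject the fold unless both instructions
// share a block and the store is the only reader of the conversion result.
bool canFoldConvIntoStore(const std::vector<ToyInstr> &Instrs,
                          std::size_t ConvIdx, std::size_t StoreIdx) {
  assert(ConvIdx < Instrs.size() && StoreIdx < Instrs.size());
  const ToyInstr &Conv = Instrs[ConvIdx];
  const ToyInstr &Store = Instrs[StoreIdx];
  if (Conv.Block != Store.Block)
    return false;
  // The store itself must read the conversion result...
  bool StoreReadsConv = false;
  for (int U : Store.Uses)
    if (U == Conv.Def)
      StoreReadsConv = true;
  // ...and no other operand anywhere may read it.
  return StoreReadsConv && countUses(Instrs, Conv.Def) == 1;
}
```

If a second instruction also read the conversion result, folding the conversion into the store would leave that reader without a definition, which is why the one-use condition matters.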

%2:ptrregbank(p0) = IMPLICIT_DEF
%3:fiforegbank(<32 x s32>) = IMPLICIT_DEF
%4:gprregbank(s32) = IMPLICIT_DEF
%5:vregbank(<64 x s8>), %6:gprregbank(<8 x s8>) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.aie2p.v64accfloat.to.v64bfp16ebs8), %1(<64 x s32>)
Collaborator

This seems to be testing the same intrinsic as the test below. I guess you meant to test aie2p.v64accfloat.to.v64bfp16ebs16 instead? Maybe adjust the test name to reflect the change as well.

Collaborator Author

The two tests use different stores, aie2p.fifo.st.push.576.bfp16 and aie2p.fifo.st.push.544.bfp16, with the same conversion.

%8:mstfifo, %7:mpfs, %9:mr26_fifo_st = VST_PUSH_544_CONV_bfp16ebs16_fp32 %3, %1, %2, %4, implicit-def $srf2bflags, implicit-def $srfifo_of, implicit $crf2bmask, implicit $crrnd
%10:em = IMPLICIT_DEF
%11:mstfifo, %12:mpfs, %13:mr26_fifo_st = VST_FLUSH_512_fifo_1d_flush %8, %7, %9, %10, implicit-def $srfifo_of
%14:mstfifo, %15:mpfs, %16:mr26_fifo_st = VST_FLUSH_512_normal_flush %11, %12, %13, implicit-def $srfifo_of
Collaborator

Do we need the VST_FLUSH_512_normal_flush in the 1d/2d/3d tests? I feel it only makes the tests more complex.

Collaborator Author

There is a case in Conv_bfp where VST_FLUSH_512_normal_flush is followed by VST_FLUSH_512_2D.
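
Editor's sketch: the point about chained flushes can be illustrated with a toy model of the post-select rewrite. Everything here is an assumption made for illustration. `ToyFifoInstr`, `rewriteFlushChains`, and the opcode strings are placeholders rather than real AIE2P opcodes or LLVM APIs, and the thread does not state which flush flavors actually have a CONV form. The sketch assumes a linear chain in program order: starting at a converting push, it follows the FIFO state register through each successive flush and switches that flush to its converting variant.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a FIFO store/flush instruction (hypothetical).
struct ToyFifoInstr {
  std::string Opcode; // placeholder names, not real AIE2P opcodes
  int FifoIn;         // FIFO state register consumed
  int FifoOut;        // FIFO state register produced
};

// For every converting push, walk the chain of flushes fed by its FIFO
// state and rewrite each one to the converting variant.
// Returns the number of flushes rewritten.
std::size_t rewriteFlushChains(std::vector<ToyFifoInstr> &Instrs) {
  std::size_t Rewritten = 0;
  for (std::size_t I = 0; I < Instrs.size(); ++I) {
    if (Instrs[I].Opcode != "VST_PUSH_CONV")
      continue;
    assert(Instrs[I].FifoOut >= 0 && "push must define a FIFO state");
    int Live = Instrs[I].FifoOut; // FIFO state currently being tracked
    for (std::size_t J = I + 1; J < Instrs.size(); ++J) {
      if (Instrs[J].FifoIn != Live)
        continue; // unrelated instruction
      if (Instrs[J].Opcode != "VST_FLUSH")
        break; // the chain ends at the first non-flush user
      Instrs[J].Opcode = "VST_FLUSH_CONV";
      Live = Instrs[J].FifoOut; // keep following the chain
      ++Rewritten;
    }
  }
  return Rewritten;
}
```

Under this model, a test containing two back-to-back flushes exercises the step where the walk continues past the first rewritten flush, which a single-flush test would not cover.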

@abhinay-anubola abhinay-anubola force-pushed the sanubola.combine.fifo.store.conv branch from 9effa8e to 19288c1 Compare February 18, 2025 06:34
Collaborator

@gbossu gbossu left a comment

LGTM

@abhinay-anubola abhinay-anubola merged commit a54c2fe into aie-public Feb 19, 2025
8 checks passed
@abhinay-anubola abhinay-anubola deleted the sanubola.combine.fifo.store.conv branch February 19, 2025 06:51
5 participants