[RFC][AIE2][AA] Disambiguate pointers updated by intrinsics #139

Merged
merged 2 commits into from
Sep 2, 2024

@andcarminati andcarminati commented Jul 30, 2024

This is an RFC. We will soon need more AA support, and this PR tries to clear the way a bit.

Now we can disambiguate some cases where aie2_add_2d and aie2_add_3d are used to update pointers.

No QoR failures.
Gains for Reduce* family up to 50% (instruction count).
Average gain is ~2% (~350 benchmarks).

Updated to include post-RA AA support.

@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch from 2874f8d to 48cfc8f Compare August 1, 2024 09:02
@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch from 48cfc8f to e6fa65d Compare August 7, 2024 13:28
@andcarminati andcarminati marked this pull request as draft August 7, 2024 13:53
@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch 2 times, most recently from 66a8326 to aab392c Compare August 16, 2024 08:54
} else {
  if (auto *PHI = dyn_cast<PHINode>(V)) {
    // Reached final destination.
    return {CountIntrinsicCall, ArgSet};
Collaborator

Should we check that this PHI node corresponds to PhiA/PhiB in aliasAIEIntrinsic?

Collaborator Author

We can assume the same as the comment below

@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch from aab392c to d3a0a58 Compare August 20, 2024 12:39

// We can reach a pointer definition starting from a counter.
static bool isDirectReachable(SmallVector<const Value *, 5> CounterArgSet,
                              const Value *V) {
Collaborator

Nit: maybe

static bool isAddressingIntrinsicOutput(ArrayRef<const Value*> AddressingOutputs, const Value *InitialAddress);

@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch 2 times, most recently from eb57536 to c12d012 Compare August 26, 2024 07:16
}
// This information is crucial to calculate the number
// of pointer updates before TargetPointer.
if (LastUpdate == TargetPointer)
Collaborator

Should we actually check skipCast(LastUpdate) == TargetPointer?

Collaborator

Could you maybe add a test where there is a pointer cast right before the addressing intrinsic?

Collaborator Author

Done! @with_addrspace_cast.

return true;
}

bool notOverlap(const IntrinsicChainInfo &Other) const {
Collaborator

Super-nit: I'd find overlaps() clearer, i.e. avoiding double negations.

Collaborator

I guess that would have to be mayOverlap()
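
To illustrate the naming point in this thread, here is a minimal sketch of a positively phrased, conservative query. `AccessRange` and `provablyDisjoint` are hypothetical names invented for this example, and the interval test is only a stand-in; it is not the logic of `notOverlap` in the PR.

```cpp
#include <cassert>

// Hypothetical half-open range [Begin, End); not the PR's IntrinsicChainInfo.
struct AccessRange {
  unsigned Begin = 0;
  unsigned End = 0;

  // Positive, conservative query: true whenever the two ranges might share
  // at least one element.
  bool mayOverlap(const AccessRange &Other) const {
    assert(Begin <= End && Other.Begin <= Other.End && "malformed range");
    return Begin < Other.End && Other.Begin < End;
  }
};

// Call sites negate once instead of reading a negated name.
inline bool provablyDisjoint(const AccessRange &A, const AccessRange &B) {
  return !A.mayOverlap(B);
}
```

With this shape, a caller writes `if (!A.mayOverlap(B))` for the disjoint case, and the "may" prefix signals that a `true` answer is conservative, in the same spirit as `MayAlias`.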

@andcarminati andcarminati marked this pull request as ready for review August 26, 2024 13:30
};

struct IntrinsicChainInfo {
// Calls to an specific point.
Collaborator

nit: up to a specific

struct IntrinsicChainInfo {
// Calls to an specific point.
unsigned IntrinsicCalls = 0;
// Total calls accross the loop.
Collaborator

nit: across


struct IntrinsicArgsInfo {
SmallVector<const Value *, 5> ArgSet;
SmallVector<const Value *, 5> CounterArgSet;
Collaborator

I think they are ordered. 'Set' suggests they are not.

Intrinsic::ID ID = IntrinsicCall->getIntrinsicID();
IntrinsicArgsInfo Args;
unsigned InPtrIdx;
isAIEPtrAddIntrinsic(ID, InPtrIdx);
Collaborator

This looks unused?

  // the same value
  LastUpdate = GEP->getPointerOperand();
} else if (auto *PHI = dyn_cast<PHINode>(LastUpdate)) {
  // Reached final destination.
Collaborator

I would make this the first case in the if-then-elseif ladder.

Collaborator Author

I changed the code a bit, and the current position now matters because the intrinsic part is in the else block.

if (!ValA)
return AliasResult::MayAlias;

const Value *BaseA = getUnderlyingObjectAIE(ValA);
Collaborator

@martien-de-jong martien-de-jong Aug 26, 2024

factor into std::optional<ValueBasePair> lambda?

// Non uniform address update accross the loop.
// * one can have more updates or,
// * non-uniform updates were found.
if (!ATracking || !BTracking)
Collaborator

I would check ATracking before trying to compute BTracking.
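
The ordering suggested here can be sketched as follows: bail out as soon as the first analysis fails, so the second one is never computed. Everything in the block is hypothetical (`trackUpdates`, `aliasSketch`, `SketchResult`, and the integer "pointers"); the call counter exists only to make the short-circuit observable and is not part of the PR.

```cpp
#include <cassert>
#include <optional>

enum class SketchResult { MayAlias, NoAlias };

// Hypothetical tracker: fails for negative inputs and counts how often it runs.
inline std::optional<unsigned> trackUpdates(int Ptr, unsigned &Calls) {
  ++Calls;
  if (Ptr < 0)
    return std::nullopt;
  return static_cast<unsigned>(Ptr);
}

inline SketchResult aliasSketch(int A, int B, unsigned &Calls) {
  const std::optional<unsigned> ATracking = trackUpdates(A, Calls);
  if (!ATracking)
    return SketchResult::MayAlias; // B is never analysed on this path.

  const std::optional<unsigned> BTracking = trackUpdates(B, Calls);
  if (!BTracking)
    return SketchResult::MayAlias;

  assert(ATracking && BTracking && "both results are valid past this point");
  return *ATracking == *BTracking ? SketchResult::MayAlias
                                  : SketchResult::NoAlias;
}
```

When the first tracker fails, the counter ends at one rather than two, which is the saving the comment is after.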

@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch from c12d012 to 7ca270d Compare August 27, 2024 11:59
%10 = extractvalue { ptr, i20 } %8, 0
load i8, ptr %10

%12 = tail call { ptr, i20 } @llvm.aie2.add.2d(ptr %7, i20 0, i20 1, i20 2, i20 %6)
Collaborator

Is there any way to get the alias info for %7? Maybe by inserting a store to %7 on top of the one for %ptrcast?

I suspect that the alias info will be wrong for %7: in the code below, I think that LastUpdate will point to %ptrcast after visiting the intrinsic, which is different from %7, so the update of LastCalls will not trigger.

%ptrcast = addrspacecast ptr %7 to ptr addrspace(3)
%12 = tail call { ptr, i20 } @llvm.aie2.add.2d(ptr %7, i20 0, i20 1, i20 2, i20 %6)

Collaborator Author

The cast situation is handled by the @with_addrspace_cast_phi test.

@andcarminati andcarminati force-pushed the andreu.aa.intrinsic.dis branch 2 times, most recently from 2b9be93 to d421e71 Compare August 28, 2024 13:05
// This case is interesting because the first store from the 3rd iteration
// will alias with the 3rd load of the second iteration since we are not
// updating the pointer after the last load and store (post. inc.).
ASSERT_TRUE(AIE::aliasAcrossVirtualUnrolls(Store1, Load3, 3, 2) ==
Collaborator

Nit: EXPECT_TRUE is better because it doesn't stop the test execution at the first failure.

Collaborator Author

@andcarminati andcarminati Aug 28, 2024

Interesting, I was not aware! I think I will leave it as is this time so as not to interrupt CI again ;-).

Collaborator Author

I considered this! Already updated.
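
The post-increment situation described in the test comment quoted in this thread can be reproduced with a toy model. The sketch assumes, purely for illustration, that all accesses of an iteration go through one pointer that advances by one slot after every access except the last; `slotOf` and `sameSlot` are hypothetical helpers and do not mirror `AIE::aliasAcrossVirtualUnrolls` or its parameters.

```cpp
#include <cassert>

// Toy model, not the PR's analysis: slot touched by access number Access
// (0-based) of iteration Iter (0-based).
inline unsigned slotOf(unsigned Iter, unsigned Access,
                       unsigned AccessesPerIter) {
  assert(AccessesPerIter > 0 && Access < AccessesPerIter &&
         "access index out of range");
  // Every access but the last post-increments, so the pointer moves
  // AccessesPerIter - 1 slots per iteration.
  return Iter * (AccessesPerIter - 1) + Access;
}

// True when the two (iteration, access) pairs touch the same slot.
inline bool sameSlot(unsigned IterA, unsigned AccessA, unsigned IterB,
                     unsigned AccessB, unsigned AccessesPerIter) {
  return slotOf(IterA, AccessA, AccessesPerIter) ==
         slotOf(IterB, AccessB, AccessesPerIter);
}
```

With three accesses per iteration, the first access of the third iteration (`Iter = 2`, `Access = 0`) and the third access of the second iteration (`Iter = 1`, `Access = 2`) both land on slot 4. In this model the last access of any iteration always coincides with the first access of the next one, because the pointer is not bumped in between.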

gbossu previously approved these changes Aug 28, 2024
Collaborator

@gbossu gbossu left a comment

LGTM, great work!

if (LastCalls) {
  ChainInfo.IntrinsicCalls = ChainInfo.TotalIntrinsicCalls - *LastCalls;
  return ChainInfo;
} else {
Collaborator

no else after return?
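
A minimal sketch of the shape this nit asks for, following the LLVM coding-standard advice not to use `else` after a `return`. The struct below models only the two counters visible in the diff, `finalizeChain` is a hypothetical name, and returning `std::nullopt` on the fall-through path is an assumption, since the original `else` body is not shown.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the struct in the diff; only two counters are
// modelled.
struct ChainInfoSketch {
  unsigned IntrinsicCalls = 0;
  unsigned TotalIntrinsicCalls = 0;
};

// Returns the updated info when LastCalls is known, std::nullopt otherwise.
inline std::optional<ChainInfoSketch>
finalizeChain(ChainInfoSketch Info, std::optional<unsigned> LastCalls) {
  if (LastCalls) {
    assert(*LastCalls <= Info.TotalIntrinsicCalls &&
           "cannot have seen more calls than the loop contains");
    Info.IntrinsicCalls = Info.TotalIntrinsicCalls - *LastCalls;
    return Info;
  }
  // No 'else' needed: the branch above has already returned.
  return std::nullopt;
}
```

Dropping the `else` removes one level of nesting from whatever follows the early return.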

Now we can disambiguate some cases where aie2_add_2d and aie2_add_3d
are used to update pointers.
Now we can check instructions across virtual iterations.
}

/// This gives all indexes counter operands.
static SmallVector<unsigned, 5> getAddIntrinsicCounterOps(Intrinsic::ID ID) {
Collaborator

nit: it looks as if getAddIntriscOps is complementary to getAddIntrinsicCounterOps, so can be computed as allInputOperands - getAddIntrinsicCounterOps
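
The complement computation suggested here could look roughly like the sketch below. `nonCounterOps` is a hypothetical helper, standard containers stand in for `SmallVector`, and the operand indices used in any call are illustrative; nothing here claims which operands of the real intrinsics are counters.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: every input operand index in [0, NumInputs) that is
// not listed in CounterOps, in ascending order.
inline std::vector<unsigned>
nonCounterOps(unsigned NumInputs, const std::vector<unsigned> &CounterOps) {
  assert(std::all_of(CounterOps.begin(), CounterOps.end(),
                     [NumInputs](unsigned Idx) { return Idx < NumInputs; }) &&
         "counter index out of range");
  std::vector<unsigned> Result;
  for (unsigned Idx = 0; Idx < NumInputs; ++Idx)
    if (std::find(CounterOps.begin(), CounterOps.end(), Idx) ==
        CounterOps.end())
      Result.push_back(Idx);
  return Result;
}
```

Deriving one list from the other keeps a single source of truth: adding a counter operand to the first list automatically removes it from the second.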

@andcarminati andcarminati merged commit 5492f45 into aie-public Sep 2, 2024
8 checks passed
@andcarminati andcarminati deleted the andreu.aa.intrinsic.dis branch September 2, 2024 15:51
3 participants