-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC][AIE2][AA] Disambiguate pointers updated by intrinsics #139
Conversation
7f688c0
to
2874f8d
Compare
2874f8d
to
48cfc8f
Compare
48cfc8f
to
e6fa65d
Compare
66a8326
to
aab392c
Compare
} else { | ||
if (auto *PHI = dyn_cast<PHINode>(V)) { | ||
// Reached final destination. | ||
return {CountIntrinsicCall, ArgSet}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check that this PHI node corresponds to PhiA
/PhiB
in aliasAIEIntrinsic
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can assume the same as the comment below
aab392c
to
d3a0a58
Compare
|
||
// We can reach a pointer definition starting from a counter. | ||
static bool isDirectReachable(SmallVector<const Value *, 5> CounterArgSet, | ||
const Value *V) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe
static bool isAddressingIntrinsicOutput(ArrayRef<const Value*> AddressingOutputs, const Value *InitialAddress);
eb57536
to
c12d012
Compare
} | ||
// This information is crucial to calculate the number | ||
// of pointer updates before TargetPointer. | ||
if (LastUpdate == TargetPointer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we actually check skipCast(LastUpdate) == TargetPointer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you maybe add a test where there is a pointer cast right before the addressing intrinsic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! @with_addrspace_cast
.
return true; | ||
} | ||
|
||
bool notOverlap(const IntrinsicChainInfo &Other) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super-nit: I'd find overlaps()
clearer, i.e. avoiding double negations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that would have to be mayOverlap()
}; | ||
|
||
struct IntrinsicChainInfo { | ||
// Calls to an specific point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: upto a specific
struct IntrinsicChainInfo { | ||
// Calls to an specific point. | ||
unsigned IntrinsicCalls = 0; | ||
// Total calls accross the loop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: across
|
||
struct IntrinsicArgsInfo { | ||
SmallVector<const Value *, 5> ArgSet; | ||
SmallVector<const Value *, 5> CounterArgSet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they are ordered. 'Set' suggests they are not.
return true; | ||
} | ||
|
||
bool notOverlap(const IntrinsicChainInfo &Other) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that would have to be mayOverlap()
Intrinsic::ID ID = IntrinsicCall->getIntrinsicID(); | ||
IntrinsicArgsInfo Args; | ||
unsigned InPtrIdx; | ||
isAIEPtrAddIntrinsic(ID, InPtrIdx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unused?
// the same value | ||
LastUpdate = GEP->getPointerOperand(); | ||
} else if (auto *PHI = dyn_cast<PHINode>(LastUpdate)) { | ||
// Reached final destination. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make this the first case in the if-then-elseif ladder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed a bit the code and the current position matters, because now the intrinsic part is in the else block.
if (!ValA) | ||
return AliasResult::MayAlias; | ||
|
||
const Value *BaseA = getUnderlyingObjectAIE(ValA); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factor into std::optional<ValueBasePair>
lambda?
// Non uniform address update accross the loop. | ||
// * one can have more updates or, | ||
// * non-uniform updates were found. | ||
if (!ATracking || !BTracking) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would check ATracking before trying to compute BTracking.
c12d012
to
7ca270d
Compare
%10 = extractvalue { ptr, i20 } %8, 0 | ||
load i8, ptr %10 | ||
|
||
%12 = tail call { ptr, i20 } @llvm.aie2.add.2d(ptr %7, i20 0, i20 1, i20 2, i20 %6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to get the alias info for %7
? Maybe by inserting a store to %7
on top of the one for %ptrcast
?
I'm suspecting that the alias info will be wrong for %7
, because in the code below, I think that LastUpdate
will point to %ptrcast
after visiting the instrinsic, which is different from %7
, so the update of LastCalls
will not trigger.
%ptrcast = addrspacecast ptr %7 to ptr addrspace(3)
%12 = tail call { ptr, i20 } @llvm.aie2.add.2d(ptr %7, i20 0, i20 1, i20 2, i20 %6)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cast situation is handled by the @with_addrspace_cast_phi
test.
2b9be93
to
d421e71
Compare
// This case is interesting because the first store from the 3rd iteration | ||
// will alias with the 3rd load of the second iteration since we are not | ||
// updating the pointer after the last load and store (post. inc.). | ||
ASSERT_TRUE(AIE::aliasAcrossVirtualUnrolls(Store1, Load3, 3, 2) == |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: EXPECT_TRUE
is better because it doesn't stop the test execution at the first failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I was not aware! I think I will leave as is this time to not interrupt CI again ;-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this! Already updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great work!
if (LastCalls) { | ||
ChainInfo.IntrinsicCalls = ChainInfo.TotalIntrinsicCalls - *LastCalls; | ||
return ChainInfo; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no else after return?
Now we can disambiguate some cases where aie2_add_2d and aie2_add_3d are used to update pointers.
Now we can check instructions across virtual iterations.
d421e71
to
1312139
Compare
} | ||
|
||
/// This gives all indexes counter operands. | ||
static SmallVector<unsigned, 5> getAddIntrinsicCounterOps(Intrinsic::ID ID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it looks as if getAddIntriscOps
is complementary to getAddIntrinsicCounterOps
, so can be computed as allInputOperands - getAddIntrinsicCounterOps
This is RFC. Soon we will need more AA support, and this PR tries to clear a bit the way.
Now we can disambiguate some cases where aie2_add_2d and aie2_add_3d are used to update pointers.
No QoR failures.
Gains for Reduce* family up to 50% (instruction count).
Average gain is ~2% (~350 benchmarks).
Updated to include post-RA AA support.