-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SplitLogicalObjFifo] Fix split-logicalobjfifo pass to analyse unique producers/consumers ObjFifos #1060
base: main
Are you sure you want to change the base?
Conversation
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
...er/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.h
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
def3a83
to
0904a94
Compare
.../plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.cpp
Outdated
Show resolved
Hide resolved
.../plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.cpp
Outdated
Show resolved
Hide resolved
...er/plugins/target/AMD-AIE/iree-amd-aie/Transforms/Utils/AMDAIELogicalObjFifoSplittingUtils.h
Outdated
Show resolved
Hide resolved
26ef401
to
1703271
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/IR/AMDAIELogicalObjFifoOpInterface.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
531ee50
to
da1c62d
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
2e57be9
to
8c3f762
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This revision is much more concise and cleaner. I have more comments on the test.
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Show resolved
Hide resolved
3594447
to
42868e6
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
2de4496
to
f7bcd0d
Compare
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/test/split_logicalobjfifos.mlir
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one nit and also update the PR title to producers/consumers instead of L1/L2.
compiler/plugins/target/AMD-AIE/iree-amd-aie/Transforms/AMDAIESplitLogicalObjFifos.cpp
Outdated
Show resolved
Hide resolved
…SplitLogicalObjFifos.cpp Co-authored-by: Jorn Tuyls <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes! I have some final comments.
// Although we have 8 columns, L2 LHS buffers needs to be split into only 1, L2 RHS into 2 and | ||
// L2 OUT into 1. | ||
// This is because we decide the split factor for the L2 ObjectFifo depending on :- | ||
// GCD(unique producer/consumer for the respective ObjectFifos being split, number of columns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description is much better. But I think if the split factor is 1, it's not splitting. Besides, you could emphasize the purpose of test. Something like:
This test demonstrates the case when the factor is not simply decided by the number of columns but the number of unique producers/consumers. In the example, although we are using 8 AIE columns, L2 LHS and output buffers are not split because there's only one producer/consumer, while L2 RHS buffer is split into 2 because there are 2 producers/consumers.
// CHECK: amdaie.dma_cpy_nd(%{{.*}}[0, 0] [256, 512] [4096, 1], %[[LOF_OUT_L2]][0, 0, 0, 0] [8, 32, 16, 32] [1024, 32, 8192, 1]) : | ||
// CHECK: } | ||
#executable_target_amdaie_pdi_fb = #hal.executable.target<"amd-aie", "amdaie-pdi-fb", {num_cols = 8 : i32, num_rows = 4 : i32, target_device = "npu4", ukernels = "none"}> | ||
#translation = #iree_codegen.translation_info<pipeline = Custom> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: #translation can be removed.
In order to decide the split factor we were basing the inference solely on the number of columns available.
Because of this, for 4x8 array and for the new pipeline, we were splitting L2 buffers :-
As a result, the tiles being assigned were :-
This causes exhaustion of DMA channels. Refer to this thread for the discussion thread.
The L2 buffer split should be :-
So that later on when the tiles are being assigned, the expected no. of tile assignments for LHS/RHS/OUT matches the corresponding L2 buffers.
This PR aims to analyse the number of unique producers/consumers ObjFifos for the ObjFifo being split..
e2e CI test for Matmul both with/without ukernel via
pack-peel-4-level-tiling
pipeline targeting 4x8 array on Strix have been added.Signed-off-by: Abhishek Varma [email protected]