forked from nod-ai/iree-amd-aie
Commit
Support matmul 4d inputs on pack-peel-4-level-tiling (nod-ai#1098)
This PR adds basic support for matmul with 4d inputs and output on pack-peel-4-level-tiling. In theory, the tiling strategy and pipeline should support different layouts of matmul 4d operations. However, the order of the input dims (both inner and outer) is crucial for correct compilation and correct results. To ensure correctness and to allow comparison, this PR only adds a test that corresponds to the standard matmul, meaning that the order of the input dims for this operation matches the L2 shapes of the matmul op after the first-level packing, i.e.,

```
C += matmul4d(A,B) where A:MxKxM0xK0, B:NxKxK0xN0, C:NxMxM0xN0
```

The test class and instance added in run.py are preliminary and experimental. Generalization of the test class will be addressed in follow-ups.

Runtime comparison:

On Phoenix runner: `matmul_512_4096_512_bf16_f32 : 1141 us vs matmul4d_16_128_8_bf16_f32: 998 us`

On Strix runner: `matmul_512_4096_512_i8_i32 : 806 us vs matmul4d_16_128_8_i8_i32: 491 us`
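The two test names in each runtime comparison appear to describe the same problem size. A minimal sketch of that arithmetic follows, assuming the names are ordered `M_N_K` (an inference: only that ordering divides evenly by the tile sizes) and using the inner tile sizes 32, 64, 32 that are hard-coded in the template added by this commit. The helper names `outer_dims` and `shapes_4d` are hypothetical and are not part of run.py.

```python
# Inner tile sizes hard-coded in the matmul4d template:
# A tiles are M0 x K0 = 32 x 64, B tiles are K0 x N0 = 64 x 32.
M0, K0, N0 = 32, 64, 32


def outer_dims(m, n, k):
    """Return the outer tile counts (M, N, K) for a flat m x n x k matmul.

    Assumes the flat sizes are given in M, N, K order (an assumption about
    the test naming, not something stated in the commit message).
    """
    for size, tile in ((m, M0), (n, N0), (k, K0)):
        if size % tile != 0:
            raise ValueError(f"{size} is not a multiple of tile size {tile}")
    return m // M0, n // N0, k // K0


def shapes_4d(m, n, k):
    """Return the 4d shapes of A, B and C in the layout used by this PR."""
    M, N, K = outer_dims(m, n, k)
    return {
        "A": (M, K, M0, K0),  # MxKxM0xK0
        "B": (N, K, K0, N0),  # NxKxK0xN0
        "C": (N, M, M0, N0),  # NxMxM0xN0
    }


if __name__ == "__main__":
    print(outer_dims(512, 4096, 512))
    print(shapes_4d(512, 4096, 512))
```

Under these assumptions, `matmul_512_4096_512` maps to outer dims 16, 128, 8, which matches the `matmul4d_16_128_8` name in the comparison.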
Showing 12 changed files with 282 additions and 97 deletions.
17 changes: 17 additions & 0 deletions
build_tools/ci/cpu_comparison/matmul_template/matmul4d_MxKxM0xK0_NxKxK0xN0.mlir
@@ -0,0 +1,17 @@
// input ${M}x${K}x32x64x${TYPE1}
// input ${N}x${K}x64x32x${TYPE1}

func.func @matmul4d(%arg0: tensor<${M}x${K}x32x64x${TYPE1}>, %arg1: tensor<${N}x${K}x64x32x${TYPE1}>) -> tensor<${N}x${M}x32x32x${TYPE2}> {
  %cst = arith.constant ${ZERO} : ${TYPE2}
  %0 = tensor.empty() : tensor<${N}x${M}x32x32x${TYPE2}>
  %1 = linalg.fill ins(%cst : ${TYPE2}) outs(%0 : tensor<${N}x${M}x32x32x${TYPE2}>) -> tensor<${N}x${M}x32x32x${TYPE2}>
  %2 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d2, d3, d5)>, affine_map<(d0, d1, d2, d3, d4, d5) -> (d1, d2, d5, d4)>, affine_map<(d0, d1, d2, d3, d4, d5) -> (d1, d0, d3, d4)>], iterator_types = ["parallel", "parallel", "reduction", "parallel", "parallel", "reduction"]} ins(%arg0, %arg1 : tensor<${M}x${K}x32x64x${TYPE1}>, tensor<${N}x${K}x64x32x${TYPE1}>) outs(%1 : tensor<${N}x${M}x32x32x${TYPE2}>) {
  ^bb0(%in: ${TYPE1}, %in_1: ${TYPE1}, %out: ${TYPE2}):
    %12 = ${EXT} %in : ${TYPE1} to ${TYPE2}
    %13 = ${EXT} %in_1 : ${TYPE1} to ${TYPE2}
    %14 = ${MUL} %12, %13 : ${TYPE2}
    %15 = ${ADD} %out, %14 : ${TYPE2}
    linalg.yield %15 : ${TYPE2}
  } -> tensor<${N}x${M}x32x32x${TYPE2}>
  return %2 : tensor<${N}x${M}x32x32x${TYPE2}>
}
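As a reading aid for the `linalg.generic` above, here is a pure-Python reference of what its indexing maps and iterator types compute, written as a sketch rather than as the project's actual CPU reference. Loop variables `m, n, k, m0, n0, k0` stand for `d0` through `d5`. The helpers `pack_a`, `pack_b`, `unpack_c` and `matmul2d` are hypothetical names introduced only to show that the 4d result agrees with an ordinary 2d matmul once the layout is undone; the real template fixes the inner tiles at 32, 64, 32, while this sketch keeps them as parameters so that small sizes can be checked quickly.

```python
def matmul4d(A, B, M, N, K, M0, N0, K0):
    """C[n][m][m0][n0] += A[m][k][m0][k0] * B[n][k][k0][n0]."""
    C = [[[[0] * N0 for _ in range(M0)] for _ in range(M)] for _ in range(N)]
    for m in range(M):                  # d0, parallel
        for n in range(N):              # d1, parallel
            for k in range(K):          # d2, reduction
                for m0 in range(M0):    # d3, parallel
                    for n0 in range(N0):        # d4, parallel
                        acc = C[n][m][m0][n0]
                        for k0 in range(K0):    # d5, reduction
                            acc += A[m][k][m0][k0] * B[n][k][k0][n0]
                        C[n][m][m0][n0] = acc
    return C


def pack_a(a, M, K, M0, K0):
    """Pack a flat (M*M0) x (K*K0) matrix into the MxKxM0xK0 layout."""
    return [[[[a[m * M0 + m0][k * K0 + k0] for k0 in range(K0)]
              for m0 in range(M0)] for k in range(K)] for m in range(M)]


def pack_b(b, N, K, K0, N0):
    """Pack a flat (K*K0) x (N*N0) matrix into the NxKxK0xN0 layout."""
    return [[[[b[k * K0 + k0][n * N0 + n0] for n0 in range(N0)]
              for k0 in range(K0)] for k in range(K)] for n in range(N)]


def unpack_c(C, M, N, M0, N0):
    """Unpack the NxMxM0xN0 result into a flat (M*M0) x (N*N0) matrix."""
    return [[C[n][m][m0][n0] for n in range(N) for n0 in range(N0)]
            for m in range(M) for m0 in range(M0)]


def matmul2d(a, b):
    """Plain 2d matmul used as the comparison baseline."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]
```

Packing both operands, running `matmul4d`, and unpacking the result should reproduce `matmul2d` on the flat inputs, which is the sense in which this layout "corresponds to the standard matmul".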