In certain GEMM cases, ConvertLayoutOp requires an oversized amount of shared local memory. To fix this, #2312 performs the atomic operation directly on the MMA layout, which eliminates the ConvertLayoutOp.
In this issue, we need to estimate the cost of atomic ops performed with the MMA layout and compare it with the cost of ConvertLayout + AtomicRMW. Once we know the performance difference, we can decide whether that pass needs a cost model.
This issue follows up on a review comment in #2312.
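The comparison described in this issue could be prototyped with a toy cost model before any measurements are wired into the pass. The sketch below is only an illustration: every name (`CostParams`, `prefer_mma_atomic`, and so on) is hypothetical, the per-element costs are placeholders to be replaced with measured numbers, and the assumption that the conversion needs one scratch slot per element in shared local memory is a simplification, not how the compiler actually sizes its scratch buffer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostParams:
    """Hypothetical per-element costs in arbitrary units; fill in from benchmarks."""
    atomic_mma: float        # one atomic op issued directly from the MMA layout
    atomic_blocked: float    # one atomic op issued after the layout conversion
    convert_per_elem: float  # moving one element through shared local memory
    slm_capacity_bytes: int  # shared local memory available for the conversion


def cost_atomic_in_mma(num_elems: int, p: CostParams) -> float:
    """Estimated cost of running the atomic ops on the MMA layout."""
    return num_elems * p.atomic_mma


def cost_convert_then_atomic(num_elems: int, elem_bytes: int, p: CostParams) -> float:
    """Estimated cost of ConvertLayout followed by AtomicRMW.

    Simplifying assumption: the conversion needs num_elems * elem_bytes of
    scratch space. If that does not fit, the path is treated as infeasible.
    """
    scratch_bytes = num_elems * elem_bytes
    if scratch_bytes > p.slm_capacity_bytes:
        return float("inf")
    return num_elems * (p.convert_per_elem + p.atomic_blocked)


def prefer_mma_atomic(num_elems: int, elem_bytes: int, p: CostParams) -> bool:
    """Return True when the MMA-layout atomic is estimated to be no more expensive."""
    return cost_atomic_in_mma(num_elems, p) <= cost_convert_then_atomic(
        num_elems, elem_bytes, p
    )


# Placeholder numbers, not measurements.
params = CostParams(
    atomic_mma=4.0,
    atomic_blocked=1.0,
    convert_per_elem=0.5,
    slm_capacity_bytes=64 * 1024,
)
small_tile = prefer_mma_atomic(num_elems=1024, elem_bytes=4, p=params)
large_tile = prefer_mma_atomic(num_elems=32768, elem_bytes=4, p=params)
```

With these placeholder values, the small tile favors converting first, while the large tile exceeds the assumed shared local memory budget and falls back to the MMA-layout atomic. If measurements show that one path wins across all realistic shapes, the pass would not need a cost model at all.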