
Commit 92785e4

[DOC] Clarify what the numbers in the subgroup view mean in the re-arch documentation (#619)
## Summary

Clarifies what the numbers in the subgroup view mean in the re-arch documentation. Copy-pasted Peter's comments from #607. Please revise this PR directly, if needed. Thanks!

---------

Co-authored-by: Peter Caday <[email protected]>
1 parent aecfb09 commit 92785e4


1 file changed

+4
-1
lines changed


media/docs/cpp/xe_rearchitecture.md

Lines changed: 4 additions & 1 deletion
@@ -299,6 +299,9 @@ Now that we have the basic thread mapping rule, let's apply it to a simple block
 \end{array}
 \end{array}
 ```
+The subgroup view shows the data that the entire subgroup owns. The idea here is that the subgroup owns 32 values, enumerated in the order shown. These indices represent the order of elements in registers.
+Recall that Intel GPUs have no notion of a "register owned by a thread." Registers belong to subgroups, because it is a SIMD architecture.
+
 (Following CuTe convention, `TxVy` means thread `x`, value `y`.)
 
 An individual DPAS atom's A matrix follows the same pattern, with height ranging from 1 to 8, and width equal to 8 (tf32), 16 (f16/bf16), or 32 (s8/u8). The DPAS C matrix is also organized this way, except that its width is always 16.
@@ -573,4 +576,4 @@ gemm_device(ATensor const& A, // (M,K)
 
 ## New Collective MMAs
 
-... coming later!
+... coming later!
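
As a reading aid for the DPAS context line in the first hunk, the stated A-matrix widths (8 for tf32, 16 for f16/bf16, 32 for s8/u8) can be sketched as a small compile-time lookup. This is only an illustration, not code from the repository: the names `DpasType`, `element_bits`, `dpas_a_width`, `is_valid_dpas_a_height`, and `kDpasCWidth` are hypothetical, and the "256 bits per row" relation is an observation inferred from the listed widths under the assumption that tf32 is stored in a 32-bit container.

```cpp
#include <cassert>

// Hypothetical tag for the element types named in the documentation.
enum class DpasType { tf32, f16, bf16, s8, u8 };

// Storage size of one element in bits.
// Assumption: tf32 occupies a 32-bit container.
constexpr int element_bits(DpasType t) {
    switch (t) {
        case DpasType::tf32: return 32;
        case DpasType::f16:
        case DpasType::bf16: return 16;
        case DpasType::s8:
        case DpasType::u8:   return 8;
    }
    return 0;  // unreachable for the enumerators above
}

// Width (in elements) of one DPAS atom's A matrix.
// The widths quoted in the doc (8, 16, 32) all equal 256 / element_bits;
// that relation is inferred here, not stated by the source.
constexpr int dpas_a_width(DpasType t) {
    return 256 / element_bits(t);
}

// The doc gives the A-matrix height as ranging from 1 to 8.
constexpr bool is_valid_dpas_a_height(int h) {
    return h >= 1 && h <= 8;
}

// The doc states that the DPAS C matrix width is always 16.
constexpr int kDpasCWidth = 16;

// Compile-time checks against the numbers quoted in the documentation.
static_assert(dpas_a_width(DpasType::tf32) == 8,  "tf32 A width");
static_assert(dpas_a_width(DpasType::f16)  == 16, "f16 A width");
static_assert(dpas_a_width(DpasType::bf16) == 16, "bf16 A width");
static_assert(dpas_a_width(DpasType::s8)   == 32, "s8 A width");
static_assert(dpas_a_width(DpasType::u8)   == 32, "u8 A width");
```

The `static_assert` lines simply restate the widths from the documentation, so the sketch fails to compile if the inferred relation ever disagrees with them.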
