Releases: ROCm/TransferBench
rocm-7.1.1
ROCm release v7.1.1
TransferBench v1.65.00
v1.65.00
Added
- Added warp-level dispatch support via GFX_SE_TYPE environment variable
- GFX_SE_TYPE=0 (default): Threadblock-level dispatch, each subexecutor is a threadblock
- GFX_SE_TYPE=1: Warp-level dispatch, each subexecutor is a single warp
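A minimal sketch of how a launcher script might select the dispatch mode described above. Only the GFX_SE_TYPE variable and its two values come from these notes; the helper name `transferbench_env` and the constant names are hypothetical, chosen for illustration.

```python
import os

# Values documented in the v1.65.00 notes.
SE_TYPE_THREADBLOCK = 0  # default: each subexecutor is a threadblock
SE_TYPE_WARP = 1         # each subexecutor is a single warp


def transferbench_env(se_type=SE_TYPE_THREADBLOCK, base_env=None):
    """Return a copy of the environment with GFX_SE_TYPE set (hypothetical helper)."""
    if se_type not in (SE_TYPE_THREADBLOCK, SE_TYPE_WARP):
        raise ValueError(f"GFX_SE_TYPE must be 0 or 1, got {se_type!r}")
    # Copy so the caller's environment is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["GFX_SE_TYPE"] = str(se_type)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the benchmark; the executable path and its arguments depend on the local build and are not shown here.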
rocm-7.1.0
ROCm release v7.1.0
rocm-7.0.2
ROCm release v7.0.2
rocm-6.4.4
ROCm release v6.4.4
rocm-7.0.1
ROCm release v7.0.1
rocm-7.0.0
ROCm release v7.0.0
TransferBench v1.64.00
v1.64.00
Added
- Added BLOCKSIZES to a2asweep preset to also allow sweeping over threadblock sizes
- Added FILL_COMPRESS to allow more control over input data pattern
  - FILL_COMPRESS takes in a comma-separated list of integer percentages (that must add up to 100)
    that sets the percentages of 64B lines to be filled by random/1B0/2B0/4B0/32B0 data patterns
  - Bins:
    - 0 - random
    - 1 - 1B0: upper 1 byte of each aligned 2 bytes is 0
    - 2 - 2B0: upper 2 bytes of each aligned 4 bytes are 0
    - 3 - 4B0: upper 4 bytes of each aligned 8 bytes are 0
    - 4 - 32B0: upper 32 bytes of each aligned 64-byte line are 0
  - FILL_PATTERN will be ignored if FILL_COMPRESS is specified
  - Additional details about data patterns generated will be printed if the debug env var DUMP_LINES is
    set to a non-zero value, which also corresponds to how many 64-byte lines will be printed
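The FILL_COMPRESS bins can be illustrated with a short Python sketch. This is not TransferBench's implementation: the function names are hypothetical, the list is assumed to carry exactly five values (one per bin), and "upper" bytes are assumed to be the highest-addressed bytes of each aligned group (little-endian view).

```python
import random

LINE_BYTES = 64
# bin -> (aligned group size in bytes, number of upper bytes zeroed per group)
BIN_LAYOUT = {0: (LINE_BYTES, 0), 1: (2, 1), 2: (4, 2), 3: (8, 4), 4: (64, 32)}


def parse_fill_compress(value):
    """Parse e.g. '20,20,20,20,20' into five integer percentages."""
    pcts = [int(tok) for tok in value.split(",")]
    # Assumption: one percentage per bin, in bin order 0..4.
    if len(pcts) != len(BIN_LAYOUT):
        raise ValueError(f"expected {len(BIN_LAYOUT)} percentages, got {len(pcts)}")
    if any(p < 0 for p in pcts) or sum(pcts) != 100:
        raise ValueError("percentages must be non-negative and add up to 100")
    return pcts


def make_line(bin_id, rng):
    """Build one 64-byte line following the pattern of the given bin."""
    group, zeroed = BIN_LAYOUT[bin_id]
    line = bytearray(rng.randrange(256) for _ in range(LINE_BYTES))
    for start in range(0, LINE_BYTES, group):
        # Assumption: zero the highest-addressed bytes of each aligned group.
        for i in range(start + group - zeroed, start + group):
            line[i] = 0
    return bytes(line)
```

For example, `make_line(4, random.Random(0))` yields a line whose last 32 bytes are zero, while bin 0 leaves all 64 bytes random.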
Modified
- Increased GFX_BLOCKSIZE limit from 512 to 1024 (still requires multiple of 64)
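The GFX_BLOCKSIZE constraint can be expressed as a small check. This is a sketch only: the function name is hypothetical, and the lower bound (a positive value) is an assumption, since the notes state only the upper limit and the multiple-of-64 requirement.

```python
def check_gfx_blocksize(value, limit=1024):
    """Return value if it is a positive multiple of 64 no larger than limit."""
    # Assumption: the smallest accepted block size is 64.
    if value <= 0 or value % 64 != 0 or value > limit:
        raise ValueError(
            f"GFX_BLOCKSIZE must be a multiple of 64 in [64, {limit}], got {value}"
        )
    return value
```

Passing `limit=512` reproduces the limit that applied before v1.64.00.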
Fixed
- Fixed bug when using BYTE_OFFSET
TransferBench v1.63.00
v1.63.00
Added
- Added gfx950, gfx1150, and gfx1151 to default GPU targets list in CMake builds
Modified
- Removed self-GPU check for DMA engine copies
- Switched to amdclang++ as primary compiler
- healthcheck preset adds HBM testing and support for more MI3XX variants
Fixed
- Fixed issue when using "P" memory type and specific DMA subengines
- Fixed issue with subiteration timing reports
rocm-6.4.3
ROCm release v6.4.3