Releases: ROCm/Tensile

Tensile 4.41.0 for ROCm 6.2.4

06 Nov 19:55
81ae953

Tensile code for ROCm 6.2.4 did not change. The library was rebuilt for the updated ROCm 6.2.4 stack.

Tensile 4.42.0 for ROCm 6.3.3

19 Feb 17:47
aca95d1

Tensile code for ROCm 6.3.3 did not change. The library was rebuilt for the updated ROCm 6.3.3 stack.

Tensile 4.42.0 for ROCm 6.3.2

28 Jan 15:43
aca95d1

Tensile code for ROCm 6.3.2 did not change. The library was rebuilt for the updated ROCm 6.3.2 stack.

Tensile 4.42.0 for ROCm 6.3.1

20 Dec 16:12
aca95d1

Tensile code for ROCm 6.3.1 did not change. The library was rebuilt for the updated ROCm 6.3.1 stack.

Tensile 4.42.0 for ROCm 6.3.0

03 Dec 19:49
aca95d1

Additions

  • add contributor and developer guide
  • add testing and documentation for MasterSolutionLibrary.ArchitectureIndexMap and remapSolutionIndicesStartingFrom
  • add gfx12 support
  • add functions for writing master file
  • add tPrint and reconcile printing options
  • add Python unit test coverage report
  • factor embed library logic into a function and add a test
  • add clang++ as cxx-compiler option for windows
  • add logic to cope with different compilers
  • add generateManifest function, rename generateManifest to toFile, and move it to Utilities
  • add profiling CI job
  • add support for amdclang and use defaults
  • add architecture management functions to TensileCreateLibrary
  • add TensileCreateLibrary cli reference docs
  • add new documentation (sphinx prototype, build out skeleton)

Optimizations

  • add prediction model for optimal number of Stream-K tiles to run
  • use analytical grid size prediction model for Stream-K
  • remap XCC-based workgroup for Stream-K kernels
  • add two-tile algorithm with Stream-K after DP
  • add atomic 2-tile Stream-K and clean-up tuning parameters
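
For readers unfamiliar with Stream-K, the items above build on one basic idea: instead of assigning whole output tiles to workgroups, the flattened iteration space (tiles times inner-loop iterations per tile) is split evenly, so a workgroup may finish one tile partway and continue into the next. The following is a minimal Python sketch of that partitioning only. It is not Tensile's kernel code, and the function and parameter names (stream_k_partition, iters_per_tile, num_workgroups) are hypothetical.

```python
def stream_k_partition(num_tiles, iters_per_tile, num_workgroups):
    """Split the flattened iteration space evenly across workgroups.

    Returns one list per workgroup; each entry is a tuple
    (tile_index, iter_begin, iter_end) with iter_end exclusive.
    Illustrative only: names and layout are assumptions, not Tensile's API.
    """
    total_iters = num_tiles * iters_per_tile
    base, extra = divmod(total_iters, num_workgroups)

    assignments = []
    start = 0
    for wg in range(num_workgroups):
        # The first `extra` workgroups take one additional iteration each.
        count = base + (1 if wg < extra else 0)
        end = start + count

        segments = []
        it = start
        while it < end:
            tile = it // iters_per_tile
            tile_first = tile * iters_per_tile
            seg_end = min(end, tile_first + iters_per_tile)
            segments.append((tile, it - tile_first, seg_end - tile_first))
            it = seg_end

        assignments.append(segments)
        start = end
    return assignments


# Three tiles of four iterations each, shared by two workgroups:
# tile 1 is split between them, so its partial results must be combined.
plan = stream_k_partition(num_tiles=3, iters_per_tile=4, num_workgroups=2)
```

Any tile that appears in more than one workgroup's list holds partial sums that have to be accumulated before the result is final; how that accumulation is done is outside the scope of this sketch.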

Changes

  • improve rocBLAS build output: allow warning suppression, ignore only developer warnings, and add a progress bar and quiet printing
  • reorder extensions for Windows in the which() function
  • remove deprecated flag from CI profiling job
  • update amdclang++ and asm directories
  • update duplicate marking tests with mocks
  • remove diagnostic print, restore print ordering, and add missing print option
  • bump rocm-docs-core from 1.2.0 to 1.5.0 in /docs/sphinx
  • refactor kernel duplicate matching
  • refactor generateLogicDataAndSolutions
  • remove globals from prepAsm
  • restrict XCC mapping to gfx942
  • refactor argument parsing in TensileCreateLibrary
  • disable failing rhel9 tests
  • change line length for formatting to 100 characters
  • change YAML operations to use C libyaml backend
  • improve warning wording
  • remove deprecated package-library option
  • update clang support for Windows
  • update supportedCompiler function
  • use conditional choices and defaults
  • remove duplicate which() function and perform minor cleanup
  • refactor sanity check in TensileCreateLibrary
  • factor client config logic from TensileCreateLibrary main into createClientConfig
  • use glob to find logic files in TensileCreateLibrary
  • use function to confirm supported compiler rather than raw logic
  • update verifyManifest in TensileCreateLibrary
  • update RTD configs
  • clean up the CMake to prevent redundant work in client builds
  • update Stream-K debug settings
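
As an illustration of the "use glob to find logic files" change, a recursive glob can replace a hand-written directory walk. The sketch below uses only the Python standard library. The helper name find_logic_files and the assumption that logic files end in .yaml are illustrative and not taken from TensileCreateLibrary.

```python
import glob
import os


def find_logic_files(root, pattern="*.yaml"):
    """Return sorted paths under `root` whose file name matches `pattern`.

    "**" with recursive=True matches zero or more directory levels,
    so files directly inside `root` are included as well.
    Hypothetical helper: the name and the default pattern are assumptions.
    """
    return sorted(glob.glob(os.path.join(root, "**", pattern), recursive=True))
```

Sorting the result keeps the file order stable across platforms, which helps when the list feeds a reproducible build step.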

Fixes

  • fix Stream-K XCC configs for gfx942
  • update WMMA capability command for ISA 10+
  • fix progress bar character encoding error on Windows
  • fix solution redundancy removal
  • fix tuning imports for pyyaml
  • fix printing ASM capabilities for ROCm < 6.3
  • fix code objects by filtering kernels with build errors and unprocessed kernels
  • fully qualify std::get in contraction solutions
  • add -v flag and change system invocation
  • use conditional imports for new dependencies to fix the yaml CSafe load and dump imports and the rich terminal print import
  • fix comments on scalarStaticDivideAndRemainder
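
The conditional-import fix above follows a common pattern: prefer an optional, faster dependency and fall back when it is missing. PyYAML exposes CSafeLoader and CSafeDumper only when it was built against libyaml, and rich is a separate package. The sketch below shows one way to write such fallbacks. The helper names are hypothetical, this is not the code Tensile uses, and a stand-in object replaces the real yaml module so the example runs without PyYAML installed.

```python
from types import SimpleNamespace


def pick_yaml_classes(yaml_module):
    """Return (loader, dumper), preferring the libyaml-backed classes.

    Falls back to the pure-Python SafeLoader/SafeDumper when the C
    variants are absent. Hypothetical helper for illustration.
    """
    loader = getattr(yaml_module, "CSafeLoader", None) or yaml_module.SafeLoader
    dumper = getattr(yaml_module, "CSafeDumper", None) or yaml_module.SafeDumper
    return loader, dumper


def pick_printer():
    """Return rich's print if the package is installed, else the built-in print."""
    try:
        from rich import print as rich_print
        return rich_print
    except ImportError:
        return print


# Stand-in for a PyYAML build without libyaml (only pure-Python classes).
fake_yaml = SimpleNamespace(SafeLoader="SafeLoader", SafeDumper="SafeDumper")
loader, dumper = pick_yaml_classes(fake_yaml)
```

With a real installation, the same helper would be called as pick_yaml_classes(yaml) after import yaml, and the returned classes passed to yaml.load and yaml.dump.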

Tensile 4.40.0 for ROCm 6.1.5

12 Mar 18:30
bf05992

Tensile code for ROCm 6.1.5 did not change. The library was rebuilt for the updated ROCm 6.1.5 stack.

Tensile 4.40.0 for ROCm 6.1.2

04 Jun 16:52
bf05992

Tensile code for ROCm 6.1.2 did not change. The library was rebuilt for the updated ROCm 6.1.2 stack.

Tensile 4.40.0 for ROCm 6.1.1

08 May 17:59
bf05992

Tensile code for ROCm 6.1.1 did not change. The library was rebuilt for the updated ROCm 6.1.1 stack.

Tensile 4.41.0 for ROCm 6.2.2

27 Sep 16:01
dbc2062

Tensile code for ROCm 6.2.2 did not change. The library was rebuilt for the updated ROCm 6.2.2 stack.

Tensile 4.41.0 for ROCm 6.2.1

20 Sep 19:57
dbc2062

Tensile code for ROCm 6.2.1 did not change. The library was rebuilt for the updated ROCm 6.2.1 stack.