
musa: handle __hgt2_mask (available starting from MUSA SDK rc4.3.0) #15413


Merged
merged 1 commit into from
Aug 19, 2025

Conversation

yeahdongcn
Collaborator

@yeahdongcn yeahdongcn commented Aug 19, 2025


__hgt2_mask will be available starting from MUSA SDK rc4.3.0.

The following code snippet shows how MUSART_VERSION is calculated; with the values shown (4.2.0) it evaluates to 40200:

#define __MUSA_API_VER_MAJOR__      4
#define __MUSA_API_VER_MINOR__      2
#define __MUSA_API_VER_PATCH__      0
#define __MUSA_API_VER_TWEAK__      
#define __MUSART_API_VERSION ((__MUSA_API_VER_MAJOR__ * 10000) + (__MUSA_API_VER_MINOR__ * 100) + __MUSA_API_VER_PATCH__)
#define MUSART_VERSION              __MUSART_API_VERSION
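
For illustration only, here is a minimal sketch of how a version gate on this encoding might look. It assumes rc4.3.0 reports major 4, minor 3, patch 0 (so the threshold would be 40300); the actual guard used in this PR is not shown here and may differ. All `EXAMPLE_*` and `example_*` names are hypothetical stand-ins, not SDK or ggml identifiers.

```c
#include <assert.h>  /* static_assert (C11) */

/* Hypothetical stand-ins for the header values quoted above (4.2.0). */
#define EXAMPLE_VER_MAJOR 4
#define EXAMPLE_VER_MINOR 2
#define EXAMPLE_VER_PATCH 0
#define EXAMPLE_MUSART_VERSION \
    ((EXAMPLE_VER_MAJOR * 10000) + (EXAMPLE_VER_MINOR * 100) + EXAMPLE_VER_PATCH)

/* Assumed threshold: 4.3.0 encodes to 4*10000 + 3*100 + 0 = 40300. */
#define EXAMPLE_HGT2_MASK_MIN_VERSION 40300

/* Compile-time check of the arithmetic for the quoted values. */
static_assert(EXAMPLE_MUSART_VERSION == 40200, "4.2.0 encodes to 40200");

/* Same formula as a function, for checking other versions at run time. */
static int example_musart_version(int major, int minor, int patch) {
    return major * 10000 + minor * 100 + patch;
}

/* Returns nonzero when the encoded version meets the assumed threshold. */
static int example_has_hgt2_mask(int version) {
    return version >= EXAMPLE_HGT2_MASK_MIN_VERSION;
}

/* Preprocessor form of the same gate: selects a code path at build time. */
#if EXAMPLE_MUSART_VERSION >= EXAMPLE_HGT2_MASK_MIN_VERSION
#define EXAMPLE_USE_NATIVE_HGT2_MASK 1
#else
#define EXAMPLE_USE_NATIVE_HGT2_MASK 0  /* 40200 < 40300: fallback path */
#endif
```

With the 4.2.0 values the gate selects the fallback path; only an SDK reporting 40300 or higher would take the native path under this assumption.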

Testing Done

Extra build test on HIP with -DGGML_HIP_ROCWMMA_FATTN=ON passed:

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build -DGGML_HIP=ON -DGGML_HIP_ROCWMMA_FATTN=ON -DCMAKE_BUILD_TYPE=Release \
    && cmake --build build --config Release -- -j $(nproc)

@github-actions github-actions bot added the "Nvidia GPU" (Issues specific to Nvidia GPUs) and "ggml" (changes relating to the ggml tensor library for machine learning) labels Aug 19, 2025
@yeahdongcn yeahdongcn force-pushed the xd/__hgt2_mask branch 3 times, most recently from f995cdc to b6b6203 Compare August 19, 2025 03:22
@yeahdongcn yeahdongcn marked this pull request as ready for review August 19, 2025 05:44
@IMbackK IMbackK merged commit 67f09a3 into ggml-org:master Aug 19, 2025
43 checks passed