SME1 based direct kernel (with alpha and beta) for cblas_sgemm level 3 #5380

Merged

matcraje
Contributor

This PR adds an sgemm_direct kernel (with support for alpha and beta) based on the SME1 architecture.

@martin-frbg
Collaborator

Thanks - this looks like an interesting addition (if not competitor) to the present sgemm_direct kernel. The CI error suggests that perhaps the entire kernel needs to be guarded with the __ARM_FEATURE_SME define (or the ...2VLx2VL function should have an empty alternative for when that feature macro is undefined)?
Also, could you please review the attribution at the top of the kernel file? You are certainly welcome to put your copyright statement there, but I expect having a "confidential and proprietary" warning in the middle of a BSD-licensed code base would be confusing.

@matcraje matcraje force-pushed the topic/sgemm_direct_sme1_alpha_beta branch from 38540ea to 442273d Compare July 16, 2025 05:24
@matcraje
Contributor Author

Hi Martin. Thanks for your quick review. It’s very helpful.

I have updated the PR to address both issues: I added an empty alternative for when the feature macro is undefined, and I modified the copyright statement.
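
For readers following along, the "empty alternative" pattern can be sketched roughly as below. This is only an illustration: the names `sketch_sgemm_direct` and `SKETCH_HAVE_SME` are hypothetical and are not the symbols used in the PR, and a plain scalar loop stands in for the actual SME code so that the sketch compiles without `arm_sme.h`.

```c
#include <stdio.h>

/* Hypothetical names throughout; not the symbols used in the PR. */
#if defined(__ARM_FEATURE_SME)
#define SKETCH_HAVE_SME 1
/* The SME implementation would live in this branch. A scalar loop stands
 * in for it here (an assumption made for illustration only). It computes
 * C = alpha * A * B + beta * C for row-major A (m x k), B (k x n), C (m x n). */
static void sketch_sgemm_direct(int m, int n, int k, float alpha,
                                const float *a, const float *b,
                                float beta, float *c)
{
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
#else
#define SKETCH_HAVE_SME 0
/* Empty alternative: the symbol still exists, so builds without SME
 * compile and link, but the body does nothing and C is left untouched. */
static void sketch_sgemm_direct(int m, int n, int k, float alpha,
                                const float *a, const float *b,
                                float beta, float *c)
{
    (void)m; (void)n; (void)k; (void)alpha;
    (void)a; (void)b; (void)beta; (void)c;
}
#endif
```

A caller would be expected to check something like `SKETCH_HAVE_SME` (or a runtime capability flag) before relying on the result, since the empty variant deliberately leaves the output untouched.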

@matcraje matcraje force-pushed the topic/sgemm_direct_sme1_alpha_beta branch from 442273d to 366deb1 Compare July 16, 2025 09:00
@martin-frbg
Collaborator

seems now we have AppleClang acting up over things in its own arm_sme header file

@matcraje matcraje force-pushed the topic/sgemm_direct_sme1_alpha_beta branch from 366deb1 to 831c4e3 Compare July 17, 2025 06:19
@matcraje
Contributor Author

matcraje commented Jul 17, 2025

seems now we have AppleClang acting up over things in its own arm_sme header file

From the error log, I understand that the error comes from clang-15.0.0 not supporting the 'arm_streaming_' attributes. I tried to reproduce it locally but found that SME isn't supported in clang-15.

As a mitigation, I am adding an extra guard alongside __ARM_FEATURE_SME.

I have pushed the update below:
#if defined(__ARM_FEATURE_SME) && defined(__ARM_FEATURE_LOCALLY_STREAMING)

If the above doesn't work, I could check the clang version explicitly:
#if defined(__ARM_FEATURE_SME) && defined(__clang__) && __clang_major__ >= 16
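
As a rough side-by-side of the two guard variants, the sketch below evaluates each condition at compile time and exposes the result as a small bitmask. The `SKETCH_*` macros and `sketch_guard_mask` are hypothetical helper names introduced only for illustration; the version-based variant uses the compiler's predefined `__clang__` and `__clang_major__` macros.

```c
#include <stdio.h>

/* Variant 1: rely only on ACLE feature macros. */
#if defined(__ARM_FEATURE_SME) && defined(__ARM_FEATURE_LOCALLY_STREAMING)
#define SKETCH_SME_BY_FEATURE 1
#else
#define SKETCH_SME_BY_FEATURE 0
#endif

/* Variant 2: additionally require clang with major version 16 or newer.
 * If __clang__ is not defined, the && short-circuits to false. */
#if defined(__ARM_FEATURE_SME) && defined(__clang__) && __clang_major__ >= 16
#define SKETCH_SME_BY_VERSION 1
#else
#define SKETCH_SME_BY_VERSION 0
#endif

/* Hypothetical helper: bit 0 is set when variant 1 holds,
 * bit 1 is set when variant 2 holds. */
static int sketch_guard_mask(void)
{
    return SKETCH_SME_BY_FEATURE | (SKETCH_SME_BY_VERSION << 1);
}
```

One design consideration: as written, the version-based variant would also disable the kernel for any non-clang compiler, whereas the feature-macro variant leaves the decision to whatever the compiler advertises. It may also be worth keeping in mind that `__clang_major__` on AppleClang follows Apple's own version numbering, which does not necessarily line up with upstream LLVM releases.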

@matcraje matcraje force-pushed the topic/sgemm_direct_sme1_alpha_beta branch from 831c4e3 to 70ef30c Compare July 17, 2025 09:10
@matcraje matcraje force-pushed the topic/sgemm_direct_sme1_alpha_beta branch from 70ef30c to eae0abf Compare July 17, 2025 10:49
@matcraje
Contributor Author

@martin-frbg
Two checks are still failing. Could you please let me know how to resolve these failures?

@martin-frbg
Collaborator

Both are unrelated - the loongarch job ran out of time and the IBM-Z build on Jenkins failed to access GitHub (and still does today).

@martin-frbg martin-frbg added this to the 0.3.31 milestone Jul 18, 2025
@martin-frbg martin-frbg merged commit 39c90f9 into OpenMathLib:develop Jul 18, 2025
94 of 95 checks passed