Add threading support for Apple Accelerate #169
These APIs are new in macOS 15, but only allow saying if you want single threaded or multi-threaded operation, not specifying the actual number of threads to use in a multi-threaded setup.
Force-pushed from 27f9d5b to f94f2ab.
|
How would I examine the value of such a variable? |
|
I think it might be a function. I was just talking with @giordano and he tried this on a machine he has with 16 cores and got this:
julia> @ccall AppleAccelerate.libacc.APPLE_NTHREADS()::Int
16
If you can give that a try and see what it says, that would be nice. |
|
I am on an M2 Max, and this is what I get: Do we know when it was introduced? |
|
For the record, I tested on M4 Max |
Nope, I haven't been able to find anything about it other than a bunch of .tbd symbol lists on GitHub, and one person who wrote a stub function for it. It was basically a shot in the dark that we could even call it properly. In LBT we have to look up the symbol in the symbol table anyway, so we should be safe to use it no matter the macOS version (basically just call it if we find it, otherwise report a default value). |
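The "call it if we find it, otherwise report a default value" idea could look roughly like the sketch below. This is not the LBT code: `accelerate_thread_count` is a hypothetical helper, and the `long (*)(void)` signature is an assumption that only mirrors the `@ccall ... APPLE_NTHREADS()::Int` call tried in this thread, since the symbol is undocumented.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Assumed signature, mirroring `@ccall ...APPLE_NTHREADS()::Int` from this
 * thread. The symbol is undocumented, so this is a guess. */
typedef long (*apple_nthreads_fn)(void);

/* Hypothetical helper: returns the value reported by APPLE_NTHREADS when the
 * symbol can be found, and `fallback` otherwise. */
long accelerate_thread_count(long fallback) {
    const char *path =
        "/System/Library/Frameworks/Accelerate.framework/Accelerate";

    /* dlopen fails on non-macOS systems or if the framework is missing. */
    void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL) {
        return fallback;
    }

    /* The handle is intentionally left open: the returned function pointer
     * is only valid while the library stays loaded. */
    void *sym = dlsym(handle, "APPLE_NTHREADS");
    if (sym == NULL) {
        return fallback;
    }

    long n = ((apple_nthreads_fn)sym)();

    /* Guard against a nonsensical result from an undocumented symbol. */
    return n > 0 ? n : fallback;
}
```

Calling `accelerate_thread_count(2)` would then give the Accelerate-reported value where the symbol exists and 2 everywhere else, so callers never need to check the macOS version themselves.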
|
Uhm, we should be able to find out by bisecting the SDKs hosted on GitHub. |
I'd need to double check, but I seem to remember our
julia> Sys.CPU_THREADS
12
julia> @ccall "/System/Library/Frameworks/Accelerate.framework/Accelerate".APPLE_NTHREADS()::Int
16 |
According to Wikipedia, the M2 Max has 8 performance cores and 4 efficiency cores (although I think ARM calls this arrangement big.LITTLE). So the |
|
|
% git clone --depth=1 https://github.com/nickfnblum/MacOSX-SDKs
% grep -r '_APPLE_NTHREADS' --include='*.tbd' MacOSX-SDKs
MacOSX-SDKs/MacOSX11.3.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.3.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.3.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.1.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.1.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.1.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.0.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.0.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX11.0.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _BLASStateRelease,
MacOSX-SDKs/MacOSX10.15.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _CAXPY, _CAXPY_,
MacOSX-SDKs/MacOSX10.15.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _CAXPY, _CAXPY_,
MacOSX-SDKs/MacOSX10.15.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _CAXPY, _CAXPY_,
MacOSX-SDKs/MacOSX10.14.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APL_sgemm_LU, _APL_sgemm_QR, _APL_strsm, _APPLE_NTHREADS,
MacOSX-SDKs/MacOSX10.13.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APL_sgemm_LU, _APL_sgemm_QR, _APL_strsm, _APPLE_NTHREADS,
MacOSX-SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APL_sgemm_LU, _APL_sgemm_QR, _APL_strsm, _APPLE_NTHREADS,
MacOSX-SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _CAXPY,
MacOSX-SDKs/MacOSX10.10.sdk/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.tbd: _APPLE_NTHREADS, _ATLU_DestroyThreadMemory, _CAXPY,
I'd say this was introduced in macOS 10.10. |
|
Mirrored |
src/threading.c
Outdated
if(nthreads == ACCELERATE_BLAS_THREADING_MULTI_THREADED) {
    // This number is arbitrary right now, but greater than 1 to mean multi-threaded.
    // TODO: Can we guesstimate the number of threads from the APPLE_NTHREADS symbol in accelerate?
    max_threads = 2;
I went with APPLE_NTHREADS in AppleAccelerate.jl
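For readers following along, here is a minimal sketch of how a numeric thread count can be mapped onto a two-state threading API and back. The enum and both function names are hypothetical stand-ins, not the Accelerate or LBT identifiers, and the fallback of 2 simply reuses the arbitrary value from the snippet under review.

```c
/* Hypothetical stand-ins for the two modes the macOS 15 API distinguishes.
 * These are not the real Accelerate constants. */
typedef enum {
    MODE_SINGLE_THREADED = 0,
    MODE_MULTI_THREADED = 1
} threading_mode;

/* Setter direction: any request above one thread can only mean
 * "multi-threaded", because an exact count cannot be expressed. */
threading_mode mode_for_request(int nthreads) {
    return nthreads > 1 ? MODE_MULTI_THREADED : MODE_SINGLE_THREADED;
}

/* Getter direction: report 1 in single-threaded mode. In multi-threaded mode
 * report the estimate (for example a value obtained from APPLE_NTHREADS) if
 * it is usable, otherwise the arbitrary placeholder 2. */
int reported_threads(threading_mode mode, int multi_estimate) {
    if (mode == MODE_SINGLE_THREADED) {
        return 1;
    }
    return multi_estimate > 1 ? multi_estimate : 2;
}
```

With this shape, a request for 8 threads and a request for 3 threads both select the multi-threaded mode, and the getter reports the same value for either, which is the limitation the PR description points out.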
Force-pushed from 6e84867 to 6fee1ab.
Force-pushed from 6fee1ab to 006da93.
|
Ok, added some tests for Apple Accelerate now (both loading the symbols and calling this threading API). This should be good for review and merge. After merge, I want to hold off tagging until PRs #168 and #167 are merged, then we can do one release with all three items. (I think I am close to finishing those). |
|
What happens on macOS on Intel, where both MKL (installed) and Accelerate are available? |
This is dependent on if someone loads either the Loading |
These APIs are new in macOS 15, but only allow saying if you want single threaded or multi-threaded operation, not specifying the actual number of threads to use in a multi-threaded setup.
There is a symbol called APPLE_NTHREADS in the accelerate library, which might be able to say how many threads it will use in multi-threaded mode? However, I can't find any information about that anywhere online (it is just listed in various symbol tables for the library). If someone with a macOS 15+ machine can examine that symbol to see what it does/looks like, we could update the TODO in this PR potentially.
This is the LBT implementation of JuliaLinearAlgebra/AppleAccelerate.jl#88. Fixes #115.
cc @ViralBShah @giordano