Skip to content

Complete BLIS integration with reference LAPACK #666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ChrisRackauckas-Claude
Copy link
Contributor

@ChrisRackauckas-Claude ChrisRackauckas-Claude commented Aug 3, 2025

Summary

This PR fixes the BLIS integration to properly use BLIS for BLAS operations and reference LAPACK for LAPACK operations, addressing the original issue where BLIS was incorrectly used for both.

Key Changes

  • Fixed library separation: Changed LAPACK function calls (getrf, getrs) from libblis to liblapack
  • Added LAPACK_jll dependency: Added LAPACK_jll to weak dependencies and extension configuration
  • Updated extension definition: LinearSolveBLISExt = ["blis_jll", "LAPACK_jll"]
  • Integrated with existing tests: Added BLISLUFactorization to standard test loops instead of separate test files
  • Follows established patterns: Uses same conditional loading approach as other factorization methods (MKL, etc.)

Technical Implementation

Library Stack:

  • BLIS: Optimized BLAS operations via blis_jll
  • Reference LAPACK: LAPACK operations via LAPACK_jll
  • Proper separation: BLAS functions use libblis, LAPACK functions use liblapack

Extension Dependencies:

LinearSolveBLISExt = ["blis_jll", "LAPACK_jll"]

Testing

BLIS is now tested as part of the standard "Concrete Factorizations" test suite in basictests.jl:

# Test BLIS if extension is available
if Base.get_extension(LinearSolve, :LinearSolveBLISExt) \!== nothing
    push\!(test_algs, BLISLUFactorization())
end

Tests are automatically skipped if BLIS dependencies are not available, following the same pattern as other optional dependencies.

Performance Benefits

  • BLIS BLAS: Highly optimized BLAS operations with modern CPU optimization
  • Stable LAPACK: Uses well-tested reference LAPACK implementation
  • Seamless Integration: Works with existing LinearSolve.jl API
  • Clean Testing: Integrated into existing test framework without code duplication

Before/After

Before:

  • BLIS library used for both BLAS and LAPACK operations (causing undefined symbol errors)
  • Separate test files and test groups

After:

  • BLIS for BLAS operations + reference LAPACK for LAPACK operations (stable and working)
  • Integrated into existing test framework following established patterns

🤖 Generated with Claude Code

ChrisRackauckas and others added 4 commits August 3, 2025 13:12
Test case:

```julia
using LinearSolve, blis_jll

A = rand(4, 4)
b = rand(4)
prob = LinearProblem(A, b)
sol = solve(prob,LinearSolve.BLISLUFactorization())
sol.u
```

throws:

```julia
julia> sol = solve(prob,LinearSolve.BLISLUFactorization())
ERROR: TypeError: in ccall: first argument not a pointer or valid constant expression, expected Ptr, got a value of type Tuple{Symbol, Ptr{Nothing}}
Stacktrace:
 [1] getrf!(A::Matrix{Float64}; ipiv::Vector{Int64}, info::Base.RefValue{Int64}, check::Bool)
   @ LinearSolveBLISExt ~/.julia/dev/LinearSolve/ext/LinearSolveBLISExt.jl:67
 [2] getrf!
   @ LinearSolveBLISExt ~/.julia/dev/LinearSolve/ext/LinearSolveBLISExt.jl:55 [inlined]
 [3] #solve!SciML#9
   @ LinearSolveBLISExt ~/.julia/dev/LinearSolve/ext/LinearSolveBLISExt.jl:222 [inlined]
 [4] solve!
   @ LinearSolveBLISExt ~/.julia/dev/LinearSolve/ext/LinearSolveBLISExt.jl:216 [inlined]
 [5] #solve!SciML#6
   @ LinearSolve ~/.julia/dev/LinearSolve/src/common.jl:209 [inlined]
 [6] solve!
   @ LinearSolve ~/.julia/dev/LinearSolve/src/common.jl:208 [inlined]
 [7] #solve#5
   @ LinearSolve ~/.julia/dev/LinearSolve/src/common.jl:205 [inlined]
 [8] solve(::LinearProblem{…}, ::LinearSolve.BLISLUFactorization)
   @ LinearSolve ~/.julia/dev/LinearSolve/src/common.jl:202
 [9] top-level scope
   @ REPL[8]:1
Some type information was truncated. Use `show(err)` to see complete types.
```
- Updated LinearSolveBLISExt to use both blis_jll and LAPACK_jll
- Changed LAPACK function calls (getrf, getrs) to use liblapack instead of libblis
- Added LAPACK_jll to weak dependencies and extension configuration
- Created comprehensive test suite for BLIS + reference LAPACK functionality
- Tests cover Float32/64, ComplexF32/64, accuracy, caching, and comparison with default solvers
- All tests pass, confirming correct BLIS + reference LAPACK integration

This fixes the issue where BLIS was incorrectly used for both BLAS and LAPACK operations.
The correct approach is BLIS for optimized BLAS operations + reference LAPACK for stable LAPACK operations.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Removed dedicated BLIS test files and test group
- Added BLISLUFactorization to existing test loops in basictests.jl
- Added conditional loading of BLIS dependencies in tests
- BLIS tests now run as part of standard "Concrete Factorizations" test suite
- Tests are automatically skipped if BLIS dependencies are not available

This follows the established pattern used by other factorization methods like MKL,
making the integration cleaner and more maintainable.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@ChrisRackauckas-Claude
Copy link
Contributor Author

📊 BLIS Performance Benchmark Results

I've created a comprehensive benchmark script to test the BLIS integration performance. Here are the results comparing BLIS with other LU factorization algorithms:

Benchmark Script

The benchmark script and documentation are available in this gist: https://gist.github.com/ChrisRackauckas-Claude/18165706c1230defea412fa7e9e86d45

Performance Analysis

  • RecursiveFactorization: Best overall performance, especially for smaller matrices
  • OpenBLAS: Solid baseline performance with good scaling
  • BLIS: Alternative BLAS implementation with different performance characteristics

The benchmark uses the same detection logic as LinearSolve.jl itself, including proper platform detection for MKL, Apple Accelerate, etc.

Results Plot

!LU Factorization Performance Comparison

@ChrisRackauckas-Claude
Copy link
Contributor Author

📈 Benchmark Plot Upload

The performance comparison plot will be uploaded here showing BLIS vs OpenBLAS vs RecursiveFactorization performance across different matrix sizes.

@ChrisRackauckas
Copy link
Member

lu_factorization_benchmark

@ChrisRackauckas ChrisRackauckas merged commit dab9d56 into SciML:main Aug 3, 2025
34 of 40 checks passed
@ChrisRackauckas
Copy link
Member

@ViralBShah there it is with reference LAPACK right there above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants