
Multiple issues installing for AMD GPU (Radeon RX7600XT) #1519

@mcondarelli


System Info

I am running Linux Mint Xia (based on Ubuntu 24.04).
CPU: AMD Ryzen 9 5950X 16-Core Processor with 64GiB RAM.
GPU: Advanced Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600] (rev c0) (specifically an RX 7600 XT, if it matters)
Python: Python 3.10.16
Application: I am testing with InvokeAI

Reproduction

I tried following the recipe but ran into several problems:

  • The project has been converted to .toml, so pip install -r requirements-dev.txt no longer works.
  • cmake -DCOMPUTE_BACKEND=hip -S . && make completes with no errors (just a few "kernels.hip:2857:17: warning: loop not unrolled:..." warnings).
  • Installing into the venv raised no complaints, but using the library resulted in a hard error:
    Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
    Traceback (most recent call last):
    File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
      lib = get_native_library()
    File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
      dll = ct.cdll.LoadLibrary(str(binary_path))
    File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
      return self._dlltype(name)
    File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
      self._handle = _dlopen(self._name, mode)
    OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
    
    ROCm Setup failed despite ROCm being available. Please run the following command to get more information:
    
    python -m bitsandbytes
    
    Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues
    
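To separate the load failure from InvokeAI, the step that fails in the traceback (a ctypes load of the .so) can be repeated in isolation. The following is a minimal sketch, not part of bitsandbytes: the helper names `try_load`, `missing_symbol` and `parse_stub_symbol` are hypothetical, and the parser only handles the simple `_Z<length><name>I<length><type>...` shape seen in this particular symbol rather than full Itanium demangling.

```python
import ctypes
import re
from typing import Optional, Tuple


def try_load(path: str) -> Optional[str]:
    """Return None if the shared library loads, else the loader's error text."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None


def missing_symbol(error_text: str) -> Optional[str]:
    """Extract the symbol name from an 'undefined symbol: ...' loader message."""
    match = re.search(r"undefined symbol: (\S+)", error_text)
    return match.group(1) if match else None


def parse_stub_symbol(mangled: str) -> Tuple[str, Optional[str]]:
    """Read the function name and first template type from a simple mangled name.

    Hypothetical helper: it only understands a length-prefixed name followed by
    an optional 'I<length><type>' template argument.
    """
    head = re.match(r"_Z(\d+)", mangled)
    if head is None:
        raise ValueError("not a simple _Z<length><name> symbol")
    length = int(head.group(1))
    name = mangled[head.end():head.end() + length]
    rest = mangled[head.end() + length:]
    template_type = None
    targ = re.match(r"I(\d+)", rest)
    if targ is not None:
        size = int(targ.group(1))
        template_type = rest[targ.end():targ.end() + size]
    return name, template_type


# Error text copied from the report above; no file access is needed for parsing.
ERROR_TEXT = (
    "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/"
    "libbitsandbytes_rocm63.so: undefined symbol: "
    "_Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi"
)

symbol = missing_symbol(ERROR_TEXT)
print(parse_stub_symbol(symbol))
```

For the symbol in this report the parser yields the name `__device_stub__kOptimizer32bit1State` and the template type `hip_bfloat16`. That suggests the unresolved reference is the host-side launch stub for the bf16 instantiation of that optimizer kernel, though this reading is an inference from the name rather than something confirmed by the build. Calling `try_load` on the path from the traceback should reproduce the same `OSError` text without importing bitsandbytes at all.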

I also tried:

(invoke) mcon@ikea:~/tmp/t$ ROCM_PATH=/opt/rocm LD_LIBRARY_PATH=/opt/rocm/lib python -m bitsandbytes
Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
    lib = get_native_library()
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

ROCm Setup failed despite ROCm being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
ROCm specs: rocm_version_string='63', rocm_version_tuple=(6, 3)
PyTorch settings found: ROCM_VERSION=63
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and ROCm is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.

For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=hip -S .`.
See the documentation for more details if needed.

Trying a simple check anyway, but this will likely fail...
Traceback (most recent call last):
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 73, in main
    sanity_check()
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 37, in sanity_check
    p1 = p.data.sum().item()
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.
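The HIP error message above recommends AMD_SERIALIZE_KERNEL=3 so that kernel errors are reported at the call that caused them. A minimal sketch of re-running the same diagnostic with that variable added to the environment from the command above follows. The helper names `diagnostic_env` and `run_diagnostics` are hypothetical, `/opt/rocm` is taken from the command line in this report, and the sketch only changes how the failure is reported; it is not expected to fix it.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping, Optional


def diagnostic_env(
    base: Optional[Mapping[str, str]] = None,
    rocm_path: str = "/opt/rocm",
) -> Dict[str, str]:
    """Build the environment for a diagnostic run (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    env["ROCM_PATH"] = rocm_path
    lib_dir = f"{rocm_path}/lib"
    existing = env.get("LD_LIBRARY_PATH", "")
    # Put the ROCm library directory first, keeping any existing entries.
    env["LD_LIBRARY_PATH"] = f"{lib_dir}:{existing}" if existing else lib_dir
    # Suggested by the HIP error message itself.
    env["AMD_SERIALIZE_KERNEL"] = "3"
    return env


def run_diagnostics() -> int:
    """Run `python -m bitsandbytes` in a child process with the environment above."""
    result = subprocess.run(
        [sys.executable, "-m", "bitsandbytes"],
        env=diagnostic_env(),
    )
    return result.returncode
```

Calling `run_diagnostics()` from inside the same venv launches the diagnostic with the interpreter that is currently running, so the output should correspond to the installation shown in the tracebacks.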

Expected behavior

I expected to be able to use bitsandbytes.

Labels: Documentation, High Priority, Low Risk, ROCm