Open
Labels
Documentation, High Priority, Low Risk, ROCm
Description
System Info
OS: Linux Mint Xia (based on Ubuntu 24.04).
CPU: AMD Ryzen 9 5950X 16-Core Processor with 64GiB RAM.
GPU: Advanced Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600] (rev c0) (Actually: RX 7600 XT, if it matters)
Python: Python 3.10.16
Application: I am testing with InvokeAI
Reproduction
I tried following the recipe but ran into several problems:
- The project has been converted to `.toml`, so `pip install -r requirements-dev.txt` won't work.
- `cmake -DCOMPUTE_BACKEND=hip -S . && make` completes with no errors (just a few "kernels.hip:2857:17: warning: loop not unrolled:..." warnings).
- Installing in a venv did not complain, but usage resulted in a hard error:
Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
lib = get_native_library()
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
dll = ct.cdll.LoadLibrary(str(binary_path))
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
return self._dlltype(name)
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
ROCm Setup failed despite ROCm being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues
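For anyone trying to reproduce this outside InvokeAI, the load failure can probably be isolated with the standard library alone. The following is a minimal sketch, not bitsandbytes code: `try_load` and `undefined_symbol` are hypothetical helper names, and the regular expression only assumes the `undefined symbol: <name>` wording seen in the error above. Judging by the mangled name, the missing symbol looks like a `kOptimizer32bit1State` kernel stub instantiated for `hip_bfloat16`.

```python
import ctypes
import re
from typing import Optional

# Assumption: the dynamic loader reports failures as "undefined symbol: <name>",
# which is the wording that appears in the error message of this report.
_UNDEFINED = re.compile(r"undefined symbol: (\S+)")


def undefined_symbol(message: str) -> Optional[str]:
    """Return the missing symbol named in a loader error message, if any."""
    match = _UNDEFINED.search(message)
    return match.group(1) if match else None


def try_load(path: str) -> Optional[str]:
    """Try to load the shared library at `path`.

    Returns the loader's error text on failure and None on success.
    """
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None
```

Calling `try_load` with the path of the built `.so` and passing the result to `undefined_symbol` should give the same symbol name as in the traceback, without importing bitsandbytes at all.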
I also tried:
(invoke) mcon@ikea:~/tmp/t$ ROCM_PATH=/opt/rocm LD_LIBRARY_PATH=/opt/rocm/lib python -m bitsandbytes
Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
lib = get_native_library()
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
dll = ct.cdll.LoadLibrary(str(binary_path))
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
return self._dlltype(name)
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
ROCm Setup failed despite ROCm being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
ROCm specs: rocm_version_string='63', rocm_version_tuple=(6, 3)
PyTorch settings found: ROCM_VERSION=63
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and ROCm is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.
For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=hip -S .`.
See the documentation for more details if needed.
Trying a simple check anyway, but this will likely fail...
Traceback (most recent call last):
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 73, in main
sanity_check()
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 37, in sanity_check
p1 = p.data.sum().item()
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.
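To check which native binaries actually ended up in the installed package, something like the sketch below might help. It is only an illustration under stated assumptions: `rocm_library_name` mirrors the naming pattern visible in the output above (`rocm_version_tuple=(6, 3)` alongside `libbitsandbytes_rocm63.so`) and is not taken from the bitsandbytes source, and both function names are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def rocm_library_name(version: Tuple[int, int]) -> str:
    """Build the library file name following the pattern seen in the log.

    Assumption: major and minor version are simply concatenated, as in
    (6, 3) -> "libbitsandbytes_rocm63.so".
    """
    major, minor = version
    return f"libbitsandbytes_rocm{major}{minor}.so"


def built_libraries(package_dir: Path) -> List[str]:
    """List the bitsandbytes shared libraries present in `package_dir`."""
    return sorted(p.name for p in package_dir.glob("libbitsandbytes*.so"))
```

Pointing `built_libraries` at the `site-packages/bitsandbytes` directory from the traceback and comparing the result with `rocm_library_name((6, 3))` shows whether the expected file exists. In this report it does exist, so the problem is with its contents rather than its location.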
Expected behavior
I expected to be able to use bitsandbytes.