[bug]: Invoke refuses to use my RX 7600 XT GPU #7574

mcondarelli · 2025-01-20T11:24:44Z

Is there an existing issue for this problem?

I have searched the existing issues

Operating system

Linux

GPU vendor

AMD (ROCm)

GPU model

RX 7600 XT

GPU VRAM

16GB

Version number

5.5.0

Browser

Firefox 134.0

Python dependencies

{
  "accelerate": "1.0.1",
  "compel": "2.0.2",
  "cuda": null,
  "diffusers": "0.31.0",
  "numpy": "1.26.3",
  "opencv": "4.9.0.80",
  "onnx": "1.16.1",
  "pillow": "10.2.0",
  "python": "3.11.11",
  "torch": "2.4.1+rocm6.1",
  "torchvision": "0.19.1+rocm6.1",
  "transformers": "4.46.3",
  "xformers": null
}

What happened

Every time I try to generate an image, I get this error:

Server Error
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to...

What you expected to happen

I expected image generation to start.

How to reproduce the problem

In my setup, every image generation attempt produces this error.
A CPU-only (no GPU) configuration works as expected and, also as expected, is very slow.
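
One way to narrow this down outside Invoke might be to check whether the installed PyTorch wheel was built with kernels for this GPU's architecture. A missing target is a common cause of `invalid device function`, but that is an assumption here and not something confirmed for this setup. The sketch below uses hypothetical helper names (`base_arch`, `arch_is_supported`, `report`) made up for illustration; it would need to be run inside Invoke's virtual environment so that it sees the same `torch` build.

```python
from typing import Iterable, Optional


def base_arch(arch_name: str) -> str:
    """Strip feature suffixes, e.g. 'gfx90a:sramecc+:xnack-' -> 'gfx90a'."""
    return arch_name.split(":", 1)[0].strip()


def arch_is_supported(device_arch: str, built_archs: Iterable[str]) -> bool:
    """True if the device's gfx target is in the list the wheel was built for."""
    return base_arch(device_arch) in {base_arch(a) for a in built_archs}


def report() -> Optional[bool]:
    """Print what the installed PyTorch build sees; None means 'cannot tell'."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return None
    print("torch:", torch.__version__, "| HIP:", getattr(torch.version, "hip", None))
    if not torch.cuda.is_available():
        print("no GPU visible to torch")
        return None
    props = torch.cuda.get_device_properties(0)
    # gcnArchName is only expected on ROCm builds, so fall back to None elsewhere.
    device_arch = getattr(props, "gcnArchName", None)
    built = torch.cuda.get_arch_list()
    print("device:", props.name, "| arch:", device_arch)
    print("archs in this build:", built)
    if not device_arch:
        return None
    return arch_is_supported(device_arch, built)


if __name__ == "__main__":
    print("device arch covered by this build:", report())
```

If the last line prints `False`, the wheel likely ships no kernels for the card's gfx target, which would be consistent with the error above; `True` would point the investigation elsewhere.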

Additional context

I have seen several bug reports mentioning ROCm, but I didn't find anything really comparable.
Note that I'm a complete newbie at AI hosting, so I might be missing something pretty basic.

The full specs of my server are:

root@ikea:~# lshw -short
H/W path              Device          Class          Description
================================================================
                                      system         MS-7C91 (To be filled by O.E.M.)
/0                                    bus            MPG B550 GAMING EDGE WIFI (MS-7C91)
/0/0                                  memory         64KiB BIOS
/0/10                                 memory         32GiB System Memory
/0/10/0                               memory         2667 MHz (0.4 ns) [empty]
/0/10/1                               memory         16GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2667 MHz (0.4 ns)
/0/10/2                               memory         2667 MHz (0.4 ns) [empty]
/0/10/3                               memory         16GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2667 MHz (0.4 ns)
/0/13                                 memory         1MiB L1 cache
/0/14                                 memory         8MiB L2 cache
/0/15                                 memory         64MiB L3 cache
/0/16                                 processor      AMD Ryzen 9 5950X 16-Core Processor
/0/100                                bridge         Starship/Matisse Root Complex
/0/100/0.2                            generic        Starship/Matisse IOMMU
/0/100/1.1                            bridge         Starship/Matisse GPP Bridge
/0/100/1.1/0          /dev/nvme0      storage        CT2000P2SSD8
/0/100/1.1/0/0        hwmon0          disk           NVMe disk
/0/100/1.1/0/2        /dev/ng0n1      disk           NVMe disk
/0/100/1.1/0/1        /dev/nvme0n1    disk           2TB NVMe disk
/0/100/1.1/0/1/1      /dev/nvme0n1p1  volume         511MiB Windows FAT volume
/0/100/1.1/0/1/2      /dev/nvme0n1p2  volume         201GiB EXT4 volume
/0/100/1.1/0/1/3      /dev/nvme0n1p3  volume         1023MiB Linux swap volume
/0/100/1.1/0/1/4      /dev/nvme0n1p4  volume         1660GiB EXT4 volume
/0/100/1.2                            bridge         Starship/Matisse GPP Bridge
/0/100/1.2/0                          bus            500 Series Chipset USB 3.1 XHCI Controller
/0/100/1.2/0/0        usb1            bus            xHCI Host Controller
/0/100/1.2/0/0/2                      bus            USB2.0 Hub
/0/100/1.2/0/0/8      input6          input          MSI MYSTIC LIGHT
/0/100/1.2/0/0/9                      communication  AX200 Bluetooth
/0/100/1.2/0/1        usb2            bus            xHCI Host Controller
/0/100/1.2/0.1                        storage        500 Series Chipset SATA Controller
/0/100/1.2/0.2                        bridge         500 Series Chipset Switch Upstream Port
/0/100/1.2/0.2/8                      bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/1.2/0.2/8/0    wlo1            network        Wi-Fi 6 AX200
/0/100/1.2/0.2/9                      bridge         Advanced Micro Devices, Inc. [AMD]
/0/100/1.2/0.2/9/0    enp42s0         network        RTL8125 2.5GbE Controller
/0/100/3.1                            bridge         Starship/Matisse GPP Bridge
/0/100/3.1/0                          bridge         Navi 10 XL Upstream Port of PCI Express Switch
/0/100/3.1/0/0        /dev/fb0        bridge         Navi 10 XL Downstream Port of PCI Express Switch
/0/100/3.1/0/0/0      /dev/fb0        display        Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600]
/0/100/3.1/0/0/0.1    card0           multimedia     Navi 31 HDMI/DP Audio
/0/100/3.1/0/0/0.1/0  input10         input          HDA ATI HDMI HDMI/DP,pcm=3
/0/100/3.1/0/0/0.1/1  input11         input          HDA ATI HDMI HDMI/DP,pcm=7
/0/100/3.1/0/0/0.1/2  input12         input          HDA ATI HDMI HDMI/DP,pcm=8
/0/100/3.1/0/0/0.1/3  input13         input          HDA ATI HDMI HDMI/DP,pcm=9
/0/100/7.1                            bridge         Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
/0/100/7.1/0                          generic        Starship/Matisse PCIe Dummy Function
/0/100/8.1                            bridge         Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
/0/100/8.1/0                          generic        Starship/Matisse Reserved SPP
/0/100/8.1/0.1                        generic        Starship/Matisse Cryptographic Coprocessor PSPCPP
/0/100/8.1/0.3                        bus            Matisse USB 3.0 Host Controller
/0/100/8.1/0.3/0      usb3            bus            xHCI Host Controller
/0/100/8.1/0.3/0/1    input0          input          CX 2.4G Receiver System Control
/0/100/8.1/0.3/1      usb4            bus            xHCI Host Controller
/0/100/8.1/0.4        card1           multimedia     Starship/Matisse HD Audio Controller
/0/100/8.1/0.4/0      input14         input          HDA Digital PCBeep
/0/100/8.1/0.4/1      input15         input          HD-Audio Generic Rear Mic
/0/100/8.1/0.4/2      input16         input          HD-Audio Generic Front Mic
/0/100/8.1/0.4/3      input17         input          HD-Audio Generic Line
/0/100/8.1/0.4/4      input18         input          HD-Audio Generic Line Out Front
/0/100/8.1/0.4/5      input19         input          HD-Audio Generic Line Out Surround
/0/100/8.1/0.4/6      input20         input          HD-Audio Generic Line Out CLFE
/0/100/8.1/0.4/7      input21         input          HD-Audio Generic Front Headphone
/0/100/14                             bus            FCH SMBus Controller
/0/100/14.3                           bridge         FCH LPC Bridge
/0/100/14.3/0                         system         PnP device PNP0c01
/0/100/14.3/1                         system         PnP device PNP0c02
/0/100/14.3/2                         system         PnP device PNP0b00
/0/100/14.3/3                         system         PnP device PNP0c02
/0/100/14.3/4                         system         PnP device PNP0c02
/0/101                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/102                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/103                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/104                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/105                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/106                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/107                                bridge         Starship/Matisse PCIe Dummy Host Bridge
/0/108                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 0
/0/109                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 1
/0/10a                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 2
/0/10b                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 3
/0/10c                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 4
/0/10d                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 5
/0/10e                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 6
/0/10f                                bridge         Matisse/Vermeer Data Fabric: Device 18h; Function 7
/1                    input7          input          Power Button
/2                    input8          input          Power Button
/3                    input9          input          PC Speaker
root@ikea:~#

Discord username

mcon


SherLock707 · 2025-02-02T18:09:17Z

I have the same issue with my RX 6700 XT on Arch.

[180412:0202/233234.609809:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 1 times!
[180412:0202/233242.280897:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 2 times!
[180412:0202/233242.281545:ERROR:gl_surface_presentation_helper.cc(260)] GetVSyncParametersIfAvailable() failed for 3 times!
Starting up...

Started Invoke process with PID: 180577

amdgpu.ids: No such file or directory

Could not load bitsandbytes native library: 'NoneType' object has no attribute 'split'
Traceback (most recent call last):
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 85, in <module>
    lib = get_native_library()
          ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cextension.py", line 64, in get_native_library
    cuda_specs = get_cuda_specs()
                 ^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cuda_specs.py", line 39, in get_cuda_specs
    cuda_version_string=(get_cuda_version_string()),
                         ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cuda_specs.py", line 29, in get_cuda_version_string
    major, minor = get_cuda_version_tuple()
                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/bitsandbytes/cuda_specs.py", line 24, in get_cuda_version_tuple
    major, minor = map(int, torch.version.cuda.split("."))
                            ^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'

CUDA Setup failed despite CUDA being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues
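
The bitsandbytes traceback above appears to be a separate problem from the HIP error: on a ROCm build of PyTorch, `torch.version.cuda` is `None`, so calling `.split(".")` on it raises the `AttributeError` shown. A minimal sketch of a guarded parse follows, using a hypothetical helper `parse_major_minor` that is not part of bitsandbytes or PyTorch:

```python
from typing import Optional, Tuple


def parse_major_minor(version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Parse 'major.minor[.rest]' into (major, minor); None if unavailable."""
    if not version:
        # Covers the ROCm case, where the CUDA version string is None.
        return None
    parts = version.split(".")
    if len(parts) < 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


print(parse_major_minor(None))    # None
print(parse_major_minor("12.1"))  # (12, 1)
```

This only illustrates why the call fails on a ROCm wheel; it does not address the `invalid device function` error further down the log.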


>> patchmatch.patch_match: ERROR - patchmatch failed to load or compile (libvtkFiltersTexture.so.1: cannot open shared object file: No such file or directory).
>> patchmatch.patch_match: INFO - Refer to https://invoke-ai.github.io/InvokeAI/installation/060_INSTALL_PATCHMATCH/ for installation instructions.

[2025-02-02 23:33:15,760]::[InvokeAI]::INFO --> Patchmatch not loaded (nonfatal)

[2025-02-02 23:33:16,528]::[InvokeAI]::INFO --> Using torch device: AMD Radeon Graphics

[2025-02-02 23:33:16,665]::[InvokeAI]::INFO --> cuDNN version: 3001000

[2025-02-02 23:33:16,784]::[InvokeAI]::INFO --> InvokeAI version 5.6.0
[2025-02-02 23:33:16,784]::[InvokeAI]::INFO --> Root directory = /run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI

[2025-02-02 23:33:16,785]::[InvokeAI]::INFO --> Initializing database at /run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/databases/invokeai.db

[2025-02-02 23:33:16,818]::[ModelManagerService]::INFO --> [MODEL CACHE] Calculated model RAM cache size: 9200.00 MB. Heuristics applied: [1, 3].

[2025-02-02 23:33:16,905]::[InvokeAI]::INFO --> Pruned 1 finished queue items

[2025-02-02 23:33:19,957]::[InvokeAI]::INFO --> Cleaned database (freed 0.04MB)
[2025-02-02 23:33:19,957]::[InvokeAI]::INFO --> Invoke running on http://127.0.0.1:9090 (Press CTRL+C to quit)

[2025-02-02 23:33:19,961]::[InvokeAI]::INFO --> Executing queue item 2, session 57837bd5-451a-4b7d-98cf-77af221ee952

[2025-02-02 23:33:57,539]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:unet' (UNet2DConditionModel) onto cuda device in 32.53s. Total model size: 4897.05MB, VRAM: 4897.05MB (100.0%)

[2025-02-02 23:33:57,924]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:scheduler' (DDPMScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)

[2025-02-02 23:33:58,448]::[InvokeAI]::ERROR --> Error while invoking session 57837bd5-451a-4b7d-98cf-77af221ee952, invocation d372c6e3-d7e1-4f1f-8f27-3a277ceba8a6 (denoise_latents): HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

[2025-02-02 23:33:58,448]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
    output = self.invoke(context)
             ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 824, in invoke
    return self._old_invoke(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/itachi/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 1078, in _old_invoke
    timesteps, init_timestep, scheduler_step_kwargs = self.init_scheduler(
                                                      ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in init_scheduler
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in <lambda>
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                                             ^^^^^^^^^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

.............


[2025-02-02 23:35:12,417]::[InvokeAI]::INFO --> Executing queue item 5, session a3cea2be-230e-47a3-a75b-07fd01150a82

[2025-02-02 23:35:12,447]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:unet' (UNet2DConditionModel) onto cuda device in 0.00s. Total model size: 4897.05MB, VRAM: 4897.05MB (100.0%)

[2025-02-02 23:35:12,449]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '907a4c90-54e0-467d-9346-879f2c70d47a:scheduler' (DDPMScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)

[2025-02-02 23:35:12,459]::[InvokeAI]::ERROR --> Error while invoking session a3cea2be-230e-47a3-a75b-07fd01150a82, invocation 2dfa2473-3dca-46d9-a2be-288795f10772 (denoise_latents): HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

[2025-02-02 23:35:12,459]::[InvokeAI]::ERROR --> Traceback (most recent call last):
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/services/session_processor/session_processor_default.py", line 129, in run_node
    output = invocation.invoke_internal(context=context, services=self._services)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/baseinvocation.py", line 300, in invoke_internal
    output = self.invoke(context)
             ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 824, in invoke
    return self._old_invoke(context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/itachi/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 1078, in _old_invoke
    timesteps, init_timestep, scheduler_step_kwargs = self.init_scheduler(
                                                      ^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in init_scheduler
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/media/itachi/DATA_SATA_4TB/SD2/InvokeAI/.venv/lib/python3.11/site-packages/invokeai/app/invocations/denoise_latents.py", line 729, in <lambda>
    t_start_idx = len(list(filter(lambda ts: ts >= t_start_val, _timesteps)))
                                             ^^^^^^^^^^^^^^^^^
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.



[2025-02-02 23:35:12,818]::[InvokeAI]::INFO --> Graph stats: a3cea2be-230e-47a3-a75b-07fd01150a82
                          Node   Calls   Seconds  VRAM Used
             sdxl_model_loader       1    0.000s     4.881G
            sdxl_compel_prompt       2    0.001s     4.881G
                       collect       2    0.001s     4.881G
                         noise       1    0.016s     4.881G
               denoise_latents       1    0.015s     4.882G
TOTAL GRAPH EXECUTION TIME:   0.032s
TOTAL GRAPH WALL TIME:   0.035s
RAM used by InvokeAI process: 5.91G (+0.000G)
RAM used to load models: 4.78G
VRAM in use: 4.881G
RAM cache statistics:
   Model cache hits: 2
   Model cache misses: 0
   Models cached: 4
   Models cleared from cache: 0
   Cache high water mark: 6.31/0.00G

mcondarelli added the bug Something isn't working label Jan 20, 2025
