Problem Description
As of 1df9f51, the CK version used as a submodule in aiter doesn't seem to be able to compile FP16 attention:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/srcs/fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_nbias_nmask_nlse_ndropout_nsquant.hip:6:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/include/fmha_fwd_hip.hpp:7:
In file included from /home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core_hip.hpp:12:
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
234 | asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
| ^
<inline asm>:1:16: note: instantiated into assembly here
1 | v_cmpx_le_u32 exec, 1, v2
| ^
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/srcs/fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_nbias_nmask_nlse_ndropout_nsquant.hip:6:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/include/fmha_fwd_hip.hpp:7:
In file included from /home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core_hip.hpp:12:
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
234 | asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
| ^
<inline asm>:1:16: note: instantiated into assembly here
1 | v_cmpx_le_u32 exec, 1, v2
| ^
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/srcs/fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_nbias_nmask_nlse_ndropout_nsquant.hip:6:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/include/fmha_fwd_hip.hpp:7:
In file included from /home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core_hip.hpp:12:
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
234 | asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
| ^
<inline asm>:1:16: note: instantiated into assembly here
1 | v_cmpx_le_u32 exec, 1, v7
Operating System
Ubuntu 24.10
CPU
AMD Ryzen 9 7950X3D
GPU
AMD Radeon RX 7900 XTX
ROCm Version
ROCm 6.3.4
ROCm Component
composable_kernel
Steps to Reproduce
python3 setup.py develop --user
Then try to do anything involving aiter.flash_attn_func.
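The reproduction step above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual script: the source only names `aiter.flash_attn_func`, so the positional `(q, k, v)` call, the `(batch, seqlen, heads, head_dim)` tensor layout, and the helper name `reproduce_fp16_attention` are all assumptions. The head dimension of 64 and the FP16 dtype are chosen to match the `d64_fp16` kernel named in the log.

```python
def reproduce_fp16_attention():
    """Try one FP16 attention call and report the outcome as a string."""
    try:
        import torch
        import aiter
    except Exception as exc:  # missing package, or a failure while importing it
        return f"skipped: could not import dependencies ({exc})"

    # ROCm builds of PyTorch expose the GPU under the "cuda" device name.
    if not torch.cuda.is_available():
        return "skipped: no GPU visible to PyTorch"

    # ASSUMPTION: (batch, seqlen, heads, head_dim) layout, as in flash-attn.
    shape = (1, 128, 8, 64)
    q, k, v = (
        torch.randn(shape, dtype=torch.float16, device="cuda") for _ in range(3)
    )

    try:
        # ASSUMPTION: positional (q, k, v) call with default options, i.e.
        # no bias, no mask, no dropout, matching the kernel name in the log.
        aiter.flash_attn_func(q, k, v)
    except Exception as exc:
        return f"failed: {type(exc).__name__}: {exc}"
    return "ok"


if __name__ == "__main__":
    print(reproduce_fp16_attention())
```

On the setup described in this report, the call would be expected to trigger the JIT build and end in the `failed` branch, though the exact exception type is not shown in the log.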
AMD, you recently asked "which GPU should we support in ROCm?" Have you considered "any consumer GPUs whatsoever"?
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response
@FeepingCreature thanks for reaching out. Navi3 is not the top priority of the project, but we can keep this issue open and pass it to our internal team as feedback for the feature request.
Feels to me like supporting any desktop AMD GPU is just not a priority, let alone a top priority. The amount of work, trial and error, and fixes from third-party devs needed to get a relatively working system has been extremely aggravating, especially when, after all that, an NVIDIA GPU that's two generations older still runs circles around it.