[Issue]: CK attention still doesn't work on 7900 XTX? #254

Open
FeepingCreature opened this issue Mar 30, 2025 · 4 comments

@FeepingCreature
FeepingCreature commented Mar 30, 2025

Problem Description

As of 1df9f51, the CK version used as a submodule in aiter doesn't seem to be able to compile FP16 attention:

In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/srcs/fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_nbias_nmask_nlse_ndropout_nsquant.hip:6:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/include/fmha_fwd_hip.hpp:7:
In file included from /home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core_hip.hpp:12:
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
  234 |             asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
      |                          ^
<inline asm>:1:16: note: instantiated into assembly here
    1 |         v_cmpx_le_u32 exec, 1, v2
      |                       ^
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/srcs/fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_nbias_nmask_nlse_ndropout_nsquant.hip:6:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/include/fmha_fwd_hip.hpp:7:
In file included from /home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core_hip.hpp:12:
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
  234 |             asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
      |                          ^
<inline asm>:1:16: note: instantiated into assembly here
    1 |         v_cmpx_le_u32 exec, 1, v2
      |                       ^
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/srcs/fmha_fwd_d64_fp16_batch_b128x64x32x64x32x64_r4x1x1_r4x1x1_w32x32x16_w32x32x16_qr_async_vr_psddv_nbias_nmask_nlse_ndropout_nsquant.hip:6:
In file included from /home/feep/aiter/aiter/jit/build/mha_fwd_fp16_nbias_nmask_nlse_ndropout/build/include/fmha_fwd_hip.hpp:7:
In file included from /home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core_hip.hpp:12:
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
  234 |             asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
      |                          ^
<inline asm>:1:16: note: instantiated into assembly here
    1 |         v_cmpx_le_u32 exec, 1, v7
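
The three diagnostics above all point at the same inline-assembly statement. A small sketch, assuming clang's usual `path:line:col: error: message` layout, that collapses a log like this to its distinct error sites. `summarize_errors` and `sample_log` are illustrative names, and the sample is abridged from the output above:

```python
import re
from collections import Counter

# Matches clang diagnostics of the form "<path>:<line>:<col>: error: <message>".
DIAG_RE = re.compile(r"^(?P<path>\S+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.+)$")


def summarize_errors(log: str) -> Counter:
    """Count how often each (file name, line, message) error site appears."""
    sites = Counter()
    for raw in log.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match:
            filename = match.group("path").rsplit("/", 1)[-1]
            sites[(filename, int(match.group("line")), match.group("msg"))] += 1
    return sites


# Abridged copy of the compiler output; notes and include chains are ignored.
sample_log = r"""
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
  234 |             asm volatile("v_cmpx_le_u32 exec, 1, %4\n"
<inline asm>:1:16: note: instantiated into assembly here
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
<inline asm>:1:16: note: instantiated into assembly here
/home/feep/aiter/aiter/jit/build/ck/include/ck_tile/core/arch/amd_buffer_addressing_hip.hpp:234:26: error: invalid operand for instruction
"""

for (filename, line, message), count in summarize_errors(sample_log).items():
    print(f"{filename}:{line}: {message} (x{count})")
```

On this log the summary is a single site, line 234 of `amd_buffer_addressing_hip.hpp`, reported three times.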

Operating System

Ubuntu 24.10

CPU

AMD Ryzen 9 7950X3D

GPU

AMD Radeon RX 7900 XTX

ROCm Version

ROCm 6.3.4

ROCm Component

composable_kernel

Steps to Reproduce

python3 setup.py develop --user

Then try to do anything involving aiter.flash_attn_func.
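
Until this builds, calling code can avoid triggering the failing JIT compile by checking the GPU architecture first. A minimal sketch, assuming the 7900 XTX reports as `gfx1100`. The supported set below is an illustrative assumption, not taken from the aiter or CK sources, and `CK_ATTENTION_ARCHS`, `base_arch`, `ck_attention_supported`, and `pick_attention_backend` are hypothetical names:

```python
# Assumption: architectures on which the CK attention kernels are expected to build.
# This set is illustrative only; check the aiter/CK sources for the real list.
CK_ATTENTION_ARCHS = frozenset({"gfx90a", "gfx942"})


def base_arch(arch_name: str) -> str:
    """Strip feature flags, e.g. 'gfx942:sramecc+:xnack-' -> 'gfx942'."""
    return arch_name.split(":", 1)[0].strip().lower()


def ck_attention_supported(arch_name: str, supported=CK_ATTENTION_ARCHS) -> bool:
    """Return True if the (assumed) supported set contains this architecture."""
    return base_arch(arch_name) in supported


def pick_attention_backend(arch_name: str) -> str:
    """Choose 'ck' where it is assumed to build, otherwise a generic fallback."""
    return "ck" if ck_attention_supported(arch_name) else "fallback"


print(pick_attention_backend("gfx1100"))                 # fallback
print(pick_attention_backend("gfx942:sramecc+:xnack-"))  # ck
```

The architecture string would come from whatever the runtime exposes; how to obtain it is left out here on purpose.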

AMD, you recently asked "which GPU should we support in ROCm?" Have you considered "any consumer GPUs whatsoever"?

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@FeepingCreature
Author

See also ROCm/composable_kernel#1958

@carlushuang
Collaborator

@FeepingCreature thanks for reaching out. Navi3 is not the top priority of the project, but we can keep this issue open and pass it to our internal team as feedback for the feature request.

@FeepingCreature
Author

Fwiw, I know it's not the top priority (don't I know that...) but a whole bunch of libraries are starting to block on CK's lack of support.

@leovanalphen

> @FeepingCreature thanks for reaching out. Navi3 is not the top priority of the project, but we can keep this issue open and pass it to our internal team as feedback for the feature request.

Feels to me like supporting any desktop AMD GPU is just not a priority, let alone a top priority. The amount of work, trial and error, and fixes from third-party devs needed to get a relatively working system has been extremely aggravating, especially when, after all that, an NV GPU that's two generations older still runs circles around it.
