
Enable llama fp8 masked_flash_attention 8 #984

Draft · AmosLewis wants to merge 6 commits into main from enable_kernel_fp8_attn8
Conversation

AmosLewis

My local changes for debugging the input/numerics/ppl of #907.
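
The PR itself doesn't show how the ppl (perplexity) check is done; as a point of reference, a minimal sketch of the standard computation follows. Perplexity is the exponential of the mean next-token cross-entropy, so an fp8 masked-flash-attention run can be compared against a baseline by the ppl each produces. The function name and tensor shapes below are illustrative assumptions, not code from this PR.

```python
import torch
import torch.nn.functional as F

def perplexity(logits: torch.Tensor, targets: torch.Tensor) -> float:
    # Hypothetical helper, not from this PR.
    # logits: [batch, seq_len, vocab]; targets: [batch, seq_len] token ids.
    # Shift by one position so each logit predicts the following token.
    shifted_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
    shifted_targets = targets[:, 1:].reshape(-1)
    nll = F.cross_entropy(shifted_logits, shifted_targets)  # mean NLL in nats
    return torch.exp(nll).item()  # ppl = exp(mean cross-entropy)
```

A meaningful divergence in ppl between the fp8 and baseline runs would point to a numerics issue in the attention kernel rather than in the surrounding model code.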

AmosLewis force-pushed the enable_kernel_fp8_attn8 branch from 638d288 to 96a19f1 on February 19, 2025, 20:13
Labels: none yet
Projects: none yet

2 participants