
[Attention Performance] FlashAttention2 backward get to 80%~90% of XeTLA #2159

Open
Dewei-Wang-sh opened this issue Sep 9, 2024 · 2 comments

Comments

@Dewei-Wang-sh (Contributor) commented Sep 9, 2024

This serves as an umbrella issue.
Things to start with; more will be added as we dive into the backward code.

  1. Refactor the tt-to-ttgpu-warp pass.
  2. Investigate the XeTLA backward implementation.
@Dewei-Wang-sh (Contributor, Author) commented:

Diving into the backward algorithm and trying to split the task into separate issues.
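
For reference, the FlashAttention-2 backward pass recomputes the attention probabilities from the logsumexp values saved by the forward pass instead of materializing them, then forms dQ, dK, and dV. Below is a minimal, untiled NumPy sketch of that math only; the function name and variables are illustrative and are not taken from the Triton or XeTLA kernels discussed in this issue.

```python
import numpy as np

def flash_attn_bwd_reference(Q, K, V, O, dO, L, scale):
    """Untiled reference of the FlashAttention-2 backward math.

    Q, K, V, O, dO: (seq_len, head_dim) arrays for one head.
    L: (seq_len,) per-row logsumexp saved by the forward pass.
    """
    S = (Q @ K.T) * scale
    P = np.exp(S - L[:, None])        # recomputed softmax probabilities
    dV = P.T @ dO
    dP = dO @ V.T
    D = np.sum(dO * O, axis=1)        # row-wise dot(dO_i, O_i)
    dS = P * (dP - D[:, None])
    dQ = (dS @ K) * scale
    dK = (dS.T @ Q) * scale
    return dQ, dK, dV
```

The actual kernels tile this over BLOCK_M x BLOCK_N chunks and accumulate dK/dV across the query loop and dQ across the key loop, which is where the addressing question in the next comment comes in.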

@Dewei-Wang-sh (Contributor, Author) commented:

There are two ways to make it work; this needs more discussion.

  1. Rewrite the code to use block pointers, then add the backward-related feature support on top (a sketch of the block-pointer style follows this list).
  2. Keep the non-block-pointer approach and follow along with what NVIDIA does.
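
To make the difference between the two options concrete, here is a minimal Triton sketch of loading one K tile either with raw pointer arithmetic (the non-block-pointer style of option 2) or through `tl.make_block_ptr` (option 1). The helper name, parameter names, and tile sizes are hypothetical and do not come from the actual FlashAttention kernel.

```python
import triton
import triton.language as tl

@triton.jit
def load_k_tile(K, stride_kn, stride_kd, N_CTX, start_n,
                BLOCK_N: tl.constexpr, HEAD_DIM: tl.constexpr):
    # Non-block-pointer style (option 2): build offsets and masks by hand.
    offs_n = start_n + tl.arange(0, BLOCK_N)
    offs_d = tl.arange(0, HEAD_DIM)
    k_ptrs = K + offs_n[:, None] * stride_kn + offs_d[None, :] * stride_kd
    k_manual = tl.load(k_ptrs, mask=offs_n[:, None] < N_CTX, other=0.0)

    # Block-pointer style (option 1): shape, strides, and bounds travel
    # with the pointer, so boundary handling moves out of the kernel body.
    k_block_ptr = tl.make_block_ptr(
        base=K,
        shape=(N_CTX, HEAD_DIM),
        strides=(stride_kn, stride_kd),
        offsets=(start_n, 0),
        block_shape=(BLOCK_N, HEAD_DIM),
        order=(1, 0),
    )
    k_block = tl.load(k_block_ptr, boundary_check=(0,))
    return k_manual, k_block
```

Roughly, the block-pointer form gives the backend structured shape/stride information it can lower to block loads, while the raw-pointer form mirrors the NVIDIA-style kernels mentioned in option 2.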
