Add support for sage attention 3 in comfyui, enable via new cli arg #11026
base: master
Conversation
--use-sage-attiention3
    )
    B, H, L, D = q_s.shape

    if dim_head >= 256 or N <= 2048:
AFAICT dim_head is undefined if skip_reshape=True
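One way to guard against this, as a rough sketch only (the skip_reshape layout convention and the in-scope names q, heads, skip_reshape are assumptions based on the diff context, not the PR's actual code):

```python
# Hypothetical sketch: make sure dim_head is defined on both paths before the
# "dim_head >= 256 or N <= 2048" check runs.
if skip_reshape:
    # Assumed convention: with skip_reshape=True, q/k/v already arrive as
    # (batch, heads, tokens, dim_head).
    B, H, N, dim_head = q.shape
else:
    B, N, C = q.shape
    dim_head = C // heads
```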
    q_s, k_s, v_s = map(
        lambda t: t.view(B, -1, heads, dim_head).permute(0, 2, 1, 3).contiguous(),
        (q, k, v),
    )
You should avoid doing this contiguous until after the possible fallback below, as attention_pytorch does not need these reshaped tensors.
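One possible ordering, as a sketch; the attention_pytorch keyword arguments and the in-scope names (q, k, v, heads, mask, N, B, dim_head) are assumptions taken from the diff context above:

```python
# Sketch only: take the cheap fallback decision first, since attention_pytorch
# consumes the original q/k/v; the view/permute/contiguous copies are wasted
# work whenever the fallback path is taken.
if dim_head >= 256 or N <= 2048:
    return attention_pytorch(q, k, v, heads, mask=mask, skip_reshape=skip_reshape)

q_s, k_s, v_s = map(
    lambda t: t.view(B, -1, heads, dim_head).permute(0, 2, 1, 3).contiguous(),
    (q, k, v),
)
```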
    try:
        out = sageattn3_blackwell(q_s, k_s, v_s, is_causal=False)
    except Exception as e:
        logging.error("Error running SageAttention3: %s, falling back to pytorch attention.", e)
Just set a flag here instead:

do_oom_fallback = True

and after the try/except:

if do_oom_fallback:
    return attention_pytorch(...

The reason is that when you catch an exception like this, the exception object keeps references to all local variables in the frame. That means the SageAttention locals, including any large tensors, stay strongly referenced by the exception for as long as you remain inside the except: block. You want to get out of the except: block before calling attention_pytorch, and del q_s, k_s, v_s before falling back.
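A minimal sketch of that pattern, assuming a module-level import logging and that q, k, v, heads and mask are still in scope; the attention_pytorch call signature here is an assumption:

```python
# Sketch of the flag-based fallback: leave the except: block before calling
# attention_pytorch, so the caught exception (which references the frame and
# therefore the large q_s/k_s/v_s tensors) no longer pins them.
do_oom_fallback = False
try:
    out = sageattn3_blackwell(q_s, k_s, v_s, is_causal=False)
except Exception as e:
    logging.error("Error running SageAttention3: %s, falling back to pytorch attention.", e)
    do_oom_fallback = True

if do_oom_fallback:
    # Drop the reshaped copies before the fallback allocates its own buffers.
    del q_s, k_s, v_s
    return attention_pytorch(q, k, v, heads, mask=mask, skip_reshape=skip_reshape)
```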
I fixed this in original sage here:
This is basic Sage Attention 3 support. Because it is still unstable and differs significantly from previous versions of Sage Attention, a separate switch --use-sage-attiention3 is provided to enable or disable it. You need to install Sage Attention 3 in your environment before enabling it.
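As a rough illustration of how such a switch is usually wired up (not necessarily the exact code in this PR), assuming ComfyUI's argparse-based comfy/cli_args.py and a hypothetical import path for the Sage Attention 3 kernel:

```python
# Hypothetical sketch of gating Sage Attention 3 behind a CLI switch.
parser.add_argument(
    "--use-sage-attiention3",
    action="store_true",
    help="Enable experimental Sage Attention 3 (must be installed separately).",
)

SAGE_ATTENTION3_IS_AVAILABLE = False
if args.use_sage_attiention3:
    try:
        # Assumed import path; adjust to however Sage Attention 3 exposes
        # its Blackwell kernel in your environment.
        from sageattention import sageattn3_blackwell
        SAGE_ATTENTION3_IS_AVAILABLE = True
    except ImportError:
        logging.error("--use-sage-attiention3 was set but Sage Attention 3 is not installed.")
```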