@cfgfung cfgfung commented Oct 6, 2025

A repo for PyTorch SDPA Forward Upstreaming

Functionalities:

  • BF16/FP16
  • Now supports BSHD layout
  • Output LSE for training
  • Custom softmax scale
  • Causal masking (is_causal)
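
For readers checking the kernel's output, the features above can be compared against a small reference. The following is a minimal pure-Python sketch, not the code in this PR: the function name `sdpa_reference` is hypothetical, it handles a single head on plain lists rather than a BSHD tensor, and it assumes the causal convention where query i attends to keys j <= i.

```python
import math

def sdpa_reference(q, k, v, scale=None, is_causal=False):
    """Single-head attention on plain lists.

    q is [seq_q][d]; k and v are [seq_kv][d].
    Returns (out, lse) with lse[i] = log(sum_j exp(score_ij)) over unmasked keys.
    """
    d = len(q[0])
    dv = len(v[0])
    if scale is None:
        scale = 1.0 / math.sqrt(d)  # conventional default softmax scale
    out, lse = [], []
    for i, qi in enumerate(q):
        # Assumed causal convention: query i sees keys j <= i.
        n_keys = min(i + 1, len(k)) if is_causal else len(k)
        scores = [scale * sum(a * b for a, b in zip(qi, k[j])) for j in range(n_keys)]
        m = max(scores)  # subtract the row max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        denom = sum(exps)
        lse.append(m + math.log(denom))
        out.append([sum(e * v[j][c] for j, e in enumerate(exps)) / denom
                    for c in range(dv)])
    return out, lse
```

The LSE value is what a backward pass would use to recompute the softmax probabilities as exp(score - lse) without storing the full attention matrix.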

WIP:

  • Fix the accuracy issue when seq_len_kv is not divisible by BLK_N (seq_len_kv % BLK_N != 0)
  • Potential performance optimizations
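
The PR does not say what causes the accuracy issue. As an illustration only, the sketch below shows how a tiled (online softmax) loop over the keys can handle a final tile that is shorter than BLK_N. The function name `attention_row_blocked` and its parameters are hypothetical, and the loop clamps the last tile in plain Python, whereas a fixed-width kernel would typically mask the out-of-range scores to -inf instead.

```python
import math

def attention_row_blocked(q_row, k, v, blk_n, scale):
    """Attention for one query row, visiting keys in tiles of blk_n."""
    seq_len_kv, dv = len(k), len(v[0])
    m = -math.inf      # running max of the scores seen so far
    l = 0.0            # running sum of exp(score - m)
    acc = [0.0] * dv   # running unnormalized output
    for start in range(0, seq_len_kv, blk_n):
        # Clamp the tile so the last one stops at seq_len_kv. Keys past the
        # end must not contribute to the max, the sum, or the output.
        stop = min(start + blk_n, seq_len_kv)
        scores = [scale * sum(a * b for a, b in zip(q_row, k[j]))
                  for j in range(start, stop)]
        m_new = max(m, max(scores))
        # Rescale earlier partial results; exp(-inf) is 0.0 on the first tile.
        correction = math.exp(m - m_new)
        l *= correction
        acc = [a * correction for a in acc]
        for j, s in zip(range(start, stop), scores):
            p = math.exp(s - m_new)
            l += p
            acc = [a + p * x for a, x in zip(acc, v[j])]
        m = m_new
    # Normalized output row and its log-sum-exp.
    return [a / l for a in acc], m + math.log(l)
```

Running the function with a tile size that does not divide seq_len_kv (for example 5 keys with blk_n=2) and with a single tile covering all keys should give the same output and LSE up to floating-point rounding, which makes it a simple check for tail handling.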

@rolandschulz

This should be based on #547

@cfgfung cfgfung force-pushed the sdpa_fwd_upstream branch from f99c157 to 2c56c1e Compare October 9, 2025 18:09
@cfgfung cfgfung changed the title First version of SDPA Fwd First version of SDPA Fwd - No need to review Oct 12, 2025
cfgfung commented Oct 12, 2025

> This should be based on #547

Hi Roland,

Thanks for the review. Right now this PR is mainly for internal use and PyTorch integration, so please skip the review for now.

@Antonyvance Antonyvance added the "redesign required" label (Implementation require a redesign) Oct 17, 2025