@sunjiweiswift commented Sep 10, 2025

Missing features

  1. GQA optimization: compute multiple Q heads that share a single K head in the same GEMM
  2. Sliding window -- done
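
To illustrate item 1, here is a minimal pure-Python sketch of the idea, not the kernel code in this PR. It assumes the Q heads of one GQA group can be stacked along the M dimension so that a single GEMM against the shared K produces the scores for every head in the group. The helper names (`matmul_nt`, `scores_per_head`, `scores_packed`) and the tiny shapes are hypothetical and chosen only for illustration.

```python
def matmul_nt(a, b):
    """Return a @ b^T for row-major lists of lists (a: m x d, b: n x d)."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b]
            for row_a in a]


def scores_per_head(q_heads, k):
    """Baseline: one GEMM per Q head, each reusing the same shared K."""
    return [matmul_nt(q, k) for q in q_heads]


def scores_packed(q_heads, k):
    """Packed: stack the group's Q heads along M, run one GEMM, then split."""
    seq_q = len(q_heads[0])
    stacked = [row for q in q_heads for row in q]  # (G * seq_q) x d
    out = matmul_nt(stacked, k)                    # (G * seq_q) x seq_k
    return [out[g * seq_q:(g + 1) * seq_q] for g in range(len(q_heads))]


# Hypothetical group: 2 Q heads sharing one K head, seq_q=2, seq_k=3, d=2.
q_heads = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 1.0], [1.0, 2.0]],
]
k = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Both paths produce identical score matrices for every head in the group.
assert scores_packed(q_heads, k) == scores_per_head(q_heads, k)
```

In a real kernel the payoff would presumably be that the K tile is loaded once per group rather than once per Q head. The sketch only shows that the packed and per-head results agree.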

@sunjiweiswift
Author

@rolandschulz @tdeng5 @jiyang1011 please review

@tdeng5 tdeng5 self-requested a review September 10, 2025 09:23
@sunjiweiswift sunjiweiswift force-pushed the flash_chunk_prefill branch 8 times, most recently from 0505aed to 8c5d3ce Compare September 17, 2025 02:48
@sunjiweiswift
Author

@rolandschulz @tdeng5 @jiyang1011 please review

@sunjiweiswift
Author

@rolandschulz pls review

Valentine233 and others added 9 commits September 30, 2025 23:40
This change imports `SYCLCompat` into the cutlass-sycl repo as `compat`.
Previous dependencies on `syclcompat` are changed to `compat`.
This PR also fixes some `SYCLCompat` failures in oneAPI 2025.2.

---------

Co-authored-by: Roland Schulz <[email protected]>
5 participants