Popular repositories
- flash-attention (Public, forked from vllm-project/flash-attention)
  Fast and memory-efficient exact attention
  Python
83 contributions in the last year
[Contribution calendar: weekly heatmap from April to April, shaded from no contributions to high contributions]
Contribution activity
April 2025
Created 1 commit in 1 repository
Created a pull request in vllm-project/vllm that received 3 comments
[ROCM] Bind triton version to 3.2 in requirements-built.txt
See above. This should fix the baremetal rocm build. Another package is pulling in triton version 3.3 which doesn't match with the version that tor…
+1 −0 lines changed • 3 comments
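The fix described amounts to pinning the triton dependency in the requirements file named in the title. A minimal sketch of what such a pin might look like; the exact line and patch version are assumptions, not taken from the PR diff:

    # Assumed pin: hold triton at 3.2 so a transitive dependency
    # cannot pull in the mismatched 3.3 release
    triton==3.2.0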
Reviewed 13 pull requests in 1 repository
vllm-project/vllm (13 pull requests)
- [FEAT] [ROCm]: Support AITER Linear (Apr 16)
- [Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (Apr 16)
- [Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (Apr 15)
- [Kernel][ROCM] Upstream prefix prefill speed up for vLLM V1 (Apr 14)
- [Misc][ROCm] Restrict Aiter moe to specific models. (Apr 14)
- [rocm][V0] fix selection logic for custom PA in V0 (Apr 11)
- [Performance][ROCm] Add skinny gemms for unquantized linear on ROCm (Apr 10)
- [ROCm][Kernel] Using platform dependent num_threads in paged attention (Apr 8)
- [Bug] [ROCm] Fix Llama 4 Enablement Bug on ROCm: V0 ROCmFlashAttentionImpl and Triton Fused MoE bugs (Apr 8)
- [Bugfix] Handle process_weights_after_loading for QKVCrossParallelLinear (Apr 1)
- [ROCm][Build][Bugfix] Bring the base dockerfile in sync with the ROCm fork (Apr 1)
- [ROCm][Bugfix] Use platform specific FP8 dtype (Apr 1)
- [ROCm][Bugfix] Bring back fallback to eager mode removed in #14917, but for ROCm only (Apr 1)