Popular repositories
- flash-attention (Public, forked from vllm-project/flash-attention)
  Fast and memory-efficient exact attention
  Python
83 contributions in the last year
[Contribution calendar: weekly heatmap from April to April, shaded from no contributions to high contributions]
Contribution activity
April 2025
Created 1 commit in 1 repository
Created a pull request in vllm-project/vllm that received 3 comments
[ROCM] Bind triton version to 3.2 in requirements-built.txt
See above. This should fix the baremetal rocm build. Another package is pulling in triton version 3.3 which doesn't match with the version that tor…
+1 −0 lines changed • 3 comments
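The fix described amounts to pinning the triton dependency in the requirements file named in the title. A minimal sketch of what such a pin might look like; the exact line and patch version are assumptions, not taken from the PR diff:

    # Assumed pin: hold triton at 3.2 so a transitive dependency
    # cannot pull in the mismatched 3.3 release
    triton==3.2.0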
Reviewed 13 pull requests in 1 repository
vllm-project/vllm (13 pull requests)
- [FEAT] [ROCm]: Support AITER Linear (Apr 16)
- [Quantization][FP8] Add support for FP8 models with input_scale for output projection and QK quantization (Apr 16)
- [Kernel][Triton][FP8] Adding fp8 and variable length sequence support to Triton FAv2 kernel (Apr 15)
- [Kernel][ROCM] Upstream prefix prefill speed up for vLLM V1 (Apr 14)
- [Misc][ROCm] Restrict Aiter moe to specific models. (Apr 14)
- [rocm][V0] fix selection logic for custom PA in V0 (Apr 11)
- [Performance][ROCm] Add skinny gemms for unquantized linear on ROCm (Apr 10)
- [ROCm][Kernel] Using platform dependent num_threads in paged attention (Apr 8)
- [Bug] [ROCm] Fix Llama 4 Enablement Bug on ROCm: V0 ROCmFlashAttentionImpl and Triton Fused MoE bugs (Apr 8)
- [Bugfix] Handle process_weights_after_loading for QKVCrossParallelLinear (Apr 1)
- [ROCm][Build][Bugfix] Bring the base dockerfile in sync with the ROCm fork (Apr 1)
- [ROCm][Bugfix] Use platform specific FP8 dtype (Apr 1)
- [ROCm][Bugfix] Bring back fallback to eager mode removed in #14917, but for ROCm only (Apr 1)