Open pull requests: NVIDIA/TransformerEngine
[PyTorch] Use autograd.backward to capture cudagraph backward (#2518, opened Dec 16, 2025 by buptzyb; 13 tasks)
Remove test skip logic for GEMM-AR tests (#2516, opened Dec 16, 2025 by vcherepanov-nv; 4 of 13 tasks)
[common] Add support for cuBLASLt GEMM for GroupedTensor (#2502, opened Dec 10, 2025 by pggPL; label: MoE; 8 tasks done)
Add logic for block-scaled tensors with GEMM swizzled scales (#2486, opened Dec 6, 2025 by timmoon10; labels: enhancement, MoE, performance, refactor; 14 of 19 tasks)
[JAX] Remove unused TE DPA module dtype, which fixes cuDNN backend detection to properly use input dtypes (#2485, opened Dec 5, 2025 by jberchtold-nvidia; 8 of 13 tasks)
[JAX] Estimate post-RHT amax using regular amax (#2479, opened Dec 4, 2025 by jberchtold-nvidia; draft; label: fp4; 13 tasks)
Add support for SWA (left, right) with FusedAttention (#2477, opened Dec 4, 2025 by sudhakarsingh27; 2.11.0; 22 of 28 tasks)
[PyTorch] Documentation for op fuser API (#2447, opened Dec 3, 2025 by timmoon10; label: documentation; 8 of 13 tasks)
Fix TransformerEngine 2.9.0 build (torch 2.9.1, used by SGLang 0.5.5) (#2445, opened Dec 2, 2025 by yiakwy-xpu-ml-framework-team; 13 tasks)
Add ccache support to TE and use it in GitHub Actions (#2444, opened Dec 2, 2025 by ptrendx; label: build; 1 of 6 tasks)
[JAX] Better error message when Q, K, V are sharded differently (#2440, opened Dec 2, 2025 by jberchtold-nvidia; 8 of 13 tasks)
[JAX] Add tutorial for integrating TE/JAX quantization into an existing framework (#2423, opened Nov 26, 2025 by jberchtold-nvidia; 8 of 13 tasks)