
Actions: NVIDIA/TransformerEngine

Documentation


3,022 workflow run results


[PyTorch] Fix autocast deprecation warnings
Documentation #5535: Pull request #1277 synchronize by yaox12
November 5, 2024 02:17 1m 1s yaox12:xiny/fix_autocast_warning
[PyTorch] Debug CUDA graph support with operation-based API
Documentation #5534: Pull request #1117 synchronize by timmoon10
November 5, 2024 00:56 57s timmoon10:cuda-graph-ops
[PyTorch] Debug checkpointing with operation-based API
Documentation #5533: Pull request #1063 synchronize by timmoon10
November 5, 2024 00:55 56s timmoon10:ops-checkpointing
TP communication overlap: enable the overlap between GEMM chunk at Ho…
Documentation #5532: Pull request #1311 opened by erhoo82
November 4, 2024 17:28 1m 5s erhoo82:tp_rs_bf16
[TE/JAX] XLA FFI calls for three cast transpose functions
Documentation #5531: Pull request #1310 synchronize by pre-commit-ci bot
November 4, 2024 17:06 1m 10s huanghua1994:xla-ffi-act-trans
[TE/JAX] XLA FFI calls for layer norm and RMS norm
Documentation #5529: Pull request #1290 synchronize by phu0ngng
November 4, 2024 15:59 1m 19s huanghua1994:xla-custom-call-ffi
[PyTorch] Fix autocast deprecation warnings
Documentation #5528: Pull request #1277 synchronize by yaox12
November 4, 2024 09:48 1m 18s yaox12:xiny/fix_autocast_warning
[PyTorch] Userbuffers support in operation-based API
Documentation #5519: Pull request #1142 synchronize by timmoon10
November 1, 2024 21:17 1m 7s timmoon10:ub-ops
[JAX] Expose cp params to jax DPA api
Documentation #5517: Pull request #1292 synchronize by mgoldfarb-nvidia
November 1, 2024 20:40 1m 3s kocchop:faysal/expose-cp-to-jax-dpa
[JAX] Fix for Disable FusedAttn with FFI by default
Documentation #5516: Pull request #1304 synchronize by phu0ngng
November 1, 2024 19:43 1m 11s phu0ngng:fused_attn_ffi
[JAX] Fix for Disable FusedAttn with FFI by default
Documentation #5515: Pull request #1304 opened by phu0ngng
November 1, 2024 15:49 1m 0s phu0ngng:fused_attn_ffi
[PyTorch] Make FP8 MHA work with RoPE when CP is on
Documentation #5514: Pull request #1297 synchronize by yaox12
November 1, 2024 04:32 1m 0s yaox12:xiny/fp8_mha_with_rope_cp
[PyTorch] Userbuffers support in operation-based API
Documentation #5511: Pull request #1142 synchronize by pre-commit-ci bot
October 31, 2024 23:05 1m 10s timmoon10:ub-ops
[PyTorch] Userbuffers support in operation-based API
Documentation #5510: Pull request #1142 synchronize by timmoon10
October 31, 2024 23:04 1m 12s timmoon10:ub-ops
[JAX] Expose cp params to jax DPA api
Documentation #5509: Pull request #1292 synchronize by mgoldfarb-nvidia
October 31, 2024 22:21 1m 38s kocchop:faysal/expose-cp-to-jax-dpa
[PyTorch] Add heuristics for intializing FP8 params
Documentation #5508: Pull request #1300 synchronize by timmoon10
October 31, 2024 21:54 56s timmoon10:fp8-heuristic
Support using fp16 master weights and fp16/fp8 optimizer states in FusedAdam
Documentation #5507: Pull request #1078 synchronize by timmoon10
October 31, 2024 20:46 58s kunlunl:mx_fp16