Actions: NVIDIA/TransformerEngine

Deploy nightly docs

533 workflow runs

[PyTorch] Fix FP8 activation recompute (#1254)
Deploy nightly docs #675: Commit a518151 pushed by ksivaman
October 16, 2024 15:27 1m 40s main
Upgrade pylint to 3.3.1 (#1257)
Deploy nightly docs #674: Commit 6e90fcb pushed by ksivaman
October 16, 2024 15:27 1m 8s main
[PyTorch] Drop FA as an installation requirement (#1226)
Deploy nightly docs #673: Commit 161b1d9 pushed by cyanguwa
October 16, 2024 02:35 1m 5s main
fix assertion bug for SWA API in TE-JAX (#1242)
Deploy nightly docs #672: Commit 43b9e1e pushed by phu0ngng
October 16, 2024 01:00 1m 4s main
[PyTorch] Build custom ORT ops before running ONNX export tests (#1252)
Deploy nightly docs #671: Commit f6b766b pushed by timmoon10
October 16, 2024 00:34 1m 36s main
Create README.md for examples/ (#1221)
Deploy nightly docs #670: Commit 54aa12a pushed by ksivaman
October 15, 2024 15:41 1m 8s main
Check for backend support in Jax context parallel fused attention tes…
Deploy nightly docs #669: Commit 20c55e4 pushed by phu0ngng
October 15, 2024 14:39 1m 6s main
Do not link against CUDA driver when building (#1240)
Deploy nightly docs #668: Commit 86f07be pushed by timmoon10
October 14, 2024 17:46 1m 5s main
[PyTorch] Let Fused RoPE support CP with THD format (#1238)
Deploy nightly docs #667: Commit 55dcbb4 pushed by xrennvidia
October 12, 2024 05:18 1m 4s main
Add FlashAttention3 to CP implementations (#1232)
Deploy nightly docs #666: Commit b36bd0a pushed by cyanguwa
October 11, 2024 18:41 1m 31s main
Fix bug in torch compile and seqdim is integer (#1217)
Deploy nightly docs #665: Commit 9ee2dbd pushed by ksivaman
October 11, 2024 17:59 1m 27s main
Small fixes to Float8Tensor (#1225)
Deploy nightly docs #664: Commit 3b89c36 pushed by ptrendx
October 10, 2024 17:48 1m 12s main
[JAX] Expose sliding window attn to TE-JAX API (#1205)
Deploy nightly docs #663: Commit 85e60e6 pushed by huanghua1994
October 10, 2024 16:37 1m 7s main
[PyTorch] Improve get_qkv_layout (#1214)
Deploy nightly docs #662: Commit 5b6546c pushed by cyanguwa
October 9, 2024 16:48 1m 12s main
[PyTorch] Add documentation for FP8 attention checkpointing (#1223)
Deploy nightly docs #661: Commit 2d87552 pushed by cyanguwa
October 9, 2024 16:47 1m 15s main
[PyTorch] Debug dtype casting in operation-based API (#1202)
Deploy nightly docs #660: Commit 5b89f1a pushed by timmoon10
October 9, 2024 03:58 1m 14s main
[PyTorch] Miscellaneous fixes for FA3 attention (#1174)
Deploy nightly docs #659: Commit e762592 pushed by cyanguwa
October 8, 2024 18:06 1m 3s main
Fix cuDNN sliding window size (#1212)
Deploy nightly docs #658: Commit c3b3cd2 pushed by cyanguwa
October 7, 2024 21:31 1m 43s main
Hierarchical CP implementation (Ulysses + Ring) (#1209)
Deploy nightly docs #657: Commit c24a4c4 pushed by cyanguwa
October 7, 2024 21:14 1m 3s main
Tests for distributed (#1196)
Deploy nightly docs #656: Commit 60f738f pushed by ptrendx
October 7, 2024 16:44 1m 12s main
[PyTorch] remove duplicate code (#1215)
Deploy nightly docs #655: Commit f8eb799 pushed by ksivaman
October 6, 2024 15:18 1m 16s main
[PyTorch] Minor optimizations to reduce CPU overheads in modules (#1191)
Deploy nightly docs #654: Commit 9d976bc pushed by timmoon10
October 4, 2024 03:13 1m 7s main
[PyTorch] Move block_table argument to FA varlen function (#1222)
Deploy nightly docs #653: Commit 10cceae pushed by cyanguwa
October 3, 2024 15:58 1m 13s main
Removed the unused options from GroupedLinear docs and fixed the bug …
Deploy nightly docs #652: Commit fb74961 pushed by ksivaman
October 1, 2024 02:33 1m 9s main
[PyTorch] Fix distributed testing (#1219)
Deploy nightly docs #651: Commit 46075b9 pushed by ksivaman
October 1, 2024 02:33 1m 6s main