Commit 7e593c3
Add num_splits support for FA3 backend (#2380)
* [Common] Deleted unused header (#2324)
Deleted unused header
Signed-off-by: Oleg Goncharov <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* [JAX] L1_jax_distributed_test suite with individual executions (#2321)
* L1 rework
Signed-off-by: Phuong Nguyen <[email protected]>
* comment out test_multi_process_grouped_gemm for now
Signed-off-by: Phuong Nguyen <[email protected]>
* rm e5m2 from test norm + MXFP8
Signed-off-by: Phuong Nguyen <[email protected]>
---------
Signed-off-by: Phuong Nguyen <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* for branch
Signed-off-by: Peter Dykas <[email protected]>
* clean up and tests
Signed-off-by: Peter Dykas <[email protected]>
* change tests
Signed-off-by: Peter Dykas <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Signed-off-by: Peter Dykas <[email protected]>
* [PyTorch debug] Fixes to debug tests failures (#2268)
* code drop
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix:
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
---------
Signed-off-by: Pawel Gadzinski <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Peter Dykas <[email protected]>
* [PyTorch Debug] Add max_blockwise_dynamic_range stats (#2137)
* code drop
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fixes
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
---------
Signed-off-by: Pawel Gadzinski <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Peter Dykas <[email protected]>
* [JAX] Fix bug with pre scale bias (#2300)
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
---------
Signed-off-by: Pawel Gadzinski <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* [JAX] Try to use pre-downloaded dataset artifacts first (#2345)
* Try to use pre-downloaded dataset artifacts first
Signed-off-by: Jeremy Berchtold <[email protected]>
* Set HF_HUB_OFFLINE to disable any network calls to HF when the
pre-downloaded dataset is available
Signed-off-by: Jeremy Berchtold <[email protected]>
---------
Signed-off-by: Jeremy Berchtold <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* Fix out of bounds access in the FP4 dequantize kernel (#2346)
Signed-off-by: Przemek Tredak <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* Make FP8 weights compatible with older MCore version (#2342)
* Make cast_master_weights_to_fp8 compatible with older MCore version
Signed-off-by: kunlunl <[email protected]>
* Rename keep_columnwise to manual_post_all_gather_processing & Optimize unit test
Signed-off-by: kunlunl <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Remove redundant _test_mini_optimizer()
Signed-off-by: kunlunl <[email protected]>
---------
Signed-off-by: kunlunl <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* [JAX] Add test to check jaxpr that amax is reused for nvfp4 recipe (#2348)
* Add test to check jaxpr that amax is reused for nvfp4 recipe
Signed-off-by: Jeremy Berchtold <[email protected]>
* Move test to test_helper.py and rename file
Signed-off-by: Jeremy Berchtold <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Jeremy Berchtold <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Peter Dykas <[email protected]>
* Fix sharding of segment position to match id in ring attention. (#2349)
Signed-off-by: Peter Dykas <[email protected]>
* Disable cuDNN attention for known IMA and NaNs (#2344)
* Fix cuDNN backend selection for more cases. Add CG as an option as well
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* fix logic
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Fix cuDNN checks
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Add more checks
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Fix cuDNN version
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Fix error message
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
* Add check for window size
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
---------
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* [JAX] Default to fused attention in JAX DPA (#2363)
* Default to fused attention in JAX DPA
Signed-off-by: Kshitij Lakhani <[email protected]>
* Consolidate documentation for DPA in JAX
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Kshitij Lakhani <[email protected]>
* Correctly update the documentation for defaults in JAX DPA
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Kshitij Lakhani <[email protected]>
---------
Signed-off-by: Kshitij Lakhani <[email protected]>
Signed-off-by: Kshitij Lakhani <[email protected]>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Peter Dykas <[email protected]>
* Update cudnn frontend to v1.16.0 (#2362)
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* [common] Remove kvpacked and qkvpacked attention functions for every kernel type. (#2287)
* code drop
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix
Signed-off-by: Pawel Gadzinski <[email protected]>
* deprecated compile time warning + \warning -> \deprecated
Signed-off-by: Pawel Gadzinski <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Pawel Gadzinski <[email protected]>
Signed-off-by: Charlene Yang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Charlene Yang <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* Move Triton to common (#2359)
* move triton to common and change paths
Signed-off-by: tdophung <[email protected]>
* Formatting
Signed-off-by: tdophung <[email protected]>
---------
Signed-off-by: tdophung <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
* [JAX] Fused layers argument default values changed (#2347)
* Changing default activations in MLP, TransformerLayer, dropout rate after FC1 to 0, and return_layernorm_output to False
Signed-off-by: tdophung <[email protected]>
* Fixing the failing tests by hard coding arguments to the previous values instead of relying on newer default values
Signed-off-by: tdophung <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: tdophung <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Peter Dykas <[email protected]>
* remove comment from gpt
Signed-off-by: Peter Dykas <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* minor changes for num_splits logic
Signed-off-by: Charlene Yang <[email protected]>
* replace None with 1 as default
Signed-off-by: Charlene Yang <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix last commit
Signed-off-by: Charlene Yang <[email protected]>
* fix docstring
Signed-off-by: Charlene Yang <[email protected]>
* fix dtype in pack/unpack when FP8
Signed-off-by: Charlene Yang <[email protected]>
* add fused_attn_supported constraint for some tests
Signed-off-by: Charlene Yang <[email protected]>
* update FA3 installation commands
Signed-off-by: Charlene Yang <[email protected]>
* update FA3 installation commands in DPA
Signed-off-by: Charlene Yang <[email protected]>
* separate fused fp8 and f16 flags in tests
Signed-off-by: Charlene Yang <[email protected]>
* initialize fused_attn_supported_f16
Signed-off-by: Charlene Yang <[email protected]>
* fix FA installation in L3 tests
Signed-off-by: Charlene Yang <[email protected]>
---------
Signed-off-by: Oleg Goncharov <[email protected]>
Signed-off-by: Peter Dykas <[email protected]>
Signed-off-by: Phuong Nguyen <[email protected]>
Signed-off-by: Pawel Gadzinski <[email protected]>
Signed-off-by: Jeremy Berchtold <[email protected]>
Signed-off-by: Przemek Tredak <[email protected]>
Signed-off-by: kunlunl <[email protected]>
Signed-off-by: Kirthi Shankar Sivamani <[email protected]>
Signed-off-by: Kshitij Lakhani <[email protected]>
Signed-off-by: Kshitij Lakhani <[email protected]>
Signed-off-by: Charlene Yang <[email protected]>
Signed-off-by: tdophung <[email protected]>
Co-authored-by: Oleg Goncharov <[email protected]>
Co-authored-by: Phuong Nguyen <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Peter Dykas <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Paweł Gadziński <[email protected]>
Co-authored-by: jberchtold-nvidia <[email protected]>
Co-authored-by: Przemyslaw Tredak <[email protected]>
Co-authored-by: Kunlun Li <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Co-authored-by: Michael Goldfarb <[email protected]>
Co-authored-by: Kirthi Shankar Sivamani <[email protected]>
Co-authored-by: Kshitij Lakhani <[email protected]>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Teddy Do <[email protected]>
Co-authored-by: wdykas <[email protected]>
1 parent 1df4a69 commit 7e593c3
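The commit message does not spell out what `num_splits` controls. Assuming it follows the split-KV meaning used in FlashAttention (the key/value sequence is processed in `num_splits` chunks and the partial results are merged using their log-sum-exp values), the following pure-Python sketch illustrates why a default of 1 is equivalent to ordinary, unsplit attention. The function names `attend` and `attend_split` are hypothetical and are not part of Transformer Engine or FlashAttention.

```python
import math


def attend(q, keys, values):
    """Single-query softmax attention over one chunk of keys/values.

    Returns (output, lse), where lse is the log-sum-exp of the scaled scores.
    """
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]
    return out, m + math.log(total)


def attend_split(q, keys, values, num_splits=1):
    """Same attention, computed over up to num_splits KV chunks and merged.

    Illustrative only: a real kernel would run the chunks in parallel.
    """
    n = len(keys)
    chunk = -(-n // num_splits)  # ceiling division
    partials = [
        attend(q, keys[i:i + chunk], values[i:i + chunk])
        for i in range(0, n, chunk)
    ]
    # Merge step: each partial output is weighted by exp(lse_i - lse_total).
    m = max(lse for _, lse in partials)
    lse_total = m + math.log(sum(math.exp(lse - m) for _, lse in partials))
    out = [0.0] * len(values[0])
    for partial_out, lse in partials:
        w = math.exp(lse - lse_total)
        for d, x in enumerate(partial_out):
            out[d] += w * x
    return out
```

With `num_splits=1` there is a single chunk whose merge weight is exactly 1, so the result is the plain attention output; larger values should agree with it up to floating-point rounding.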
File tree
6 files changed (+126, -48 lines changed)
- qa/L3_pytorch_FA_versions_test
- tests/pytorch
- attention
- transformer_engine/pytorch/attention/dot_product_attention
Lines changed: 2 additions & 0 deletions
Lines changed: 7 additions & 0 deletions
Lines changed: 21 additions & 3 deletions