Open
Description
For float8 training, the test_everything.sh script requires multiple GPUs for its FSDP/TP tests, so we currently don't run it in CI, which isn't configured for multi-device jobs. We should figure out how to run these multi-device tests in CI. This would also be useful for some of our new MoE training parallelism tests.
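
One possible starting point, sketched below under assumptions: gate each test group on the number of visible GPUs, so the same entry point can run on today's single-device runners and on a future multi-device runner. The group names, the `min_gpus` values, and the helpers `SuiteGroup`, `detect_gpu_count`, and `select_groups` are all hypothetical and not part of the repo; only `torch.cuda.device_count()` is an existing PyTorch call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteGroup:
    """A named group of tests and the minimum GPU count it needs (hypothetical)."""
    name: str
    min_gpus: int


# Hypothetical grouping; the real split inside test_everything.sh may differ.
GROUPS = [
    SuiteGroup("float8_single_gpu", 1),
    SuiteGroup("float8_fsdp", 2),
    SuiteGroup("float8_tp", 2),
    SuiteGroup("moe_parallelism", 2),
]


def detect_gpu_count() -> int:
    """Return the number of visible CUDA devices, or 0 if torch is not installed."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def select_groups(groups, available_gpus):
    """Split groups into (runnable, skipped) names for the given GPU count."""
    runnable = [g.name for g in groups if g.min_gpus <= available_gpus]
    skipped = [g.name for g in groups if g.min_gpus > available_gpus]
    return runnable, skipped


if __name__ == "__main__":
    runnable, skipped = select_groups(GROUPS, detect_gpu_count())
    print("run:", ", ".join(runnable) or "(none)")
    print("skip:", ", ".join(skipped) or "(none)")
```

Reporting the skipped groups explicitly would make it visible in CI logs when the multi-device tests did not run, rather than having them silently pass.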