Nvidia CI

Nvidia CI #6

Triggered via schedule June 2, 2026 03:28
Status Failure
Total duration 42s
Artifacts 1
Matrix: DeepSpeed CI / Setup
Matrix: Example CI / Setup
Matrix: Model CI / Setup
Matrix: Quantization CI / Setup
Matrix: Torch pipeline CI / Setup
Matrix: Trainer/FSDP CI / Setup
Matrix: DeepSpeed CI / Examples directory
Matrix: DeepSpeed CI / PyTorch pipelines
Matrix: DeepSpeed CI / Torch CUDA extension tests
Matrix: Example CI / Examples directory
Matrix: Example CI / PyTorch pipelines
Matrix: Example CI / Torch CUDA extension tests
Matrix: Model CI / Examples directory
Matrix: Model CI / PyTorch pipelines
Matrix: Model CI / Torch CUDA extension tests
Matrix: Quantization CI / Examples directory
Matrix: Quantization CI / PyTorch pipelines
Matrix: Quantization CI / Torch CUDA extension tests
Matrix: Torch pipeline CI / Examples directory
Matrix: Torch pipeline CI / PyTorch pipelines
Matrix: Torch pipeline CI / Torch CUDA extension tests
Matrix: Trainer/FSDP CI / Examples directory
Matrix: Trainer/FSDP CI / PyTorch pipelines
Matrix: Trainer/FSDP CI / Torch CUDA extension tests
Matrix: DeepSpeed CI / run_models_gpu
Waiting for pending jobs
Matrix: DeepSpeed CI / run_quantization_torch_gpu
Matrix: DeepSpeed CI / run_trainer_and_fsdp_gpu
Waiting for pending jobs
Matrix: Example CI / run_models_gpu
Waiting for pending jobs
Matrix: Example CI / run_quantization_torch_gpu
Matrix: Example CI / run_trainer_and_fsdp_gpu
Waiting for pending jobs
Matrix: Model CI / run_models_gpu
Waiting for pending jobs
Matrix: Model CI / run_quantization_torch_gpu
Matrix: Model CI / run_trainer_and_fsdp_gpu
Waiting for pending jobs
Matrix: Quantization CI / run_models_gpu
Waiting for pending jobs
Matrix: Quantization CI / run_quantization_torch_gpu
Matrix: Quantization CI / run_trainer_and_fsdp_gpu
Waiting for pending jobs
Matrix: Torch pipeline CI / run_models_gpu
Waiting for pending jobs
Matrix: Torch pipeline CI / run_quantization_torch_gpu
Matrix: Torch pipeline CI / run_trainer_and_fsdp_gpu
Waiting for pending jobs
Matrix: Trainer/FSDP CI / run_models_gpu
Waiting for pending jobs
Matrix: Trainer/FSDP CI / run_quantization_torch_gpu
Matrix: Trainer/FSDP CI / run_trainer_and_fsdp_gpu
Waiting for pending jobs
DeepSpeed CI / Extract warnings in CI artifacts
Example CI / Extract warnings in CI artifacts
0s
Model CI / Extract warnings in CI artifacts
22s
Quantization CI / Extract warnings in CI artifacts
0s
Torch pipeline CI / Extract warnings in CI artifacts
0s
Trainer/FSDP CI / Extract warnings in CI artifacts
0s
DeepSpeed CI / Slack Report / Send results to webhook
14s
Example CI / Slack Report / Send results to webhook
14s
Model CI / Slack Report / Send results to webhook
13s
Quantization CI / Slack Report / Send results to webhook
16s
Torch pipeline CI / Slack Report / Send results to webhook
17s
Trainer/FSDP CI / Slack Report / Send results to webhook
16s
DeepSpeed CI / Check new failures / check_new_failures
Example CI / Check new failures / check_new_failures
Model CI / Check new failures / check_new_failures
Quantization CI / Check new failures / check_new_failures
Torch pipeline CI / Check new failures / check_new_failures
Trainer/FSDP CI / Check new failures / check_new_failures

Annotations

18 errors and 9 warnings
DeepSpeed CI / Torch CUDA extension tests (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
Model CI / Setup (aws-g5-12xlarge-cache)
Required runner group 'aws-g5-12xlarge-cache' not found
Trainer/FSDP CI / Setup (aws-g5-12xlarge-cache)
The strategy configuration was canceled because "trainer-fsdp-ci.setup.aws-g5-4xlarge-cache" failed
Trainer/FSDP CI / Setup (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
Torch pipeline CI / PyTorch pipelines (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
Quantization CI / Setup (aws-g5-12xlarge-cache)
Required runner group 'aws-g5-12xlarge-cache' not found
Torch pipeline CI / PyTorch pipelines (aws-g5-12xlarge-cache)
Required runner group 'aws-g5-12xlarge-cache' not found
Example CI / Examples directory (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
DeepSpeed CI / Torch CUDA extension tests (aws-g5-12xlarge-cache)
Required runner group 'aws-g5-12xlarge-cache' not found
Model CI / Setup (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
Quantization CI / Setup (aws-g5-4xlarge-cache)
The strategy configuration was canceled because "quantization-ci.setup.aws-g5-12xlarge-cache" failed
DeepSpeed CI / Slack Report / Send results to webhook
Process completed with exit code 1.
Example CI / Slack Report / Send results to webhook
Process completed with exit code 1.
Trainer/FSDP CI / Slack Report / Send results to webhook
Process completed with exit code 1.
Quantization CI / Slack Report / Send results to webhook
Process completed with exit code 1.
Torch pipeline CI / Slack Report / Send results to webhook
Process completed with exit code 1.
Model CI / Extract warnings in CI artifacts
Process completed with exit code 2.
Model CI / Slack Report / Send results to webhook
Process completed with exit code 1.
Setup
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/upload-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
DeepSpeed CI / Slack Report / Send results to webhook
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
Example CI / Slack Report / Send results to webhook
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
Trainer/FSDP CI / Slack Report / Send results to webhook
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
Quantization CI / Slack Report / Send results to webhook
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
Torch pipeline CI / Slack Report / Send results to webhook
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
Model CI / Extract warnings in CI artifacts
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4, actions/upload-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
Model CI / Extract warnings in CI artifacts
No files were found with the provided path: warnings_in_ci/selected_warnings.json. No artifacts will be uploaded.
Model CI / Slack Report / Send results to webhook
Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v4, actions/download-artifact@v4. Actions will be forced to run with Node.js 24 by default starting June 16th, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
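
Most of the error annotations above share one message, "Required runner group '...' not found", split across two runner groups. As a rough aid for triage, the sketch below tallies that message per group from pasted annotation text. The function name `count_missing_runner_groups` and the `sample` string are hypothetical, introduced only for illustration; the sample holds a few lines copied from this run, not the full list.

```python
import re
from collections import Counter

# Matches the runner-group error text exactly as it appears in the annotations.
MISSING_GROUP = re.compile(r"Required runner group '([^']+)' not found")


def count_missing_runner_groups(annotation_text: str) -> Counter:
    """Count how many annotations report each missing runner group."""
    return Counter(MISSING_GROUP.findall(annotation_text))


# Hypothetical sample: a few annotation lines copied from this run.
sample = """\
DeepSpeed CI / Torch CUDA extension tests (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
Model CI / Setup (aws-g5-12xlarge-cache)
Required runner group 'aws-g5-12xlarge-cache' not found
Trainer/FSDP CI / Setup (aws-g5-4xlarge-cache)
Required runner group 'aws-g5-4xlarge-cache' not found
"""

if __name__ == "__main__":
    # Print one line per runner group, most frequent first.
    for group, n in count_missing_runner_groups(sample).most_common():
        print(f"{group}: {n}")
```

Run against the full annotation text, a tally like this separates the runner-group failures from the downstream "Process completed with exit code" errors, which the pattern does not match.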

Artifacts

Produced during runtime
Name          Size       Digest
setup_values  310 Bytes  sha256:569901bcadb954f05e7ba7aedfb886f1bcf0fabda1c38ba02d66180b20d9f097