cuda-perf

Merge branch 'pytorch:main' into main #1

Triggered via push May 8, 2026 13:45

luhenry

pushed f502947

main

Status Cancelled

Total duration 2d 0h 1m 42s

Artifacts –

cuda-perf.yml

on: push

set-parameters

13s

Matrix: export-models

Matrix: benchmark-cuda

upload-benchmark-results

1m 12s

Annotations

39 errors and 2 warnings

export-models (openai/whisper-large-v3-turbo, non-quantized, openai_whisper-large-v3-turbo, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-large-v3-turbo, quantized-int4-weight-only, openai_whisper-large-v3... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (mistralai/Voxtral-Mini-3B-2507, quantized-int4-tile-packed, mistralai_Voxtral-Mini... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (mistralai/Voxtral-Mini-3B-2507, quantized-int4-weight-only, mistralai_Voxtral-Mini... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (SocialLocalMobile/Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed, SocialLoca... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (nvidia/parakeet-tdt, non-quantized, nvidia_parakeet-tdt, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-medium, non-quantized, openai_whisper-medium, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (nvidia/parakeet-tdt, quantized-int4-tile-packed, nvidia_parakeet-tdt, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-medium, quantized-int4-tile-packed, openai_whisper-medium, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-small, quantized-int4-weight-only, openai_whisper-small, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-large-v3-turbo, quantized-int4-tile-packed, openai_whisper-large-v3... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-small, non-quantized, openai_whisper-small, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (mistralai/Voxtral-Mini-3B-2507, non-quantized, mistralai_Voxtral-Mini-3B-2507, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (google/gemma-3-4b-it, quantized-int4-weight-only, google_gemma-3-4b-it, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (google/gemma-3-4b-it, quantized-int4-tile-packed, google_gemma-3-4b-it, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (google/gemma-3-4b-it, non-quantized, google_gemma-3-4b-it, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (nvidia/parakeet-tdt, quantized-int4-weight-only, nvidia_parakeet-tdt, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-medium, quantized-int4-weight-only, openai_whisper-medium, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

export-models (openai/whisper-small, quantized-int4-tile-packed, openai_whisper-small, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-small, quantized-int4-weight-only, openai_whisper-small, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (mistralai/Voxtral-Mini-3B-2507, quantized-int4-weight-only, mistralai_Voxtral-Min... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-small, quantized-int4-tile-packed, openai_whisper-small, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (nvidia/parakeet-tdt, non-quantized, nvidia_parakeet-tdt, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (google/gemma-3-4b-it, quantized-int4-weight-only, google_gemma-3-4b-it, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (SocialLocalMobile/Qwen3.5-35B-A3B-HQQ-INT4, quantized-int4-tile-packed, SocialLoc... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-medium, quantized-int4-tile-packed, openai_whisper-medium, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (google/gemma-3-4b-it, non-quantized, google_gemma-3-4b-it, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (mistralai/Voxtral-Mini-3B-2507, quantized-int4-tile-packed, mistralai_Voxtral-Min... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (nvidia/parakeet-tdt, quantized-int4-weight-only, nvidia_parakeet-tdt, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-medium, quantized-int4-weight-only, openai_whisper-medium, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-large-v3-turbo, quantized-int4-weight-only, openai_whisper-large-v... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (google/gemma-3-4b-it, quantized-int4-tile-packed, google_gemma-3-4b-it, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (nvidia/parakeet-tdt, quantized-int4-tile-packed, nvidia_parakeet-tdt, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-small, non-quantized, openai_whisper-small, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-medium, non-quantized, openai_whisper-medium, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-large-v3-turbo, non-quantized, openai_whisper-large-v3-turbo, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (mistralai/Voxtral-Mini-3B-2507, non-quantized, mistralai_Voxtral-Mini-3B-2507, 50) / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

benchmark-cuda (openai/whisper-large-v3-turbo, quantized-int4-tile-packed, openai_whisper-large-v... / linux-job

The job has exceeded the maximum execution time while awaiting a runner for 24h0m0s

upload-benchmark-results

Could not assume role with OIDC: Not authorized to perform sts:AssumeRoleWithWebIdentity

set-parameters

Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v3, actions/setup-python@v4. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

upload-benchmark-results

Node.js 20 actions are deprecated. The following actions are running on Node.js 20 and may not work as expected: actions/checkout@v3, actions/download-artifact@v4, actions/setup-python@v4, aws-actions/configure-aws-credentials@v4. Actions will be forced to run with Node.js 24 by default starting June 2nd, 2026. Node.js 20 will be removed from the runner on September 16th, 2026. Please check if updated versions of these actions are available that support Node.js 24. To opt into Node.js 24 now, set the FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true environment variable on the runner or in your workflow file. Once Node.js 24 becomes the default, you can temporarily opt out by setting ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION=true. For more information see: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
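The opt-in described in the two warnings above could be expressed in the workflow file roughly as follows. This is a minimal sketch, assuming the variable is set at the top level of a workflow such as cuda-perf.yml; the actual contents of that file are not shown on this page, so the placement is an assumption, and only the variable name and value come from the warning text.

```yaml
# Hypothetical excerpt; the real cuda-perf.yml is not reproduced here.
# A workflow-level env block applies to every job in the workflow.
env:
  # Opt into running JavaScript actions on Node.js 24 ahead of the default switch,
  # as suggested by the deprecation warning.
  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
```

The warning also suggests checking whether newer versions of the listed actions support Node.js 24, which would remove the need for the override.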
