6 changes: 3 additions & 3 deletions .github/workflows/release.yml
@@ -191,10 +191,10 @@ jobs:
pip install --no-deps chatterbox-tts
pip install --no-deps hume-tada

- name: Install PyTorch with CUDA 12.8
- name: Install PyTorch with CUDA 12.8 (nightly for RTX 50-series / sm_120 support)
run: |
pip install torch --index-url https://download.pytorch.org/whl/cu128 --force-reinstall --no-deps
pip install torchaudio --index-url https://download.pytorch.org/whl/cu128 --force-reinstall --no-deps
pip install torch --index-url https://download.pytorch.org/whl/nightly/cu128 --force-reinstall --no-deps
pip install torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128 --force-reinstall --no-deps
Comment on lines +194 to +197
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

When will PyTorch stable release include sm_120 Blackwell GPU support?

💡 Result:

PyTorch stable releases have supported sm_120 (Blackwell GPU compute capability 12.0) since version 2.7.0, released on April 23, 2025. This support is available in pre-built wheels for CUDA 12.8 and newer, which include Blackwell architectures (sm_120). Install via pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128. Official PyTorch 2.7 release notes confirm: "PyTorch 2.7 introduces support for NVIDIA’s new Blackwell GPU architecture and ships pre-built wheels for CUDA 12.8." The RELEASE.md CUDA support matrix lists Blackwell(10.0, 12.0) for CUDA 12.8+ builds in 2.7 and later (e.g., 2.11). Ongoing GitHub issues (e.g., #164342, updated March 2026) report platform-specific issues like Windows WDDM mode or certain workloads, but core kernel support exists in stable builds since 2.7. PyTorch forum posts from maintainers (ptrblck) affirm: "all of our stable and nightly releases starting with PyTorch 2.7.0 and built with CUDA 12.8 or newer already support the Blackwell family including sm_120". As of March 28, 2026 (post-2.11 release on March 23), use CUDA 12.8+ wheels from stable PyTorch for sm_120 support. Nightlies or source builds may resolve edge cases.


Switch from PyTorch nightly to stable; sm_120 support is already available in PyTorch 2.7+.

PyTorch has included native support for sm_120 (Blackwell) in stable releases since version 2.7.0 (April 2025). The stable CUDA 12.8 wheels already contain the necessary Blackwell kernels, eliminating the need for nightly builds.

Replace with:

Suggested change
      - name: Install PyTorch with CUDA 12.8 (RTX 50-series / sm_120 support)
        run: |
-         pip install torch --index-url https://download.pytorch.org/whl/nightly/cu128 --force-reinstall --no-deps
-         pip install torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128 --force-reinstall --no-deps
+         pip install torch --index-url https://download.pytorch.org/whl/cu128 --force-reinstall --no-deps
+         pip install torchaudio --index-url https://download.pytorch.org/whl/cu128 --force-reinstall --no-deps

Using stable releases ensures reproducible builds and avoids unexpected breakage from nightly changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 194-197, the current CI step
named "Install PyTorch with CUDA 12.8 (nightly for RTX 50-series / sm_120
support)" installs nightly wheels; replace those pip invocations so they install
the stable CUDA 12.8 PyTorch/torchaudio packages (use the stable cu128 wheel
index or pin torch/torchaudio to >=2.7.0) instead of the nightly index-url —
update the two pip install lines accordingly and keep any necessary flags (e.g.,
--no-deps) to preserve reproducible installs.
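If pinning `torch>=2.7.0` is preferred over switching index URLs, the minimum-version guard the prompt mentions can also be checked at runtime with a small pure-Python helper (a sketch; the function name and threshold are illustrative, not part of this PR):

```python
def meets_min_torch(version: str, minimum: tuple[int, int, int] = (2, 7, 0)) -> bool:
    """Return True if a torch version string (e.g. '2.7.0+cu128') is at
    least `minimum`, ignoring the local build suffix after '+'."""
    core = version.split("+")[0]           # drop '+cu128' and similar
    parts = [int(p) for p in core.split(".")[:3]]
    parts += [0] * (3 - len(parts))        # pad '2.7' out to (2, 7, 0)
    return tuple(parts) >= minimum
```

At startup, `meets_min_torch(torch.__version__)` could gate the sm_120 warning alongside the arch-list check.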


- name: Verify CUDA support in torch
run: |
18 changes: 16 additions & 2 deletions backend/app.py
@@ -146,11 +146,18 @@ def _get_gpu_status() -> str:
"""Return a human-readable string describing GPU availability."""
backend_type = get_backend_type()
if torch.cuda.is_available():
from .backends.base import check_cuda_compatibility

device_name = torch.cuda.get_device_name(0)
compatible, _warning = check_cuda_compatibility()
is_rocm = hasattr(torch.version, "hip") and torch.version.hip is not None
if is_rocm:
return f"ROCm ({device_name})"
return f"CUDA ({device_name})"
label = f"ROCm ({device_name})"
else:
label = f"CUDA ({device_name})"
if not compatible:
label += " [UNSUPPORTED - see logs]"
return label
elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
return "MPS (Apple Silicon)"
elif backend_type == "mlx":
@@ -230,6 +237,13 @@ async def startup_event():
logger.info("Backend: %s", backend_type.upper())
logger.info("GPU: %s", _get_gpu_status())

# Warn if GPU architecture is not supported by this PyTorch build
from .backends.base import check_cuda_compatibility

_compatible, _cuda_warning = check_cuda_compatibility()
if not _compatible:
logger.warning("GPU COMPATIBILITY: %s", _cuda_warning)

from .services.cuda import check_and_update_cuda_binary

create_background_task(check_and_update_cuda_binary())
38 changes: 38 additions & 0 deletions backend/backends/base.py
@@ -126,6 +126,44 @@ def get_torch_device(
return "cpu"


def check_cuda_compatibility() -> tuple[bool, str | None]:
"""Check if the installed PyTorch supports the current GPU's compute capability.

Returns:
(compatible, warning_message) — compatible is True if OK or no CUDA GPU,
warning_message is a human-readable string if there's a problem.
"""
import torch

if not torch.cuda.is_available():
return True, None

major, minor = torch.cuda.get_device_capability(0)
capability = f"{major}.{minor}"
device_name = torch.cuda.get_device_name(0)
sm_tag = f"sm_{major}{minor}"

# torch.cuda._get_arch_list() returns the SM architectures this build
# was compiled for (e.g. ["sm_50", "sm_60", ..., "sm_90"]).
try:
arch_list = torch.cuda._get_arch_list()
if arch_list:
# Check for both sm_XX and compute_XX (JIT-compiled) entries
compute_tag = f"compute_{major}{minor}"
if sm_tag not in arch_list and compute_tag not in arch_list:
return False, (
f"{device_name} (compute capability {capability} / {sm_tag}) "
f"is not supported by this PyTorch build. "
f"Supported architectures: {', '.join(arch_list)}. "
f"Install PyTorch nightly (cu128) for newer GPU support: "
f"pip install torch --index-url https://download.pytorch.org/whl/nightly/cu128"
)
except AttributeError:
pass

return True, None


def empty_device_cache(device: str) -> None:
"""
Free cached memory on the given device (CUDA or XPU).
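The membership test at the heart of `check_cuda_compatibility` (a native `sm_XX` kernel, or a `compute_XX` PTX entry that the driver can JIT-compile) can be isolated as a pure helper, which makes the Blackwell case easy to unit-test. A minimal sketch; the function name is illustrative and not part of this PR:

```python
def arch_supported(arch_list: list[str], major: int, minor: int) -> bool:
    """True if a build's compiled arch list covers compute capability
    major.minor, either as a native sm_XX kernel or a compute_XX PTX target."""
    return f"sm_{major}{minor}" in arch_list or f"compute_{major}{minor}" in arch_list
```

For example, a build compiled only up to `sm_90` fails for capability 12.0 (RTX 50-series), while a cu128 build listing `sm_120` passes.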
3 changes: 2 additions & 1 deletion backend/build_binary.py
@@ -410,7 +410,8 @@ def build_server(cuda=False):
"torchvision",
"torchaudio",
"--index-url",
"https://download.pytorch.org/whl/cu128",
# Nightly for RTX 50-series (sm_120/Blackwell) support
"https://download.pytorch.org/whl/nightly/cu128",
"--force-reinstall",
"-q",
],
1 change: 1 addition & 0 deletions backend/models.py
@@ -182,6 +182,7 @@ class HealthResponse(BaseModel):
vram_used_mb: Optional[float] = None
backend_type: Optional[str] = None # Backend type (mlx or pytorch)
backend_variant: Optional[str] = None # Binary variant (cpu or cuda)
gpu_compatibility_warning: Optional[str] = None # Warning if GPU arch unsupported
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if frontend types include gpu_compatibility_warning
echo "=== Checking auto-generated types ==="
rg -n 'gpu_compatibility_warning' app/src/lib/api/ || echo "Field NOT found in auto-generated types"

echo ""
echo "=== Checking hand-written types.ts ==="
rg -n 'gpu_compatibility_warning' app/src/lib/api/types.ts || echo "Field NOT found in types.ts"

Repository: jamiepine/voicebox



🏁 Script executed:

# Check justfile for generation commands
if [ -f "justfile" ]; then
  echo "=== Justfile generation commands ==="
  rg -A 3 'generate.*api|api.*generate' justfile
else
  echo "No justfile found"
fi



🏁 Script executed:

# Check headers of auto-generated files to see if they indicate regeneration is needed
echo "=== Checking auto-generated file headers ==="
head -20 app/src/lib/api/models/HealthResponse.ts 2>/dev/null || echo "File not found"
echo ""
head -20 app/src/lib/api/schemas/\$HealthResponse.ts 2>/dev/null || echo "File not found"



🏁 Script executed:

# Check if frontend code tries to access gpu_compatibility_warning
echo "=== Checking frontend code that uses HealthResponse ==="
rg -n 'HealthResponse' app/src --type ts --type tsx -B 2 -A 5 | head -50



🏁 Script executed:

# Search for HealthResponse usage in frontend code
echo "=== HealthResponse usage in frontend ==="
rg -n 'HealthResponse' app/src --type ts -B 2 -A 5



🏁 Script executed:

# Check what the generate-api.sh script does
echo "=== Checking generate-api.sh script ==="
cat scripts/generate-api.sh



🏁 Script executed:

# Verify the backend model structure to understand the complete change
echo "=== Checking backend HealthResponse model ==="
rg -B 5 -A 10 'class HealthResponse|gpu_compatibility_warning' backend/models.py



Run API code generator to sync frontend types with backend changes.

The new gpu_compatibility_warning field in the backend's HealthResponse model is not reflected in the frontend TypeScript types:

  • app/src/lib/api/models/HealthResponse.ts (auto-generated, missing field)
  • app/src/lib/api/schemas/$HealthResponse.ts (auto-generated, missing field)
  • app/src/lib/api/types.ts (hand-written, missing field)

Run just generate-api to regenerate the auto-generated files from the OpenAPI schema. The hand-written types.ts interface will then need manual updating to include the new field.

Without this, the frontend cannot access the GPU compatibility warning to display it to users.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/models.py` at line 185, the backend added a new field
gpu_compatibility_warning to the HealthResponse model but the frontend types are
out of sync; run the API generator (just generate-api) to regenerate the
auto-generated TypeScript schemas so HealthResponse includes
gpu_compatibility_warning, then update the hand-maintained types.ts
HealthResponse interface to add the optional gpu_compatibility_warning: string |
null (or Optional<string>) so the frontend can access and display the new
warning field.



class DirectoryCheck(BaseModel):
7 changes: 7 additions & 0 deletions backend/routes/health.py
@@ -93,6 +93,12 @@ async def health():
except ImportError:
pass

gpu_compat_warning = None
if has_cuda:
from ..backends.base import check_cuda_compatibility

_compatible, gpu_compat_warning = check_cuda_compatibility()

gpu_available = has_cuda or has_mps or has_xpu or has_directml or backend_type == "mlx"

gpu_type = None
Expand Down Expand Up @@ -171,6 +177,7 @@ async def health():
"VOICEBOX_BACKEND_VARIANT",
"cuda" if torch.cuda.is_available() else ("xpu" if has_xpu else "cpu"),
),
gpu_compatibility_warning=gpu_compat_warning,
)


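Once the frontend types are regenerated, a client can fold the new field into its status display. A defensive sketch; only `gpu_compatibility_warning` comes from this PR, the function name and the rest of the payload shape are illustrative:

```python
def gpu_status_line(health: dict) -> str:
    """Build a one-line GPU status from a /health payload, appending the
    compatibility warning only when the backend reports one."""
    gpu = health.get("gpu_type") or "no GPU"
    warning = health.get("gpu_compatibility_warning")
    return f"{gpu} [WARNING: {warning}]" if warning else gpu
```

Because the field is `Optional`, callers should treat both a missing key and an explicit `None` as "no warning", as this helper does.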
4 changes: 2 additions & 2 deletions justfile
@@ -74,8 +74,8 @@ setup-python:
$hasNvidia = ($gpus | Where-Object { $_ -match 'NVIDIA' }).Count -gt 0
$hasIntelArc = ($gpus | Where-Object { $_ -match 'Arc' }).Count -gt 0
if ($hasNvidia) { \
Write-Host "NVIDIA GPU detected — installing PyTorch with CUDA support..."; \
& "{{ pip }}" install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128; \
Write-Host "NVIDIA GPU detected — installing PyTorch with CUDA support (nightly for RTX 50-series)..."; \
& "{{ pip }}" install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128; \
} elseif ($hasIntelArc) { \
Write-Host "Intel Arc GPU detected — installing PyTorch with XPU support..."; \
& "{{ pip }}" install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/xpu; \