
NVIDIA Anti-Trust Enforcement #12

Closed

openSourcerer9000 wants to merge 2 commits into audiohacking:main from openSourcerer9000:dontLetTheCudaBite

Conversation

@openSourcerer9000

No description provided.

@openSourcerer9000
Author

I got the same issue and set Qwen coder next loose on it for about an hour, and it seems to have fixed it. Someone should really take a look at the diffs though, it managed to break and subsequently fix the server along the way...
#10

@openSourcerer9000 changed the title from "fixed somehow" to "Cuda issue fix" on Mar 16, 2026
@openSourcerer9000 changed the title from "Cuda issue fix" to "NVIDIA Anti-Trust Enforcement" on Mar 16, 2026
@lmangani requested a review from Copilot on March 16, 2026 at 00:29

Copilot AI left a comment

Pull request overview

Improves backend runtime compatibility on non-CUDA environments (MPS/CPU) and enhances failure observability via traceback logging.

Changes:

  • Add a CUDA fallback monkey-patch for non-CUDA devices and disable SageAttention on MPS.
  • Improve device selection logging when falling back to CPU.
  • Switch several failure logs to include tracebacks (logger.exception).
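
The logging switch in the last bullet can be sketched minimally with the standard library (`run_job` and the RuntimeError below are illustrative stand-ins, not code from the PR):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ltx2")

def run_job() -> None:
    try:
        raise RuntimeError("boom")  # stand-in for a failing generation step
    except RuntimeError:
        # Unlike logger.error("%s", e), logger.exception logs at ERROR level
        # and appends the full traceback of the active exception.
        logger.exception("Job failed")

run_job()
```

This is why the review describes the change as "enhances failure observability": the log record now carries the stack trace, not just the exception message.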

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Reviewed files:

  • backend/ltx2_server.py: Adds CUDA fallback patching and disables SageAttention on MPS; logs CPU fallback.
  • backend/handlers/video_generation_handler.py: Adds traceback logging for i2v generation failures.
  • backend/handlers/queue_worker.py: Logs job failures with tracebacks.
  • backend/handlers/generation_handler.py: Logs generation failures with tracebacks during state transitions.


Comment thread backend/ltx2_server.py
Comment on lines +49 to +115
def _setup_cuda_fallback() -> None:
    """
    Monkey-patch torch.cuda functions to handle cases where PyTorch is not
    compiled with CUDA support (e.g., running on MPS or CPU).

    The ltx-pipelines library calls torch.cuda.synchronize() unconditionally,
    which fails with "Torch not compiled with CUDA enabled" on non-CUDA builds.
    """
    # Check if we're on a device that doesn't have full CUDA support
    device_type = DEVICE.type

    if device_type == "cuda":
        # True CUDA - no fallback needed
        return

    logger.info(f"Setup CUDA fallback for device type: {device_type}")

    # Create safe no-op implementations for CUDA functions
    def safe_cuda_synchronize() -> None:
        """No-op synchronize for non-CUDA devices."""
        if device_type == "mps":
            try:
                torch.mps.synchronize()
            except Exception:
                pass

    def safe_cuda_empty_cache() -> None:
        """No-op empty_cache for non-CUDA devices."""
        if device_type == "mps":
            try:
                torch.mps.empty_cache()
            except Exception:
                pass

    def safe_cuda_memory_reserved() -> int:
        """Return 0 for memory reserved on non-CUDA devices."""
        return 0

    def safe_cuda_memory_allocated() -> int:
        """Return 0 for memory allocated on non-CUDA devices."""
        return 0

    def safe_cuda_get_device_name(device: object = None) -> str:
        """Return device name for non-CUDA devices."""
        if device_type == "mps" and hasattr(torch, 'mps'):
            return "Apple Silicon MPS"
        return "CPU"

    def safe_cuda_get_device_capability(device: object = None) -> tuple[int, int]:
        """Return (0, 0) for non-CUDA devices."""
        return (0, 0)

    # Patch torch.cuda module
    if not hasattr(torch.cuda, "_ltx_original_synchronize"):
        # Store original functions if they exist
        try:
            torch.cuda._ltx_original_synchronize = torch.cuda.synchronize  # type: ignore[attr-defined]
        except AttributeError:
            pass

    # Replace with safe implementations
    torch.cuda.synchronize = safe_cuda_synchronize  # type: ignore[assignment]
    torch.cuda.empty_cache = safe_cuda_empty_cache  # type: ignore[assignment]
    torch.cuda.memory_reserved = safe_cuda_memory_reserved  # type: ignore[assignment]
    torch.cuda.memory_allocated = safe_cuda_memory_allocated  # type: ignore[assignment]
    torch.cuda.get_device_name = safe_cuda_get_device_name  # type: ignore[assignment]
    torch.cuda.get_device_capability = safe_cuda_get_device_capability  # type: ignore[assignment]
Comment thread backend/handlers/video_generation_handler.py

            logger.info("Generation cancelled by user")
            return GenerateVideoResponse(status="cancelled")

        logger.exception("[i2v] Generation failed with exception: %s", e)

Comment thread backend/ltx2_server.py
Comment on lines +45 to +48
# ============================================================
# CUDA Fallback Handling (for non-CUDA PyTorch builds on MPS/CPU)
# ============================================================
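
Detached from torch, the patching pattern in the review snippet can be illustrated with a dummy module (all names below are illustrative stand-ins, not the PR's code):

```python
import types

# Stand-in for a torch.cuda-like module on a non-CUDA build: synchronize()
# raises, mirroring "Torch not compiled with CUDA enabled".
fake_cuda = types.SimpleNamespace()

def _no_cuda() -> None:
    raise RuntimeError("Torch not compiled with CUDA enabled")

fake_cuda.synchronize = _no_cuda

def setup_fallback(device_type: str) -> None:
    """Swap failing CUDA calls for safe no-ops on non-CUDA devices."""
    if device_type == "cuda":
        return  # real CUDA build: leave the module untouched
    if not hasattr(fake_cuda, "_original_synchronize"):
        fake_cuda._original_synchronize = fake_cuda.synchronize  # keep original
    fake_cuda.synchronize = lambda: None  # safe no-op replacement

setup_fallback("cpu")
fake_cuda.synchronize()  # no longer raises on a CPU-only build
```

Keeping the original function under a private attribute, as the PR does with `_ltx_original_synchronize`, also makes the patch idempotent: re-running setup will not overwrite the saved original with the no-op.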

@lmangani

lmangani commented Mar 16, 2026

@openSourcerer9000 the _setup_cuda_fallback patch will definitely help but all the changes to logger.exception are causing test errors and are unnecessary. Could you kindly remove those from the PR and limit changes to the CUDA related patches?

@lmangani

@openSourcerer9000 closed and recreated in #14

@lmangani closed this on Mar 16, 2026