
fix(transcribe): auto-fallback to CPU + int8 when CUDA is unavailable #21

Merged: pretyflaco merged 2 commits into main from fix/transcribe-cuda-fallback on May 14, 2026


@pretyflaco
Owner

Supersedes #19. Picks up @fadenb's commit (preserved with original authorship via cherry-pick) and adds the follow-ups from review.

Summary

  • TranscriptionConfig.__post_init__ no longer raises ValueError when device='cuda' (or torch_device='cuda'/'mps') is requested but the accelerator is unavailable. It warns and automatically falls back to cpu, downgrading compute_type from float16 to int8 when the device flips (float16 is unsupported on CPU).
  • The model-load log line annotates whether CPU was forced (--device cpu) or auto-selected because no GPU was found.
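
The fallback summarized above could be sketched roughly as follows. This is a minimal illustration, not the code in meet/transcribe.py: the class name `TranscriptionConfigSketch`, the helpers `cuda_available` and `resolve_device`, the logger name, the field defaults, and the warning wording are all assumptions. Only the field names, the `_device_auto_fallback` flag, and the float16 to int8 downgrade come from this PR, and the `torch_device` handling is omitted.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass, field

logger = logging.getLogger("transcribe_sketch")  # hypothetical logger name


def cuda_available() -> bool:
    """Hypothetical probe: True only if torch imports and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def resolve_device(device: str, compute_type: str, has_cuda: bool) -> tuple[str, str, bool]:
    """Return (device, compute_type, auto_fallback) after applying the fallback rules."""
    if device != "cuda" or has_cuda:
        # Nothing to do: either CUDA was not requested or it is actually present.
        return device, compute_type, False
    logger.warning("device='cuda' requested but CUDA is unavailable; falling back to 'cpu'")
    if compute_type == "float16":
        # float16 is unsupported on CPU, so the compute type follows the device.
        logger.warning("compute_type 'float16' is unsupported on cpu; downgrading to 'int8'")
        compute_type = "int8"
    return "cpu", compute_type, True


@dataclass
class TranscriptionConfigSketch:
    device: str = "cuda"
    compute_type: str = "float16"
    # Set only when the device was flipped automatically, never for an explicit cpu request.
    _device_auto_fallback: bool = field(default=False, init=False, repr=False)

    def __post_init__(self) -> None:
        self.device, self.compute_type, self._device_auto_fallback = resolve_device(
            self.device, self.compute_type, cuda_available()
        )
```

Keeping the decision in a pure function that takes `has_cuda` as an argument is a design choice of this sketch; it lets the fallback rules be exercised without torch installed.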

Changes vs. #19

  1. Second warning when compute_type is downgraded (review ask). Previously the user only saw the device fallback message; the formerly silent float16 to int8 flip now emits its own log line.
  2. Internal _device_auto_fallback flag on TranscriptionConfig. _load_whisperx_asr_model reads it instead of re-sniffing torch at print time, so the (forced) vs (fallback — no GPU) annotation is honest when the user explicitly passes --device cpu on a no-GPU machine (previously mislabeled as "fallback").
  3. Unit tests (review ask). Five tests in TestTranscriptionConfig:
    • test_invalid_torch_device_cuda_raises — rewritten to assert fallback + int8 downgrade + flag set.
    • test_invalid_torch_device_mps_raises — rewritten to assert torch_device flips but device/compute_type are untouched.
    • test_cuda_unavailable_logs_both_warnings — uses caplog to verify both warning lines emit.
    • test_cuda_unavailable_with_int8_does_not_log_compute_type_change — guards against spurious downgrade message when compute_type=int8 already.
    • test_explicit_cpu_is_not_marked_as_auto_fallback — guards the _device_auto_fallback semantics that drive the load-line annotation.
  4. Removed dead conditional: fallback = "cpu" if value == "cuda" else "cpu" is now simply fallback = "cpu".
  5. CHANGELOG: new v0.7.1 entry crediting @fadenb.
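
To illustrate change 2, here is a small sketch of deriving the load-line annotation from a stored flag rather than probing torch again at print time. The function name `describe_device` and the abbreviated labels are hypothetical; the real annotation strings and the surrounding log format live in `_load_whisperx_asr_model` and are not reproduced here.

```python
def describe_device(device: str, auto_fallback: bool) -> str:
    """Build the device part of a model-load log line (hypothetical format)."""
    if device != "cpu":
        # No annotation needed when an accelerator is in use.
        return device
    # The flag records why we ended up on CPU; re-checking for a GPU here
    # could not tell an explicit cpu request apart from an automatic fallback.
    reason = "fallback" if auto_fallback else "forced"
    return f"cpu ({reason})"


# An explicit cpu request on a machine without a GPU stays labeled as forced.
print(describe_device("cpu", auto_fallback=False))
print(describe_device("cpu", auto_fallback=True))
```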

Test plan

  • pytest tests/test_transcribe.py -k TranscriptionConfig -v — 18/18 pass.
  • ruff check meet/transcribe.py meet/cli.py tests/test_transcribe.py tests/test_utils.py — clean.
  • Original PR's manual scenarios (no-GPU box → warns, falls back, transcribes with int8) — preserved.
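
For readers who want to see the shape of the two warning tests, the sketch below checks both cases with the standard library's `assertLogs`. The PR's own tests use pytest's `caplog`; `unittest` is swapped in here only so the example runs without pytest. `fall_back_to_cpu`, the logger name, and the warning wording are stand-ins, not the project's code.

```python
import logging
import unittest

logger = logging.getLogger("fallback_test_sketch")  # hypothetical logger name


def fall_back_to_cpu(compute_type: str) -> str:
    """Stand-in for the fallback path: warn, and downgrade float16 to int8."""
    logger.warning("CUDA unavailable; falling back to cpu")
    if compute_type == "float16":
        logger.warning("compute_type float16 is unsupported on cpu; using int8")
        return "int8"
    return compute_type


class FallbackWarningSketch(unittest.TestCase):
    def test_float16_logs_both_warnings(self):
        with self.assertLogs(logger, level="WARNING") as captured:
            self.assertEqual(fall_back_to_cpu("float16"), "int8")
        # One warning for the device flip, one for the compute_type downgrade.
        self.assertEqual(len(captured.records), 2)

    def test_int8_logs_only_the_device_warning(self):
        with self.assertLogs(logger, level="WARNING") as captured:
            self.assertEqual(fall_back_to_cpu("int8"), "int8")
        # No downgrade happened, so no compute_type message should appear.
        self.assertEqual(len(captured.records), 1)
        self.assertNotIn("compute_type", captured.output[0])
```

Saved to a file, this can be run with `python -m unittest <file>`.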

Closes #19.

fadenb and others added 2 commits May 14, 2026 16:54
Instead of raising ValueError when the requested CUDA device is not
present, automatically fall back to CPU and downgrade compute_type from
float16 to int8 (float16 is unsupported on CPU).  Also indicate whether
CPU is forced or a fallback in the model-loading print message.

Follow-up to #19 (cherry-picked) addressing review feedback:

- __post_init__ now emits a second warning when compute_type is flipped
  from float16 to int8 because the device fell back to CPU.  Previously
  the user only saw the device fallback message; the compute_type change
  was silent.
- TranscriptionConfig gains an internal _device_auto_fallback flag set
  when device is auto-flipped to cpu.  _load_whisperx_asr_model reads
  the flag instead of re-sniffing torch at print time, so the
  "(forced)" vs "(fallback — no GPU)" annotation is accurate even when
  the user explicitly passes --device cpu on a no-GPU machine.
- Removed dead conditional `fallback = "cpu" if value == "cuda" else "cpu"`.
- tests/test_transcribe.py: rewrote the two raise-expecting tests
  (test_invalid_torch_device_{cuda,mps}_raises) to assert the new
  fallback behavior, and added three tests covering the compute_type
  warning, the no-spurious-warning case when compute_type is already
  int8, and that explicit --device cpu does not set _device_auto_fallback.
- CHANGELOG: v0.7.1 entry crediting @fadenb.
@pretyflaco pretyflaco merged commit 3e2ee19 into main May 14, 2026
1 check passed
@pretyflaco pretyflaco deleted the fix/transcribe-cuda-fallback branch May 14, 2026 15:12

2 participants