fix(transcribe): auto-fallback to CPU + int8 when CUDA is unavailable by pretyflaco · Pull Request #21 · pretyflaco/meetscribe

pretyflaco · 2026-05-14T14:10:00Z

Supersedes #19. Picks up @fadenb's commit (preserved with original authorship via cherry-pick) and adds the follow-ups from review.

Summary

- TranscriptionConfig.__post_init__ no longer raises ValueError when device='cuda' (or torch_device='cuda'/'mps') is requested but the accelerator is unavailable. It warns and auto-falls back to cpu, downgrading compute_type from float16 → int8 when the device flips (float16 is unsupported on CPU).
- The model-load log line annotates whether CPU was forced (--device cpu) or auto-selected because no GPU was found.
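For readers skimming without the diff open, here is a minimal sketch of the fallback described above. It is not the code in meet/transcribe.py: the helper names `_cuda_available` and `resolve_device`, the default field values, and the warning wording are all assumptions made for illustration, and the torch_device/'mps' path is left out.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger(__name__)


def _cuda_available() -> bool:
    """Hypothetical probe: True only if torch imports and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def resolve_device(device: str, compute_type: str, cuda_available: bool):
    """Return (device, compute_type, auto_fallback) after applying the fallback."""
    if device != "cuda" or cuda_available:
        return device, compute_type, False
    log.warning("CUDA requested but unavailable; falling back to CPU.")
    if compute_type == "float16":
        # Second warning: the compute_type change is no longer silent.
        log.warning("float16 is unsupported on CPU; using compute_type=int8.")
        compute_type = "int8"
    return "cpu", compute_type, True


@dataclass
class TranscriptionConfig:
    # Defaults are assumptions for this sketch, not taken from the project.
    device: str = "cuda"
    compute_type: str = "float16"
    _device_auto_fallback: bool = field(default=False, init=False, repr=False)

    def __post_init__(self) -> None:
        # Probe the accelerator only when CUDA was actually requested.
        available = self.device != "cuda" or _cuda_available()
        self.device, self.compute_type, self._device_auto_fallback = resolve_device(
            self.device, self.compute_type, available
        )
```

Keeping the decision in a pure function that takes `cuda_available` as an argument is a choice made for this sketch so the behavior can be exercised without a GPU or a torch install.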

Changes vs. #19

- Second warning when compute_type is downgraded (review ask). Previously the user only saw the device fallback message; the silent float16→int8 flip now emits its own log line.
- Internal _device_auto_fallback flag on TranscriptionConfig. _load_whisperx_asr_model reads it instead of re-sniffing torch at print time, so the "(forced)" vs "(fallback — no GPU)" annotation is honest when the user explicitly passes --device cpu on a no-GPU machine (previously mislabeled as "fallback").
- Unit tests (review ask). Five tests in TestTranscriptionConfig:
  - test_invalid_torch_device_cuda_raises: rewritten to assert fallback + int8 downgrade + flag set.
  - test_invalid_torch_device_mps_raises: rewritten to assert torch_device flips but device/compute_type are untouched.
  - test_cuda_unavailable_logs_both_warnings: uses caplog to verify both warning lines emit.
  - test_cuda_unavailable_with_int8_does_not_log_compute_type_change: guards against a spurious downgrade message when compute_type=int8 already.
  - test_explicit_cpu_is_not_marked_as_auto_fallback: guards the _device_auto_fallback semantics that drive the load-line annotation.
- Removed dead conditional `fallback = "cpu" if value == "cuda" else "cpu"` → `fallback = "cpu"`.
- CHANGELOG: new v0.7.1 entry crediting @fadenb.
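To show why the flag matters for the load-line annotation, the sketch below contrasts labeling from a stored flag with labeling by re-checking GPU availability at print time. Both function names and the label wording are hypothetical; the real annotation is produced inside _load_whisperx_asr_model and may be phrased differently.

```python
def device_label(device: str, auto_fallback: bool) -> str:
    """Label the device from the flag recorded when the config was built."""
    if device != "cpu":
        return device
    return "cpu (fallback, no GPU)" if auto_fallback else "cpu (forced)"


def device_label_by_resniffing(device: str, cuda_available: bool) -> str:
    """Roughly the earlier approach: infer the reason from GPU availability."""
    if device != "cpu":
        return device
    return "cpu (forced)" if cuda_available else "cpu (fallback, no GPU)"


# Explicit --device cpu on a machine without a GPU:
print(device_label("cpu", auto_fallback=False))
print(device_label_by_resniffing("cpu", cuda_available=False))
```

The first call reports the CPU as forced, which matches what the user asked for. The second reports a fallback even though none happened, which is the mislabeling the flag removes.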

Test plan

- pytest tests/test_transcribe.py -k TranscriptionConfig -v: 18/18 pass.
- ruff check meet/transcribe.py meet/cli.py tests/test_transcribe.py tests/test_utils.py: clean.
- Original PR's manual scenarios (no-GPU box → warns, falls back, transcribes with int8): preserved.
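The two warning-related tests can be approximated as below. This is a standalone sketch, not the contents of tests/test_transcribe.py: it captures log records with a standard-library handler instead of pytest's caplog fixture so it runs without pytest, and `resolve_device`, the logger name, and the message wording are hypothetical stand-ins for the real fallback logic.

```python
import logging
from contextlib import contextmanager

log = logging.getLogger("meet.transcribe.sketch")  # hypothetical logger name
log.setLevel(logging.WARNING)


def resolve_device(device, compute_type, cuda_available):
    """Hypothetical stand-in for the fallback logic under test."""
    if device == "cuda" and not cuda_available:
        log.warning("CUDA unavailable; falling back to CPU")
        device = "cpu"
        if compute_type == "float16":
            log.warning("compute_type float16 -> int8 (float16 unsupported on CPU)")
            compute_type = "int8"
    return device, compute_type


class _ListHandler(logging.Handler):
    """Collects formatted warning messages in a list."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


@contextmanager
def captured_warnings():
    """Attach a collecting handler for the duration of the block."""
    handler = _ListHandler()
    log.addHandler(handler)
    try:
        yield handler.messages
    finally:
        log.removeHandler(handler)


def test_cuda_unavailable_logs_both_warnings():
    with captured_warnings() as messages:
        result = resolve_device("cuda", "float16", cuda_available=False)
    assert result == ("cpu", "int8")
    assert len(messages) == 2
    assert "int8" in messages[1]


def test_cuda_unavailable_with_int8_does_not_log_compute_type_change():
    with captured_warnings() as messages:
        result = resolve_device("cuda", "int8", cuda_available=False)
    assert result == ("cpu", "int8")
    assert len(messages) == 1
```

Under pytest, the `captured_warnings()` helper would be replaced by the caplog fixture and the assertions would read from `caplog.records`.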

Closes #19.

Instead of raising ValueError when the requested CUDA device is not present, automatically fall back to CPU and downgrade compute_type from float16 to int8 (float16 is unsupported on CPU). Also indicate whether CPU is forced or a fallback in the model-loading print message.

@fadenb

Follow-up to #19 (cherry-picked) addressing review feedback:
- __post_init__ now emits a second warning when compute_type is flipped from float16 to int8 because the device fell back to CPU. Previously the user only saw the device fallback message; the compute_type change was silent.
- TranscriptionConfig gains an internal _device_auto_fallback flag set when device is auto-flipped to cpu. _load_whisperx_asr_model reads the flag instead of re-sniffing torch at print time, so the "(forced)" vs "(fallback — no GPU)" annotation is accurate even when the user explicitly passes --device cpu on a no-GPU machine.
- Removed dead conditional `fallback = "cpu" if value == "cuda" else "cpu"`.
- tests/test_transcribe.py: rewrote the two raise-expecting tests (test_invalid_torch_device_{cuda,mps}_raises) to assert the new fallback behavior, and added three tests covering the compute_type warning, the no-spurious-warning case when compute_type is already int8, and that explicit --device cpu does not set _device_auto_fallback.
- CHANGELOG: v0.7.1 entry crediting @fadenb.

fadenb and others added 2 commits May 14, 2026 16:54

pretyflaco mentioned this pull request May 14, 2026

fix(transcribe): auto-fallback to CPU + int8 when CUDA is unavailable #19

Closed

3 tasks

pretyflaco merged commit 3e2ee19 into main May 14, 2026
1 check passed

pretyflaco deleted the fix/transcribe-cuda-fallback branch May 14, 2026 15:12

pretyflaco merged 2 commits into main from fix/transcribe-cuda-fallback