fix: default whisper model to QuantizedLargeV3Turbo #706
data-bot-coasys wants to merge 8 commits into `dev` from
Conversation
- Change default WHISPER_MODEL from Small to QuantizedLargeV3Turbo (kalosm's own default — multilingual, 454MB quantized GGUF)
- Update launcher setup to use whisper_large_v3_turbo_quantized across all AI modes (Local, Remote, None)
- Fix incorrect HuggingFace link and model name in Local AI setup (was pointing to whisper-large-v3-turbo but labeled as distil)

The previous defaults (Small for Remote/None, DistilLargeV3 for Local) used unquantized safetensors models that cause CUDA OOM on consumer GPUs. The quantized v3 turbo model provides better quality at a fraction of the memory footprint.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Whisper model defaults switched to the quantized large v3 turbo variant in backend and UI; backend transcription timing values shortened; UI constants, links, labels, and default/local selections updated to reference quantized large and tiny quantized Whisper artifacts.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ui/src/components/Login.tsx (1)
647-651: Consider deduplicating the Whisper URL/label literals

The same URL and label are repeated in three UI blocks. A shared constant will reduce future drift and make model/link updates safer.
♻️ Suggested refactor
```diff
+const WHISPER_TURBO_URL =
+  "https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo";
+const WHISPER_TURBO_LABEL = "Whisper large v3 turbo quantized (454MB)";

-  open("https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo")
+  open(WHISPER_TURBO_URL)

-  >Whisper large v3 turbo quantized (454MB)</a>
+  >{WHISPER_TURBO_LABEL}</a>
```

Also applies to: 821-825, 863-867
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/components/Login.tsx` around lines 647 - 651, The repeated Whisper model URL and label around the JSX anchor ("Whisper large v3 turbo quantized (454MB)" linking to "https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo") should be extracted to a shared constant; add a constant (e.g., WHISPER_MODEL = { url: "...", label: "Whisper large v3 turbo quantized (454MB)" } or two constants WHISPER_URL and WHISPER_LABEL) near the top of the Login.tsx component or module and replace each repeated anchor href and inner text in the Login component JSX with those constants (references: the anchor element showing the label and the href string in Login.tsx). Ensure all three occurrences (around lines shown, and also the ones at the other noted blocks) reference the same constant so future updates are done in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rust-executor/src/ai_service/mod.rs`:
- Line 35: The WHISPER_MODEL constant is updated but
AIService::get_whisper_model_size() still hard-codes a fallback to
WhisperSource::Tiny; update get_whisper_model_size() to use the WHISPER_MODEL
constant as the default/fallback (or derive its fallback from WHISPER_MODEL) so
the service actually falls back to the configured QuantizedLargeV3Turbo value
instead of Tiny; locate AIService::get_whisper_model_size() and replace the
hard-coded WhisperSource::Tiny return/fallback with a reference to the
WHISPER_MODEL static (or call a helper that resolves the configured default) so
runtime behavior matches the constant.
Add Whisper tiny quantized (42MB) to the download list shown in all three AI modes (Local, Remote, None) so users know both models will be downloaded. Both are needed by Flux: the main model for final transcription and tiny for fast word-by-word preview.
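The two-artifact setup described above can be sketched as data, so the same list can drive all three mode panels. A minimal sketch; file names and sizes are taken from this PR, while the constant and function names are hypothetical:

```typescript
// Hypothetical constant: the two Whisper artifacts Flux downloads in every AI mode.
// File names and sizes come from this PR; roles from the comment above.
const WHISPER_DOWNLOADS = [
  {
    file: "whisper_large_v3_turbo_quantized",
    size: "454MB",
    role: "final transcription",
  },
  {
    file: "whisper_tiny_quantized",
    size: "42MB",
    role: "fast word-by-word preview",
  },
] as const;

// Labels for the download list shown in Local, Remote, and None modes.
function downloadLabels(): string[] {
  return WHISPER_DOWNLOADS.map((m) => `${m.file} (${m.size}): ${m.role}`);
}
```

Keeping both entries in one array means a future model swap touches a single place instead of three UI blocks.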
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ui/src/components/Login.tsx (1)
196-206: ⚠️ Potential issue | 🔴 Critical

Add explicit default for TRANSCRIPTION model to match LLM pattern.

The two Whisper TRANSCRIPTION models are added without an explicit `setDefaultModel("TRANSCRIPTION", ...)` call. Line 187 shows LLM models get this treatment, but TRANSCRIPTION does not. The backend's `set_default_model()` (ai_service/mod.rs:354) only handles the LLM type, and while the `WHISPER_MODEL` static constant ensures `QuantizedLargeV3Turbo` is used, explicit default pinning is missing. Call `setDefaultModel("TRANSCRIPTION", modelId)` after adding the first Whisper model to make the default deterministic and consistent with the LLM pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/components/Login.tsx` around lines 196 - 206, Add an explicit default TRANSCRIPTION model after registering the primary Whisper model: after the client!.ai.addModel call that creates the first Whisper model (name "Whisper", local { fileName: whisperModel }, modelType "TRANSCRIPTION"), call client!.ai.setDefaultModel("TRANSCRIPTION", <that model's id>) so the TRANSCRIPTION default is pinned deterministically (mirror the LLM pattern that uses setDefaultModel). Ensure you use the returned/new model's id (or capture the id from addModel result) when calling setDefaultModel("TRANSCRIPTION", modelId).
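The pinning pattern the comment asks for can be sketched against a stand-in client interface. This is a sketch under stated assumptions: the real ad4m client API, method signatures, and return types may differ (in particular, whether `addModel` resolves to the new model's id).

```typescript
// Stand-in types; the real ad4m client API may differ.
interface ModelInput {
  name: string;
  modelType: string;
  local?: { fileName: string };
}

interface AiClient {
  addModel(model: ModelInput): Promise<string>; // assumed to resolve to the new model's id
  setDefaultModel(modelType: string, modelId: string): Promise<void>;
}

// Register the primary Whisper model and pin it as the TRANSCRIPTION default,
// mirroring the setDefaultModel pattern already used for LLM models.
async function registerWhisperDefault(
  ai: AiClient,
  whisperModel: string,
): Promise<string> {
  const id = await ai.addModel({
    name: "Whisper",
    modelType: "TRANSCRIPTION",
    local: { fileName: whisperModel },
  });
  await ai.setDefaultModel("TRANSCRIPTION", id);
  return id;
}
```

Capturing the id from the `addModel` result and passing it straight to `setDefaultModel` is what makes the default deterministic regardless of how many models are later added.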
🧹 Nitpick comments (2)
ui/src/components/Login.tsx (2)
168-174: Remove redundant `whisperModel` assignment in Local mode.

`whisperModel` is already initialized on Line 168, so Line 173 is unnecessary noise.

♻️ Suggested cleanup

```diff
 async function saveModels() {
   let whisperModel = "whisper_large_v3_turbo_quantized";
   // add llm model
   if (aiMode !== "None") {
     const llm = { modelType: "LLM" } as ModelInput;
     if (aiMode === "Local") {
-      whisperModel = "whisper_large_v3_turbo_quantized";
       llm.name = "Qwen2.5.1-Coder-7B-Instruct";
       llm.local = { fileName: "Qwen2.5.1-Coder-7B-Instruct" };
     } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/components/Login.tsx` around lines 168 - 174, Remove the redundant reassignment of whisperModel inside the aiMode === "Local" branch: whisperModel is already set to "whisper_large_v3_turbo_quantized" when initialized, so delete the duplicate assignment in the block that sets llm (the code manipulating whisperModel, aiMode, ModelInput, and llm.name in the Login.tsx component). Keep the llm creation and llm.name assignment for Local mode, but omit the unnecessary whisperModel = "whisper_large_v3_turbo_quantized" line.
628-677: Extract duplicated download-list markup into a shared renderer.

The same model list structure is repeated for the Local/Remote/None blocks. This is likely to drift again on future model/size/link updates.
Also applies to: 826-862, 879-915
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/components/Login.tsx` around lines 628 - 677, Extract the repeated model list markup into a reusable React renderer (e.g., a new ModelDownloadList component) and replace the duplicated blocks in Login.tsx (the j-text anchor/p paragraphs used in Local/Remote/None sections) with that component; the ModelDownloadList should accept an items array (each item containing url, label/displayName and size) and render the same <a onClick={() => open(url)} style={{ cursor: "pointer" }}>…</a> structure so styling and behavior (open call) remain identical across all three places, then update the three occurrences to pass their respective model arrays to ModelDownloadList.
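One way to sketch the data-driven shape the prompt describes is to separate the items from the markup. The sketch below reduces the render step to plain descriptors instead of JSX so it stays framework-free; the item shape and function names are hypothetical, and the real component would emit the `<a onClick={() => open(url)}>` anchors described above.

```typescript
// Hypothetical item shape for the shared download list.
interface DownloadItem {
  url: string;
  label: string;
}

// Each mode panel passes its own items; the renderer owns the markup details.
// Reduced to plain descriptors here; the real component would return JSX anchors.
function buildDownloadList(items: DownloadItem[]) {
  return items.map((item) => ({
    tag: "a",
    text: item.label,
    onClick: () => item.url, // in the real component: () => open(item.url)
    style: { cursor: "pointer" },
  }));
}

const whisperItems: DownloadItem[] = [
  {
    url: "https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo",
    label: "Whisper large v3 turbo quantized (454MB)",
  },
];
```

With this split, the three mode panels differ only in the `items` array they pass, so a model or link update happens in one place.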
The fallback path was still hard-coded to `WhisperSource::Tiny`, bypassing the `WHISPER_MODEL` constant. Now uses the configured default consistently.
The launcher UI already sets WHISPER_MODEL as the default during setup. This fallback only triggers when no model is persisted in the DB, which happens in CI/test environments that skip launcher setup. Using Tiny here avoids downloading a 454MB model in CI.
…iterals

- `get_whisper_model_size()` now falls back to `WHISPER_MODEL` (`QuantizedLargeV3Turbo`) instead of hard-coded `WhisperSource::Tiny`
- Extract repeated Whisper URL and label into constants in Login.tsx

Addresses CodeRabbit review comments.
🧹 Nitpick comments (1)
ui/src/components/Login.tsx (1)
172-177: Redundant assignment: `whisperModel` is set to the same value twice.

Line 172 initializes `whisperModel` to `"whisper_large_v3_turbo_quantized"`, and line 177 reassigns it to the identical value inside the `Local` branch. This assignment is now a no-op and can be removed.

♻️ Suggested cleanup

```diff
 async function saveModels() {
   let whisperModel = "whisper_large_v3_turbo_quantized";
   // add llm model
   if (aiMode !== "None") {
     const llm = { modelType: "LLM" } as ModelInput;
     if (aiMode === "Local") {
-      whisperModel = "whisper_large_v3_turbo_quantized";
       llm.name = "Qwen2.5.1-Coder-7B-Instruct";
       llm.local = { fileName: "Qwen2.5.1-Coder-7B-Instruct" };
     } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/src/components/Login.tsx` around lines 172 - 177, The variable whisperModel is initialized to "whisper_large_v3_turbo_quantized" and then redundantly reassigned to the same value inside the aiMode === "Local" branch; remove the duplicate assignment in the Local branch (keep the initial let whisperModel = "whisper_large_v3_turbo_quantized";) and ensure any logic that depends on different whisperModel values remains correct — locate usages in the Login component around the aiMode check and ModelInput creation (llm, whisperModel) to edit the Local branch only.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3576db04-0edd-4a43-b16d-2969c3059935
📒 Files selected for processing (2)
- rust-executor/src/ai_service/mod.rs
- ui/src/components/Login.tsx
Addresses CodeRabbit review comment: Add explicit default for TRANSCRIPTION model to match LLM pattern. The first Whisper model is now set as the default TRANSCRIPTION model after it's added.
Summary
Changes the default Whisper transcription model from `Small` (unquantized, 500MB) to `QuantizedLargeV3Turbo` (quantized GGUF, 454MB) — kalosm's own recommended default.

Problem

- `whisper-large-v2` (6.2GB f32 safetensors) causes `CUDA_ERROR_OUT_OF_MEMORY` on consumer GPUs and tensor config errors on some setups
- `whisper-small` works but is lower quality than what's available (`whisper-large-v3-turbo`)

Changes

Rust executor

- `WHISPER_MODEL` → `WhisperSource::QuantizedLargeV3Turbo`

Launcher UI (Login.tsx)

- `whisper_distil_large_v3` → `whisper_large_v3_turbo_quantized`
- `whisper_small` → `whisper_large_v3_turbo_quantized`
- `whisper_tiny_quantized` already installed in all modes ✅

Testing (CUDA, RTX 2070 SUPER 8GB)

Models tested: `QuantizedLargeV3Turbo`, `DistilLargeV3`, `Small`, `LargeV2`

Summary by CodeRabbit
New Features
Bug Fixes / Behavior
Documentation