
fix: default whisper model to QuantizedLargeV3Turbo #706

Open
data-bot-coasys wants to merge 8 commits into dev from fix/whisper-default-model

Conversation


@data-bot-coasys data-bot-coasys commented Mar 3, 2026

Summary

Changes the default Whisper transcription model from Small (unquantized, 500MB) to QuantizedLargeV3Turbo (quantized GGUF, 454MB) — kalosm's own recommended default.

Problem

  • whisper-large-v2 (6.2GB f32 safetensors) causes CUDA_ERROR_OUT_OF_MEMORY on consumer GPUs and tensor config errors on some setups
  • whisper-small works but is lower quality than what's available
  • Launcher UI had mismatched labels (showed "distill large v3" but linked to whisper-large-v3-turbo)

Changes

Rust executor

  • Default WHISPER_MODEL → WhisperSource::QuantizedLargeV3Turbo
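The bullet above amounts to a one-line constant swap. A minimal sketch of the pattern, using a local stand-in enum since the real `WhisperSource` comes from the `kalosm` crate:

```rust
// Local stand-in for kalosm's WhisperSource enum (only the variants
// relevant to this PR); the real type lives in the kalosm crate.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum WhisperSource {
    Small,
    QuantizedLargeV3Turbo,
}

// Before: const WHISPER_MODEL: WhisperSource = WhisperSource::Small;
const WHISPER_MODEL: WhisperSource = WhisperSource::QuantizedLargeV3Turbo;

fn main() {
    assert_eq!(WHISPER_MODEL, WhisperSource::QuantizedLargeV3Turbo);
    println!("default: {:?}", WHISPER_MODEL);
}
```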

Launcher UI (Login.tsx)

  • Local AI mode: whisper_distil_large_v3 → whisper_large_v3_turbo_quantized
  • Remote API / None modes: whisper_small → whisper_large_v3_turbo_quantized
  • Fixed HuggingFace links and model size labels
  • whisper_tiny_quantized already installed in all modes ✅

Testing (CUDA, RTX 2070 SUPER 8GB)

| Model | Size | CUDA |
|---|---|---|
| QuantizedLargeV3Turbo | 454MB | ✅ |
| DistilLargeV3 | 1.5GB | ✅ |
| Small | ~500MB | ✅ |
| LargeV2 | 6.2GB | ❌ OOM |

Summary by CodeRabbit

  • New Features

    • Default speech recognition switched to the quantized large v3 turbo model; local selection now uses the same quantized model.
    • Added a tiny quantized model option (42MB) and surfaced model links/labels in the UI.
  • Bug Fixes / Behavior

    • Transcription timing shortened (checks and timeout reduced).
    • Default model resolution now respects the configured model when no persisted choice exists.
  • Documentation

    • Updated in-app model names, sizes and download links shown throughout the UI.

- Change default WHISPER_MODEL from Small to QuantizedLargeV3Turbo
  (kalosm's own default — multilingual, 454MB quantized GGUF)
- Update launcher setup to use whisper_large_v3_turbo_quantized
  across all AI modes (Local, Remote, None)
- Fix incorrect HuggingFace link and model name in Local AI setup
  (was pointing to whisper-large-v3-turbo but labeled as distil)

The previous defaults (Small for Remote/None, DistilLargeV3 for Local)
used unquantized safetensors models that cause CUDA OOM on consumer
GPUs. The quantized v3 turbo model provides better quality at a
fraction of the memory footprint.
@coderabbitai

coderabbitai bot commented Mar 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Whisper model defaults switched to the quantized large v3 turbo variant in backend and UI; backend transcription timing values shortened; UI constants, links, labels, and default/local selections updated to reference quantized large and tiny quantized Whisper artifacts.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Backend: Whisper model & transcription timing<br>`rust-executor/src/ai_service/mod.rs` | Changed WHISPER_MODEL from Small to QuantizedLargeV3Turbo; reduced TRANSCRIPTION_TIMEOUT_SECS from 120 to 30 and TRANSCRIPTION_CHECK_INTERVAL_SECS from 10 to 5; get_whisper_model_size now defaults to the configured WHISPER_MODEL when no persisted model exists (was previously defaulting to Tiny). |
| Frontend: UI defaults, labels, and links<br>`ui/src/components/Login.tsx` | Added WHISPER_TURBO_URL and WHISPER_TURBO_LABEL; changed the saved/default whisper model and the Local selection to whisper_large_v3_turbo_quantized; updated UI copy, links, and labels to reference quantized large v3 turbo and a ~42MB tiny quantized variant; added UI blocks surfacing the new model link/label. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble bytes beneath the moonlit log,

A turbo whisper snug in my code-hollowed bog,
Tiny crumbs for speed, large crumbs for lore,
I hop, I patch, I listen — then I hop some more.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed (check skipped: CodeRabbit's high-level summary is enabled).
  • Title Check: ✅ Passed. The title accurately summarizes the main change: updating the default Whisper model to QuantizedLargeV3Turbo. Both file changes, across the Rust executor and UI components, align with this core objective.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
ui/src/components/Login.tsx (1)

647-651: Consider deduplicating the Whisper URL/label literals

The same URL and label are repeated in three UI blocks. A shared constant will reduce future drift and make model/link updates safer.

♻️ Suggested refactor
+const WHISPER_TURBO_URL =
+  "https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo";
+const WHISPER_TURBO_LABEL = "Whisper large v3 turbo quantized (454MB)";
- open("https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo")
+ open(WHISPER_TURBO_URL)

- >Whisper large v3 turbo quantized (454MB)</a>
+ >{WHISPER_TURBO_LABEL}</a>

Also applies to: 821-825, 863-867

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 647 - 651, The repeated Whisper
model URL and label around the JSX anchor ("Whisper large v3 turbo quantized
(454MB)" linking to
"https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo")
should be extracted to a shared constant; add a constant (e.g., WHISPER_MODEL =
{ url: "...", label: "Whisper large v3 turbo quantized (454MB)" } or two
constants WHISPER_URL and WHISPER_LABEL) near the top of the Login.tsx component
or module and replace each repeated anchor href and inner text in the Login
component JSX with those constants (references: the anchor element showing the
label and the href string in Login.tsx). Ensure all three occurrences (around
lines shown, and also the ones at the other noted blocks) reference the same
constant so future updates are done in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust-executor/src/ai_service/mod.rs`:
- Line 35: The WHISPER_MODEL constant is updated but
AIService::get_whisper_model_size() still hard-codes a fallback to
WhisperSource::Tiny; update get_whisper_model_size() to use the WHISPER_MODEL
constant as the default/fallback (or derive its fallback from WHISPER_MODEL) so
the service actually falls back to the configured QuantizedLargeV3Turbo value
instead of Tiny; locate AIService::get_whisper_model_size() and replace the
hard-coded WhisperSource::Tiny return/fallback with a reference to the
WHISPER_MODEL static (or call a helper that resolves the configured default) so
runtime behavior matches the constant.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec3931e and 227262a.

📒 Files selected for processing (2)
  • rust-executor/src/ai_service/mod.rs
  • ui/src/components/Login.tsx

lucksus and others added 2 commits March 3, 2026 22:03
Add Whisper tiny quantized (42MB) to the download list shown
in all three AI modes (Local, Remote, None) so users know both
models will be downloaded. Both are needed by Flux: the main
model for final transcription and tiny for fast word-by-word
preview.

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ui/src/components/Login.tsx (1)

196-206: ⚠️ Potential issue | 🔴 Critical

Add explicit default for TRANSCRIPTION model to match LLM pattern.

The two Whisper TRANSCRIPTION models are added without an explicit setDefaultModel("TRANSCRIPTION", ...) call. Line 187 shows LLM models get this treatment, but TRANSCRIPTION does not. The backend's set_default_model() (ai_service/mod.rs:354) only handles LLM type, and while WHISPER_MODEL static constant ensures QuantizedLargeV3Turbo is used, explicit default pinning is missing. Call setDefaultModel("TRANSCRIPTION", modelId) after adding the first Whisper model to make the default deterministic and consistent with the LLM pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 196 - 206, Add an explicit default
TRANSCRIPTION model after registering the primary Whisper model: after the
client!.ai.addModel call that creates the first Whisper model (name "Whisper",
local { fileName: whisperModel }, modelType "TRANSCRIPTION"), call
client!.ai.setDefaultModel("TRANSCRIPTION", <that model's id>) so the
TRANSCRIPTION default is pinned deterministically (mirror the LLM pattern that
uses setDefaultModel). Ensure you use the returned/new model's id (or capture
the id from addModel result) when calling setDefaultModel("TRANSCRIPTION",
modelId).
🧹 Nitpick comments (2)
ui/src/components/Login.tsx (2)

168-174: Remove redundant whisperModel assignment in Local mode.

whisperModel is already initialized on Line 168, so Line 173 is unnecessary noise.

Suggested cleanup
 async function saveModels() {
   let whisperModel = "whisper_large_v3_turbo_quantized";
   // add llm model
   if (aiMode !== "None") {
     const llm = { modelType: "LLM" } as ModelInput;
     if (aiMode === "Local") {
-      whisperModel = "whisper_large_v3_turbo_quantized";
       llm.name = "Qwen2.5.1-Coder-7B-Instruct";
       llm.local = { fileName: "Qwen2.5.1-Coder-7B-Instruct" };
     } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 168 - 174, Remove the redundant
reassignment of whisperModel inside the aiMode === "Local" branch: whisperModel
is already set to "whisper_large_v3_turbo_quantized" when initialized, so delete
the duplicate assignment in the block that sets llm (the code manipulating
whisperModel, aiMode, ModelInput, and llm.name in the Login.tsx component). Keep
the llm creation and llm.name assignment for Local mode, but omit the
unnecessary whisperModel = "whisper_large_v3_turbo_quantized" line.

628-677: Extract duplicated download-list markup into a shared renderer.

The same model list structure is repeated for Local/Remote/None blocks. This is likely to drift again on future model/size/link updates.

Also applies to: 826-862, 879-915

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 628 - 677, Extract the repeated
model list markup into a reusable React renderer (e.g., a new ModelDownloadList
component) and replace the duplicated blocks in Login.tsx (the j-text anchor/p
paragraphs used in Local/Remote/None sections) with that component; the
ModelDownloadList should accept an items array (each item containing url,
label/displayName and size) and render the same <a onClick={() => open(url)}
style={{ cursor: "pointer" }}>…</a> structure so styling and behavior (open
call) remain identical across all three places, then update the three
occurrences to pass their respective model arrays to ModelDownloadList.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 227262a and 9ae20c5.

📒 Files selected for processing (1)
  • ui/src/components/Login.tsx

The fallback path was still hard-coded to WhisperSource::Tiny, bypassing
the WHISPER_MODEL constant. Now uses the configured default consistently.

The launcher UI already sets WHISPER_MODEL as the default during setup.
This fallback only triggers when no model is persisted in the DB, which
happens in CI/test environments that skip launcher setup. Using Tiny here
avoids downloading a 454MB model in CI.

…iterals

- get_whisper_model_size() now falls back to WHISPER_MODEL (QuantizedLargeV3Turbo)
  instead of hard-coded WhisperSource::Tiny
- Extract repeated Whisper URL and label into constants in Login.tsx

Addresses CodeRabbit review comments.
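The commit messages above pull in opposite directions: the configured default is correct for production, while Tiny spares CI a 454MB download. Purely as a hypothetical sketch (not what this PR implements), the two could be reconciled by branching on the environment; `in_ci` would come from something like `std::env::var("CI")` in practice:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum WhisperSource {
    Tiny,
    QuantizedLargeV3Turbo,
}

const WHISPER_MODEL: WhisperSource = WhisperSource::QuantizedLargeV3Turbo;

/// Hypothetical: pick the fallback model based on whether we run in CI.
fn fallback_model(in_ci: bool) -> WhisperSource {
    if in_ci {
        // Avoid the 454MB download in CI/test environments.
        WhisperSource::Tiny
    } else {
        WHISPER_MODEL
    }
}

fn main() {
    assert_eq!(fallback_model(true), WhisperSource::Tiny);
    assert_eq!(fallback_model(false), WhisperSource::QuantizedLargeV3Turbo);
    println!("ok");
}
```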

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
ui/src/components/Login.tsx (1)

172-177: Redundant assignment: whisperModel is set to the same value twice.

Line 172 initializes whisperModel to "whisper_large_v3_turbo_quantized", and line 177 reassigns it to the identical value inside the Local branch. This assignment is now a no-op and can be removed.

♻️ Suggested cleanup
   async function saveModels() {
     let whisperModel = "whisper_large_v3_turbo_quantized";
     // add llm model
     if (aiMode !== "None") {
       const llm = { modelType: "LLM" } as ModelInput;
       if (aiMode === "Local") {
-        whisperModel = "whisper_large_v3_turbo_quantized";
         llm.name = "Qwen2.5.1-Coder-7B-Instruct";
         llm.local = { fileName: "Qwen2.5.1-Coder-7B-Instruct" };
       } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 172 - 177, The variable
whisperModel is initialized to "whisper_large_v3_turbo_quantized" and then
redundantly reassigned to the same value inside the aiMode === "Local" branch;
remove the duplicate assignment in the Local branch (keep the initial let
whisperModel = "whisper_large_v3_turbo_quantized";) and ensure any logic that
depends on different whisperModel values remains correct — locate usages in the
Login component around the aiMode check and ModelInput creation (llm,
whisperModel) to edit the Local branch only.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3576db04-0edd-4a43-b16d-2969c3059935

📥 Commits

Reviewing files that changed from the base of the PR and between 9c43095 and 977652b.

📒 Files selected for processing (2)
  • rust-executor/src/ai_service/mod.rs
  • ui/src/components/Login.tsx

Addresses CodeRabbit review comment: Add explicit default for
TRANSCRIPTION model to match LLM pattern. The first Whisper model
is now set as the default TRANSCRIPTION model after it's added.
