
fix: default whisper model to QuantizedLargeV3Turbo #706

Open
data-bot-coasys wants to merge 8 commits into dev from fix/whisper-default-model

Conversation


@data-bot-coasys data-bot-coasys commented Mar 3, 2026

Summary

Changes the default Whisper transcription model from Small (unquantized, 500MB) to QuantizedLargeV3Turbo (quantized GGUF, 454MB) — kalosm's own recommended default.

Problem

  • whisper-large-v2 (6.2GB f32 safetensors) causes CUDA_ERROR_OUT_OF_MEMORY on consumer GPUs and tensor config errors on some setups
  • whisper-small works but is lower quality than what's available
  • Launcher UI had mismatched labels (showed "distill large v3" but linked to whisper-large-v3-turbo)

Changes

Rust executor

  • Default WHISPER_MODEL → WhisperSource::QuantizedLargeV3Turbo
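The bullet above amounts to a one-line constant swap. A minimal sketch of the pattern, using a local stand-in enum since the real `WhisperSource` comes from the `kalosm` crate:

```rust
// Local stand-in for kalosm's WhisperSource enum (only the variants
// relevant to this PR); the real type lives in the kalosm crate.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum WhisperSource {
    Small,
    QuantizedLargeV3Turbo,
}

// Before: const WHISPER_MODEL: WhisperSource = WhisperSource::Small;
const WHISPER_MODEL: WhisperSource = WhisperSource::QuantizedLargeV3Turbo;

fn main() {
    assert_eq!(WHISPER_MODEL, WhisperSource::QuantizedLargeV3Turbo);
    println!("default: {:?}", WHISPER_MODEL);
}
```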

Launcher UI (Login.tsx)

  • Local AI mode: whisper_distil_large_v3 → whisper_large_v3_turbo_quantized
  • Remote API / None modes: whisper_small → whisper_large_v3_turbo_quantized
  • Fixed HuggingFace links and model size labels
  • whisper_tiny_quantized already installed in all modes ✅

Testing (CUDA, RTX 2070 SUPER 8GB)

| Model | Size | CUDA |
|---|---|---|
| QuantizedLargeV3Turbo | 454MB | ✅ |
| DistilLargeV3 | 1.5GB | ✅ |
| Small | ~500MB | ✅ |
| LargeV2 | 6.2GB | ❌ OOM |

Summary by CodeRabbit

  • New Features

    • Default speech recognition switched to the quantized large v3 turbo model; local selection now uses the same quantized model.
    • Added a tiny quantized model option (42MB) and surfaced model links/labels in the UI.
  • Bug Fixes / Behavior

    • Transcription timing shortened (checks and timeout reduced).
    • Default model resolution now respects the configured model when no persisted choice exists.
  • Documentation

    • Updated in-app model names, sizes and download links shown throughout the UI.

- Change default WHISPER_MODEL from Small to QuantizedLargeV3Turbo
  (kalosm's own default — multilingual, 454MB quantized GGUF)
- Update launcher setup to use whisper_large_v3_turbo_quantized
  across all AI modes (Local, Remote, None)
- Fix incorrect HuggingFace link and model name in Local AI setup
  (was pointing to whisper-large-v3-turbo but labeled as distil)

The previous defaults (Small for Remote/None, DistilLargeV3 for Local)
used unquantized safetensors models that cause CUDA OOM on consumer
GPUs. The quantized v3 turbo model provides better quality at a
fraction of the memory footprint.
@coderabbitai

coderabbitai bot commented Mar 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Whisper model defaults switched to the quantized large v3 turbo variant in backend and UI; backend transcription timing values shortened; UI constants, links, labels, and default/local selections updated to reference quantized large and tiny quantized Whisper artifacts.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Backend: Whisper model & transcription timing<br>`rust-executor/src/ai_service/mod.rs` | Changed WHISPER_MODEL from Small to QuantizedLargeV3Turbo; reduced TRANSCRIPTION_TIMEOUT_SECS from 120 to 30 and TRANSCRIPTION_CHECK_INTERVAL_SECS from 10 to 5; get_whisper_model_size now defaults to the configured WHISPER_MODEL when no persisted model exists (was previously defaulting to Tiny). |
| Frontend: UI defaults, labels, and links<br>`ui/src/components/Login.tsx` | Added WHISPER_TURBO_URL and WHISPER_TURBO_LABEL; changed the saved/default whisper model and the Local selection to whisper_large_v3_turbo_quantized; updated UI copy, links, and labels to reference quantized large v3 turbo and a ~42MB tiny quantized variant; added UI blocks surfacing the new model link/label. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble bytes beneath the moonlit log,

A turbo whisper snug in my code-hollowed bog,
Tiny crumbs for speed, large crumbs for lore,
I hop, I patch, I listen — then I hop some more.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed (check skipped: CodeRabbit's high-level summary is enabled).
  • Title Check: ✅ Passed. The title accurately summarizes the main change: updating the default Whisper model to QuantizedLargeV3Turbo. Both file changes, across the Rust executor and UI components, align with this core objective.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
ui/src/components/Login.tsx (1)

647-651: Consider deduplicating the Whisper URL/label literals

The same URL and label are repeated in three UI blocks. A shared constant will reduce future drift and make model/link updates safer.

♻️ Suggested refactor
+const WHISPER_TURBO_URL =
+  "https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo";
+const WHISPER_TURBO_LABEL = "Whisper large v3 turbo quantized (454MB)";
- open("https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo")
+ open(WHISPER_TURBO_URL)

- >Whisper large v3 turbo quantized (454MB)</a>
+ >{WHISPER_TURBO_LABEL}</a>

Also applies to: 821-825, 863-867

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 647 - 651, The repeated Whisper
model URL and label around the JSX anchor ("Whisper large v3 turbo quantized
(454MB)" linking to
"https://huggingface.co/Demonthos/candle-quantized-whisper-large-v3-turbo")
should be extracted to a shared constant; add a constant (e.g., WHISPER_MODEL =
{ url: "...", label: "Whisper large v3 turbo quantized (454MB)" } or two
constants WHISPER_URL and WHISPER_LABEL) near the top of the Login.tsx component
or module and replace each repeated anchor href and inner text in the Login
component JSX with those constants (references: the anchor element showing the
label and the href string in Login.tsx). Ensure all three occurrences (around
lines shown, and also the ones at the other noted blocks) reference the same
constant so future updates are done in one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@rust-executor/src/ai_service/mod.rs`:
- Line 35: The WHISPER_MODEL constant is updated but
AIService::get_whisper_model_size() still hard-codes a fallback to
WhisperSource::Tiny; update get_whisper_model_size() to use the WHISPER_MODEL
constant as the default/fallback (or derive its fallback from WHISPER_MODEL) so
the service actually falls back to the configured QuantizedLargeV3Turbo value
instead of Tiny; locate AIService::get_whisper_model_size() and replace the
hard-coded WhisperSource::Tiny return/fallback with a reference to the
WHISPER_MODEL static (or call a helper that resolves the configured default) so
runtime behavior matches the constant.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec3931e and 227262a.

📒 Files selected for processing (2)
  • rust-executor/src/ai_service/mod.rs
  • ui/src/components/Login.tsx

lucksus and others added 2 commits March 3, 2026 22:03
Add Whisper tiny quantized (42MB) to the download list shown
in all three AI modes (Local, Remote, None) so users know both
models will be downloaded. Both are needed by Flux: the main
model for final transcription and tiny for fast word-by-word
preview.

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ui/src/components/Login.tsx (1)

196-206: ⚠️ Potential issue | 🔴 Critical

Add explicit default for TRANSCRIPTION model to match LLM pattern.

The two Whisper TRANSCRIPTION models are added without an explicit setDefaultModel("TRANSCRIPTION", ...) call. Line 187 shows LLM models get this treatment, but TRANSCRIPTION does not. The backend's set_default_model() (ai_service/mod.rs:354) only handles LLM type, and while WHISPER_MODEL static constant ensures QuantizedLargeV3Turbo is used, explicit default pinning is missing. Call setDefaultModel("TRANSCRIPTION", modelId) after adding the first Whisper model to make the default deterministic and consistent with the LLM pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 196 - 206, Add an explicit default
TRANSCRIPTION model after registering the primary Whisper model: after the
client!.ai.addModel call that creates the first Whisper model (name "Whisper",
local { fileName: whisperModel }, modelType "TRANSCRIPTION"), call
client!.ai.setDefaultModel("TRANSCRIPTION", <that model's id>) so the
TRANSCRIPTION default is pinned deterministically (mirror the LLM pattern that
uses setDefaultModel). Ensure you use the returned/new model's id (or capture
the id from addModel result) when calling setDefaultModel("TRANSCRIPTION",
modelId).
🧹 Nitpick comments (2)
ui/src/components/Login.tsx (2)

168-174: Remove redundant whisperModel assignment in Local mode.

whisperModel is already initialized on Line 168, so Line 173 is unnecessary noise.

Suggested cleanup
 async function saveModels() {
   let whisperModel = "whisper_large_v3_turbo_quantized";
   // add llm model
   if (aiMode !== "None") {
     const llm = { modelType: "LLM" } as ModelInput;
     if (aiMode === "Local") {
-      whisperModel = "whisper_large_v3_turbo_quantized";
       llm.name = "Qwen2.5.1-Coder-7B-Instruct";
       llm.local = { fileName: "Qwen2.5.1-Coder-7B-Instruct" };
     } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 168 - 174, Remove the redundant
reassignment of whisperModel inside the aiMode === "Local" branch: whisperModel
is already set to "whisper_large_v3_turbo_quantized" when initialized, so delete
the duplicate assignment in the block that sets llm (the code manipulating
whisperModel, aiMode, ModelInput, and llm.name in the Login.tsx component). Keep
the llm creation and llm.name assignment for Local mode, but omit the
unnecessary whisperModel = "whisper_large_v3_turbo_quantized" line.

628-677: Extract duplicated download-list markup into a shared renderer.

The same model list structure is repeated for Local/Remote/None blocks. This is likely to drift again on future model/size/link updates.

Also applies to: 826-862, 879-915

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 628 - 677, Extract the repeated
model list markup into a reusable React renderer (e.g., a new ModelDownloadList
component) and replace the duplicated blocks in Login.tsx (the j-text anchor/p
paragraphs used in Local/Remote/None sections) with that component; the
ModelDownloadList should accept an items array (each item containing url,
label/displayName and size) and render the same <a onClick={() => open(url)}
style={{ cursor: "pointer" }}>…</a> structure so styling and behavior (open
call) remain identical across all three places, then update the three
occurrences to pass their respective model arrays to ModelDownloadList.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 227262a and 9ae20c5.

📒 Files selected for processing (1)
  • ui/src/components/Login.tsx

The fallback path was still hard-coded to WhisperSource::Tiny, bypassing
the WHISPER_MODEL constant. Now uses the configured default consistently.

The launcher UI already sets WHISPER_MODEL as the default during setup.
This fallback only triggers when no model is persisted in the DB, which
happens in CI/test environments that skip launcher setup. Using Tiny here
avoids downloading a 454MB model in CI.

…iterals

- get_whisper_model_size() now falls back to WHISPER_MODEL (QuantizedLargeV3Turbo)
  instead of hard-coded WhisperSource::Tiny
- Extract repeated Whisper URL and label into constants in Login.tsx

Addresses CodeRabbit review comments.
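The commit messages above pull in opposite directions: the configured default is correct for production, while Tiny spares CI a 454MB download. Purely as a hypothetical sketch (not what this PR implements), the two could be reconciled by branching on the environment; `in_ci` would come from something like `std::env::var("CI")` in practice:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum WhisperSource {
    Tiny,
    QuantizedLargeV3Turbo,
}

const WHISPER_MODEL: WhisperSource = WhisperSource::QuantizedLargeV3Turbo;

/// Hypothetical: pick the fallback model based on whether we run in CI.
fn fallback_model(in_ci: bool) -> WhisperSource {
    if in_ci {
        // Avoid the 454MB download in CI/test environments.
        WhisperSource::Tiny
    } else {
        WHISPER_MODEL
    }
}

fn main() {
    assert_eq!(fallback_model(true), WhisperSource::Tiny);
    assert_eq!(fallback_model(false), WhisperSource::QuantizedLargeV3Turbo);
    println!("ok");
}
```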

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
ui/src/components/Login.tsx (1)

172-177: Redundant assignment: whisperModel is set to the same value twice.

Line 172 initializes whisperModel to "whisper_large_v3_turbo_quantized", and line 177 reassigns it to the identical value inside the Local branch. This assignment is now a no-op and can be removed.

♻️ Suggested cleanup
   async function saveModels() {
     let whisperModel = "whisper_large_v3_turbo_quantized";
     // add llm model
     if (aiMode !== "None") {
       const llm = { modelType: "LLM" } as ModelInput;
       if (aiMode === "Local") {
-        whisperModel = "whisper_large_v3_turbo_quantized";
         llm.name = "Qwen2.5.1-Coder-7B-Instruct";
         llm.local = { fileName: "Qwen2.5.1-Coder-7B-Instruct" };
       } else {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/components/Login.tsx` around lines 172 - 177, The variable
whisperModel is initialized to "whisper_large_v3_turbo_quantized" and then
redundantly reassigned to the same value inside the aiMode === "Local" branch;
remove the duplicate assignment in the Local branch (keep the initial let
whisperModel = "whisper_large_v3_turbo_quantized";) and ensure any logic that
depends on different whisperModel values remains correct — locate usages in the
Login component around the aiMode check and ModelInput creation (llm,
whisperModel) to edit the Local branch only.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3576db04-0edd-4a43-b16d-2969c3059935

📥 Commits

Reviewing files that changed from the base of the PR and between 9c43095 and 977652b.

📒 Files selected for processing (2)
  • rust-executor/src/ai_service/mod.rs
  • ui/src/components/Login.tsx

Addresses CodeRabbit review comment: Add explicit default for
TRANSCRIPTION model to match LLM pattern. The first Whisper model
is now set as the default TRANSCRIPTION model after it's added.
