A possibility to use mlx-community/whisper-large-v3 as a local STT model #4972

@AndriyBalakalchuk

Description

First of all, great work on the product — the idea and overall implementation are really impressive.

However, during usage I noticed a significant limitation:

  • Local transcription does not allow selecting a custom model; only the built-in one is available, which is insufficient in some cases.

I’m based in Ukraine and tested meeting recordings in both Ukrainian and Russian. Unfortunately, the small Whisper model available in the settings cannot properly recognize speech in these languages.

As a workaround, I disabled the built-in transcription and processed the recorded MP3 files manually with mlx-community/whisper-large-v3. The model runs locally on my base Mac Mini M4 (via Pinokio -> mlx-whisper-webui-pinokio) and produces accurate results. I then upload the generated subtitles back into Char, and it works well.

So currently, my workflow looks like this:

  1. Record in Char
  2. Export audio
  3. Transcribe externally with whisper-large-v3
  4. Upload subtitles back to Char

It works, but it would be much better if this could be handled internally, as originally intended.
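For anyone wanting to reproduce step 3 without a web UI, the external transcription can also be run with the `mlx-whisper` command-line tool (installed via `pip install mlx-whisper`, Apple Silicon only). This is a sketch, not the app's own pipeline; the file name and output directory are illustrative, and the flags shown mirror the OpenAI Whisper CLI that mlx-whisper follows:

```shell
# Illustrative sketch: transcribe an exported recording locally with
# mlx-whisper. "meeting.mp3" and "./subtitles" are placeholder names.
mlx_whisper meeting.mp3 \
  --model mlx-community/whisper-large-v3 \
  --language uk \
  --output-format srt \
  --output-dir ./subtitles
```

The resulting .srt file is what I upload back into Char in step 4.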

Problem:
There is no way to select a more powerful local Whisper model in Char, even if it is already installed and working on the system.

Suggested solution:
Add an option in the interface to select or configure a custom local STT model (e.g., mlx-community/whisper-large-v3).

I understand that it might be possible to manually replace the model in Char’s files, but this approach feels fragile and could break after updates.

What do you think about supporting custom/local model selection?
