
Feature Request: User-selectable Real-time vs. Offline Transcription #4977

@Paloma-96


Is your feature request related to a problem? Please describe.
I am often frustrated when I lose the thread of a conversation during a meeting or long discussion and cannot quickly scroll back to read what was just said. When transcription runs only offline (after the recording ends), the text is not available for clarification or context recovery during the live event.

Describe the solution you'd like
I would like a setting that lets the user choose between Real-time Transcription and Offline Transcription.

  • Real-time Transcription: The system processes audio and displays text as it is spoken, allowing for immediate review.
  • Offline Transcription: The system records the audio and processes the text only after the session is completed (useful for saving battery or CPU on older machines).

Ideally, this would be a simple toggle in the application preferences or a choice presented at the start of a recording session.
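
To make the request concrete, here is a minimal sketch of how such a setting could route audio. It is only an illustration: `TranscriptionMode`, `RecordingSession`, `on_audio_chunk`, `finish`, and the `transcribe` callable are hypothetical names, not part of the application's existing code, and the speech-to-text backend is assumed to be any function that turns an audio chunk into text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class TranscriptionMode(Enum):
    """Hypothetical user preference for when transcription runs."""
    REAL_TIME = "real_time"
    OFFLINE = "offline"


@dataclass
class RecordingSession:
    """Hypothetical session that applies the selected mode to incoming audio."""
    mode: TranscriptionMode
    transcribe: Callable[[bytes], str]  # assumed speech-to-text backend
    transcript: List[str] = field(default_factory=list, init=False)
    pending: List[bytes] = field(default_factory=list, init=False)

    def on_audio_chunk(self, chunk: bytes) -> None:
        if self.mode is TranscriptionMode.REAL_TIME:
            # Real-time: text is available for review while the session is live.
            self.transcript.append(self.transcribe(chunk))
        else:
            # Offline: only buffer the audio; defer the work to finish().
            self.pending.append(chunk)

    def finish(self) -> List[str]:
        # Transcribe anything that was buffered, then return the full transcript.
        for chunk in self.pending:
            self.transcript.append(self.transcribe(chunk))
        self.pending.clear()
        return self.transcript
```

With this shape, the preference toggle (or the prompt at the start of a recording) would only need to set `mode`; the rest of the recording flow stays the same in both cases.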

Describe alternatives you've considered

  • External Live Captioning: Using OS-level features (like macOS Live Captions), but these often integrate poorly and do not save the text directly into the application's workflow.
  • Waiting for Post-processing: Waiting until the meeting is over to read the transcript, which does not help with re-engaging in the conversation while it is still happening.

Additional context
The feasibility of local, real-time transcription has increased significantly with modern hardware. For example, on a MacBook Pro with an M3 Pro chip, there is more than enough power to run high-quality transcription models locally without any noticeable impact on system performance.

For users with powerful machines, real-time transcription is a major productivity and accessibility feature. It lets the user check what others have just said the moment they lose the thread of the discussion, so they can stay engaged and informed without interrupting the speaker.


Status: Backlog
