Real-time local audio transcription CLI. Captures live audio from microphone and/or system sources, detects speech via VAD, optionally denoises, and transcribes using Whisper or Parakeet — all locally, no cloud APIs.
- Dual engines — OpenAI Whisper (tiny through large-v3) and NVIDIA Parakeet (parakeet-tdt-0.6b)
- Multi-source capture — microphone, system audio (loopback), or both simultaneously
- Voice Activity Detection — Silero VAD v5 for accurate speech segmentation
- Audio enhancement — high-pass filtering, peak normalization, RNNoise denoising
- Hallucination filtering — detects and drops common Whisper artifacts
- Multiple output formats — console, TXT, SRT subtitles, JSON
- WebSocket relay — stream results to a remote server (optional `relay` feature)
- Auto model download — fetches models from Hugging Face on first use
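To make the hallucination-filtering feature concrete, here is a minimal sketch of the idea: drop segments matching known Whisper artifact phrases and collapse exact consecutive repeats. The artifact list and dedup rule here are illustrative assumptions, not the CLI's actual implementation.

```rust
// Sketch of a hallucination filter: drops segments matching known Whisper
// artifact phrases and collapses consecutive duplicates. The phrase list
// below is illustrative, not the CLI's actual list.
fn filter_hallucinations(segments: Vec<String>) -> Vec<String> {
    const ARTIFACTS: &[&str] = &[
        "Thanks for watching!",
        "Subtitles by the Amara.org community",
    ];
    let mut out: Vec<String> = Vec::new();
    for seg in segments {
        let trimmed = seg.trim();
        if trimmed.is_empty() || ARTIFACTS.iter().any(|a| trimmed.eq_ignore_ascii_case(a)) {
            continue; // drop known artifacts and empty segments
        }
        if out.last().map(String::as_str) == Some(trimmed) {
            continue; // drop exact consecutive repeats (a common hallucination)
        }
        out.push(trimmed.to_string());
    }
    out
}

fn main() {
    let segs = vec![
        "Hello there.".to_string(),
        "Hello there.".to_string(),
        "Thanks for watching!".to_string(),
        "Goodbye.".to_string(),
    ];
    let filtered = filter_hallucinations(segs);
    assert_eq!(filtered, vec!["Hello there.", "Goodbye."]);
}
```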
```
cargo build --release
```

The binary is built as `transcriber`. On macOS, the build script automatically compiles the Swift helper needed for system audio capture.
```
# Transcribe from microphone (default)
transcriber transcribe

# Use a specific model
transcriber transcribe --model large-v3

# Transcribe system audio
transcriber transcribe --mode system

# Transcribe both mic and system audio simultaneously
transcriber transcribe --mode both

# Use Parakeet engine
transcriber transcribe --engine parakeet

# Enable noise reduction
transcriber transcribe --noise-reduce

# Save to file
transcriber transcribe -o output.srt -f srt

# List audio devices
transcriber devices

# List available models
transcriber models
```

| Option | Default | Description |
|---|---|---|
| `--mode` | `mic` | Audio source: `mic`, `system`, `both` |
| `--engine` | `whisper` | Transcription engine: `whisper`, `parakeet` |
| `--model` | `turbo` | Model name (e.g. `tiny`, `base`, `small`, `turbo`, `large-v3`, `parakeet-tdt-0.6b`) |
| `--language` | `auto` | Language code (e.g. `en`) |
| `--device` | system default | Audio device index or name substring |
| `--compute-device` | `auto` | Backend: `auto`, `cpu`, `cuda` |
| `--compute-type` | `int8` | Precision: `int8`, `float16`, `float32` |
| `-o`, `--output` | console | Output file path |
| `-f`, `--format` | `txt` | Output format: `txt`, `srt`, `json` |
| `--vad-threshold` | `0.5` | Speech detection threshold (0.0–1.0) |
| `--noise-reduce` | off | Enable RNNoise denoising |
| `--max-segment` | `3.0` | Max speech duration in seconds before force-emit |
| `--relay` | — | WebSocket relay URL (requires `--session`) |
| `--session` | — | Session code for relay |
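To illustrate how `--vad-threshold` and `--max-segment` interact, here is a minimal sketch of turning a stream of frame-level speech probabilities into segments. The frame size and the exact state handling are assumptions for illustration, not the CLI's actual code.

```rust
// Frames above the threshold accumulate into the current segment, which is
// emitted when speech ends or when it hits the max-segment cap (force-emit).
const FRAME_SECS: f32 = 0.032; // ~32 ms VAD frames at 16 kHz (assumed)

fn segment(probs: &[f32], threshold: f32, max_segment_secs: f32) -> Vec<usize> {
    let mut segments = Vec::new(); // emitted segment lengths, in frames
    let mut current = 0usize;
    let max_frames = (max_segment_secs / FRAME_SECS) as usize;
    for &p in probs {
        if p >= threshold {
            current += 1;
            if current >= max_frames {
                segments.push(current); // force-emit long speech (--max-segment)
                current = 0;
            }
        } else if current > 0 {
            segments.push(current); // speech ended: emit the segment
            current = 0;
        }
    }
    if current > 0 {
        segments.push(current); // flush trailing speech
    }
    segments
}

fn main() {
    // 10 speech frames, 2 silence frames, 4 speech frames
    let probs: Vec<f32> = [vec![0.9; 10], vec![0.1; 2], vec![0.8; 4]].concat();
    assert_eq!(segment(&probs, 0.5, 3.0), vec![10, 4]);
}
```

Raising the threshold trades missed quiet speech for fewer false triggers; the force-emit cap keeps latency bounded during long continuous speech.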
Models are cached in `~/.cache/transcriber/models/` and downloaded automatically on first use.
Whisper models:
| Name | Size | Notes |
|---|---|---|
| `tiny` | 75 MB | Fastest, lowest accuracy |
| `base` | 142 MB | |
| `small` | 466 MB | |
| `turbo` | 809 MB | Default — good speed/accuracy tradeoff |
| `medium` | 1.5 GB | |
| `distil-large-v3` | 756 MB | Distilled, English-optimized |
| `large-v3` | 3.1 GB | Best accuracy |
Parakeet models:
| Name | Size | Notes |
|---|---|---|
| `parakeet-tdt-0.6b` | ~600 MB | English-only, 6.05% WER |
```
Audio Source (mic/system)
  → Resampling to 16 kHz mono
  → High-pass filter (80 Hz) + normalization
  → Silero VAD (speech detection)
  → [Optional] RNNoise denoising
  → Whisper/Parakeet transcription
  → Hallucination filter + dedup/merge
  → Output sinks (console/file/relay)
```
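The enhancement stage can be sketched as a one-pole high-pass at 80 Hz followed by peak normalization. The filter topology and target level are assumptions for illustration; the CLI may use a different design.

```rust
use std::f32::consts::PI;

// One-pole high-pass filter (cutoff in Hz), applied in place. Removes DC
// offset and low-frequency rumble below the cutoff.
fn high_pass(samples: &mut [f32], cutoff_hz: f32, sample_rate: f32) {
    let rc = 1.0 / (2.0 * PI * cutoff_hz);
    let dt = 1.0 / sample_rate;
    let alpha = rc / (rc + dt);
    let (mut prev_in, mut prev_out) = (0.0, 0.0);
    for s in samples.iter_mut() {
        let out = alpha * (prev_out + *s - prev_in);
        prev_in = *s;
        prev_out = out;
        *s = out;
    }
}

// Scale the buffer so its absolute peak hits `target` (e.g. 1.0).
fn normalize_peak(samples: &mut [f32], target: f32) {
    let peak = samples.iter().fold(0.0f32, |m, &s| m.max(s.abs()));
    if peak > 0.0 {
        let gain = target / peak;
        for s in samples.iter_mut() {
            *s *= gain;
        }
    }
}

fn main() {
    // A pure DC offset should decay to near zero after the high-pass settles.
    let mut buf = vec![0.5f32; 1600]; // 100 ms of DC at 16 kHz
    high_pass(&mut buf, 80.0, 16_000.0);
    normalize_peak(&mut buf, 1.0);
    let tail_peak = buf[800..].iter().fold(0.0f32, |m, &s| m.max(s.abs()));
    assert!(tail_peak < 0.2);
}
```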
In `both` mode, mic and system audio run as independent pipelines in separate threads, with results multiplexed to shared output sinks via crossbeam channels.
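The fan-in pattern looks roughly like this: each source runs its own pipeline thread and sends tagged results into one channel that the output sinks drain. The CLI uses crossbeam channels; this sketch uses `std::sync::mpsc` to stay dependency-free, and the `Segment` type is a stand-in, not the CLI's actual type.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in result type tagged with its originating audio source.
#[derive(Debug)]
struct Segment {
    source: &'static str, // "mic" or "system"
    text: String,
}

fn main() {
    let (tx, rx) = mpsc::channel::<Segment>();

    for source in ["mic", "system"] {
        let tx = tx.clone();
        thread::spawn(move || {
            // Stand-in for capture → VAD → transcription on this source.
            tx.send(Segment { source, text: format!("hello from {source}") }).unwrap();
        });
    }
    drop(tx); // close the channel once all pipeline threads are done

    // Output sink: drains the shared channel until every sender hangs up.
    let mut count = 0;
    for seg in rx {
        println!("[{}] {}", seg.source, seg.text);
        count += 1;
    }
    assert_eq!(count, 2);
}
```

Dropping the original sender is what lets the receive loop terminate: the channel closes only when every clone held by a pipeline thread has been dropped.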
This CLI depends on the `transcribe-rs` library (included locally at `../transcribe-rs/`), which provides the Whisper and Parakeet transcription engines. The CLI handles audio capture, VAD, the processing pipeline, and output — `transcribe-rs` handles model loading and inference.