feat(conductor): add opt-in voice STT to Telegram bridge by Abeansits · Pull Request #309 · asheshgoplani/agent-deck

Abeansits · 2026-03-08T03:58:32Z

Summary

Opt-in voice STT: Telegram voice messages are transcribed locally using parakeet-mlx via a subprocess worker (stt_worker.py), crash-isolated from the bot event loop. No cloud API needed.
Disabled by default: Voice transcription activates only when the BRIDGE_STT_ENABLED=true environment variable is set. Without it, voice messages are silently ignored.
No hardcoded paths: stt_worker.py uses shutil.which('parakeet-mlx') to find the CLI on PATH, with PARAKEET_CLI_PATH env var as an explicit override.
File logging: Bridge now logs to ~/.agent-deck/conductor/bridge.log in addition to stdout.
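The CLI lookup and the dual logging described above can be sketched roughly as follows. This is an illustration, not the code from the PR: the function names, the logger name "bridge", and the log format are assumptions; only shutil.which('parakeet-mlx'), PARAKEET_CLI_PATH, and the bridge.log location come from the description.

```python
import logging
import os
import shutil
import sys
from pathlib import Path
from typing import Mapping, Optional


def find_parakeet_cli(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the parakeet-mlx CLI path, or None if it cannot be located."""
    env = os.environ if env is None else env
    override = env.get("PARAKEET_CLI_PATH")
    if override:
        # Explicit override wins; this sketch does not validate the path.
        return override
    # Otherwise search PATH. If env has no PATH entry, shutil.which falls
    # back to the PATH of the current process.
    return shutil.which("parakeet-mlx", path=env.get("PATH"))


def configure_bridge_logging(log_path: Path) -> logging.Logger:
    """Log to stdout and to log_path (hypothetical helper)."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("bridge")  # assumed logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    logger.handlers.clear()
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (logging.StreamHandler(sys.stdout), logging.FileHandler(log_path)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

In the bridge, the second helper would presumably be called with Path.home() / ".agent-deck" / "conductor" / "bridge.log".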

Changes

conductor/bridge.py: transcribe_voice() downloads voice files and calls stt_worker via async subprocess (60s timeout). handle_message() now handles message.voice when BRIDGE_STT_ENABLED=true.
conductor/stt_worker.py: New standalone STT worker that finds and invokes the parakeet-mlx CLI, reads output text files, and prints transcription to stdout.
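A minimal sketch of the crash-isolation pattern described above: the bridge runs the worker in a child process and gives up after a timeout, so a hung or crashing worker cannot take down the event loop. The function name run_worker and its return convention (None on any failure) are assumptions for illustration; the actual transcribe_voice() may differ.

```python
import asyncio
from typing import List, Optional


async def run_worker(cmd: List[str], timeout: float = 60.0) -> Optional[str]:
    """Run cmd in a subprocess and return its stripped stdout, or None on failure."""
    try:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
    except OSError:
        # Executable missing or not runnable.
        return None
    try:
        stdout, _stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Worker exceeded the time budget: kill it and reap the process.
        proc.kill()
        await proc.wait()
        return None
    if proc.returncode != 0:
        return None
    text = stdout.decode("utf-8", errors="replace").strip()
    return text or None
```

A caller could invoke it as await run_worker([sys.executable, "stt_worker.py", audio_path]), where the argument layout of stt_worker.py is likewise an assumption.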

Configuration

# Enable voice transcription
export BRIDGE_STT_ENABLED=true

# Optional: explicit path to parakeet-mlx CLI
export PARAKEET_CLI_PATH=/path/to/parakeet-mlx
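On the Python side, the opt-in flag could be read along these lines. The helper name stt_enabled is hypothetical, and accepting "TRUE" or surrounding whitespace is an assumption; the description only documents the value true.

```python
import os
from typing import Mapping, Optional


def stt_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when BRIDGE_STT_ENABLED is set to 'true'."""
    env = os.environ if env is None else env
    # Unset or any other value keeps the feature off (disabled by default).
    return env.get("BRIDGE_STT_ENABLED", "").strip().lower() == "true"
```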

Test plan

With BRIDGE_STT_ENABLED=true, send a voice message via Telegram and verify transcription
Verify that a voice message produces a Transcribing... status and that the transcribed text is then forwarded to the conductor
Verify failed transcription returns [Could not transcribe voice message.]
With STT disabled (default), verify voice messages are silently ignored
Verify normal text message handling is unaffected
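The expected outcomes in the test plan can be summarized as a small pure function, which would also make them easy to check automatically. This is a sketch of the decision logic only; voice_reply_text is a hypothetical helper, not a function from bridge.py.

```python
from typing import Optional

FALLBACK = "[Could not transcribe voice message.]"


def voice_reply_text(stt_enabled: bool, transcript: Optional[str]) -> Optional[str]:
    """Decide what to forward for a voice message.

    Returns None when the message should be silently ignored, the
    transcript when transcription succeeded, and FALLBACK otherwise.
    """
    if not stt_enabled:
        return None
    if transcript is None or not transcript.strip():
        return FALLBACK
    return transcript.strip()
```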

Replace Groq Whisper API with local parakeet-mlx (parakeet-tdt-0.6b-v3) for voice message transcription. Add TTS voice replies using macOS say + ffmpeg (OGG/Opus output), toggled via BRIDGE_TTS_ENABLED env var.

- stt_worker.py: standalone subprocess worker that normalizes audio to mono 16kHz WAV and runs parakeet-mlx inference, crash-isolated from the bot event loop
- bridge.py: transcribe_voice() calls stt_worker via async subprocess (60s timeout), generate_voice_reply() chains say + ffmpeg via async subprocesses with per-step timeouts and proper kill/cleanup

Co-Authored-By: Claude Opus 4.6 <[email protected]>
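The commit message does not say how the audio is normalized. One plausible approach, assuming ffmpeg is the tool (it is only mentioned above for the TTS path), is to build a conversion command like the one below. The function name and the default binary name are hypothetical; -ac, -ar, and -y are standard ffmpeg options for channel count, sample rate, and overwriting the output.

```python
from pathlib import Path
from typing import List


def build_normalize_cmd(src: Path, dst: Path, ffmpeg: str = "ffmpeg") -> List[str]:
    """Build an ffmpeg command converting src to mono 16 kHz audio at dst."""
    return [
        ffmpeg,
        "-y",            # overwrite dst if it already exists
        "-i", str(src),  # input file, e.g. the downloaded voice message
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        str(dst),        # output container is inferred from the .wav suffix
    ]
```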

Switch from importing parakeet-mlx as a Python library to invoking the parakeet-mlx CLI binary. This avoids import/dependency issues and is cleaner for subprocess-based isolation.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
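Because the CLI writes its transcription to text files rather than returning a value, the worker has to read those files back after the process exits. A sketch of that step, assuming the CLI was pointed at a dedicated output directory and wrote .txt files there (the directory layout and the helper name are assumptions, and the CLI invocation itself is left out because its flags are not given in this PR):

```python
from pathlib import Path


def collect_transcript(output_dir: Path) -> str:
    """Join the contents of all .txt files in output_dir, in name order."""
    parts = [
        path.read_text(encoding="utf-8").strip()
        for path in sorted(output_dir.glob("*.txt"))
    ]
    # Skip empty files so a blank output does not add stray newlines.
    return "\n".join(part for part in parts if part)
```

The worker would then print the returned string to stdout, which is what the bridge reads.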

Strip generate_voice_reply(), BRIDGE_TTS_ENABLED/BRIDGE_TTS_VOICE config, bot.send_voice() TTS response block, say+ffmpeg pipeline, and FSInputFile import. Voice-to-text (STT) remains intact.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Abeansits and others added 3 commits March 7, 2026 16:10

fix(stt): use parakeet-mlx CLI instead of Python API

cb56159


asheshgoplani changed the title ~~feat(conductor): add local voice STT and TTS to Telegram bridge~~ feat(conductor): add opt-in voice STT to Telegram bridge Mar 17, 2026

Abeansits wants to merge 3 commits into asheshgoplani:main from Abeansits:feat/voice-bridge-stt
