[AGENT]
Summary
Improve live voice command responsiveness and TTS playback behavior during Discord meetings.
Current behavior and likely causes
- Live voice commands depend on transcribed snippets, so transcription backlog can delay command detection.
- Command detection on the slow transcript can be skipped when the fast transcript has already been processed, even if the slow transcript would be more accurate.
- Voice-stop commands can therefore arrive minutes late in long or busy meetings.
- TTS has queue clearing support but no true user-speech barge-in/preemption path.
- TTS can sound stale or be interrupted awkwardly when the transcript backlog and the playback queue state drift apart.
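To illustrate the fast/slow transcript case above, here is a minimal sketch of a second-chance rule. It assumes commands are matched by phrase; `COMMAND_PATTERN`, `detect_command`, and `command_from_slow_transcript` are hypothetical names for illustration and are not taken from the existing code.

```python
import re
from typing import Optional

# Hypothetical command phrases; the real command set is not listed in this issue.
COMMAND_PATTERN = re.compile(r"\b(stop|end meeting)\b", re.IGNORECASE)


def detect_command(text: str) -> Optional[str]:
    """Return the matched command phrase in lowercase, or None if nothing matched."""
    match = COMMAND_PATTERN.search(text)
    return match.group(1).lower() if match else None


def command_from_slow_transcript(fast_text: str, slow_text: str) -> Optional[str]:
    """Give the slow transcript a second chance only when the fast pass found nothing.

    Returning None when the fast pass already matched avoids acting twice
    on the same utterance.
    """
    if detect_command(fast_text) is not None:
        return None
    return detect_command(slow_text)
```

With this rule, a fast transcript that misheard "stop" as "stock" no longer suppresses detection on the slow transcript, while an utterance the fast pass already handled is not handled again.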
Proposed scope
- Add observability for live voice command latency from speech end to action.
- Consider a priority command-detection path separate from normal transcription backlog.
- Revisit whether slow transcripts should get a second command-detection chance for likely command phrases.
- Add a clear TTS interruption policy for user speech, high-priority commands, and confirmation prompts.
- Ensure generated-but-not-yet-playing TTS can be canceled when the queue is stopped.
- Add tests around stop command latency logic and TTS queue cancellation/preemption.
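The first two scope items could be sketched roughly as follows. This is an assumption-laden outline, not the existing design: `SnippetQueue`, `Snippet`, `command_latency`, and the `likely_command` flag are hypothetical, and how a snippet gets flagged as a likely command (for example by duration or a keyword spotter) is left open.

```python
import heapq
import itertools
from dataclasses import dataclass, field

COMMAND_PRIORITY = 0  # likely command snippets are transcribed first
NORMAL_PRIORITY = 1   # everything else keeps arrival order


@dataclass(order=True)
class Snippet:
    priority: int
    seq: int  # tie-breaker that preserves arrival order within a priority
    speech_end: float = field(compare=False)  # timestamp when the user stopped speaking
    audio_id: str = field(compare=False)


class SnippetQueue:
    """Priority queue so likely commands are not stuck behind the normal backlog."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()

    def push(self, audio_id: str, speech_end: float, likely_command: bool) -> None:
        priority = COMMAND_PRIORITY if likely_command else NORMAL_PRIORITY
        snippet = Snippet(priority, next(self._counter), speech_end, audio_id)
        heapq.heappush(self._heap, snippet)

    def pop(self) -> Snippet:
        return heapq.heappop(self._heap)


def command_latency(snippet: Snippet, action_time: float) -> float:
    """Seconds from the end of speech to the moment the command action ran."""
    return action_time - snippet.speech_end
```

Recording `speech_end` on the snippet itself keeps the latency measurement independent of how long the snippet waited in any queue, which is the delay this issue is trying to expose.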
Acceptance criteria
- Live voice command latency is measurable.
- Stop/end meeting commands are not blocked behind normal transcript backlog when a priority path is available.
- Slow transcription can recover command detection when fast transcription missed or misheard the command.
- TTS playback has an explicit barge-in/preemption policy.
- Tests cover delayed transcription, fast/slow transcript command handling, and TTS cancellation.
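For the TTS cancellation criterion, one possible approach is an epoch counter: every stop bumps the epoch, and audio whose generation was requested under an older epoch is dropped instead of queued. The sketch below is illustrative only; `TtsQueue` and its methods are hypothetical names, and audio is represented as a plain string to keep the example self-contained.

```python
from collections import deque
from typing import Callable


class TtsQueue:
    """Playback queue where stop() also cancels generated-but-not-yet-playing audio."""

    def __init__(self, play: Callable[[str], None]) -> None:
        self._play = play
        self._epoch = 0
        self._pending = deque()

    def begin_request(self) -> int:
        """Capture the current epoch before audio generation starts."""
        return self._epoch

    def enqueue(self, epoch: int, audio: str) -> bool:
        """Queue generated audio unless stop() ran after the request began."""
        if epoch != self._epoch:
            return False  # stale: generation finished after a stop
        self._pending.append(audio)
        return True

    def stop(self) -> None:
        """Clear queued audio and invalidate any in-flight generation."""
        self._epoch += 1
        self._pending.clear()

    def play_next(self) -> bool:
        """Play the next queued clip; return False when the queue is empty."""
        if not self._pending:
            return False
        self._play(self._pending.popleft())
        return True
```

A test for the cancellation case can then begin a request, call `stop()`, and assert that the late `enqueue` is rejected and nothing plays.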