Add Speech-To-Text Feature #36

Merged
daocha merged 20 commits into develop from feature-stt on Mar 31, 2026

daocha (Owner) commented on Mar 30, 2026

No description provided.

dcha-agent and others added 8 commits March 30, 2026 22:17
Add OpenAI-Whisper and required dependencies in the setup script and server startup script

Server startup now checks the dependencies when STT config is enabled
Document the Whisper/STT flow, update localized user-facing strings, and add regression tests for speech-to-text, queue ordering, reply behavior, and installer prerequisite checks.
Add local Whisper speech-to-text support for Telegram voice/audio messages, including startup prerequisite checks, shared STT installer flow, env configuration, and transcript dispatch into the normal message pipeline.
Fix pending-action and queue drain ordering, busy/queue race handling, reply threading for working/final output, and ensure install.sh launches with the same Python interpreter used for installation.
daocha (Owner, Author) commented on Mar 30, 2026

@codex pls review the changes

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c2cc45a9c7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread on src/coding_agent_telegram/router/message_commands.py (outdated)
dcha-agent and others added 8 commits March 31, 2026 04:21
Queue text and voice transcripts while session prerequisites are unresolved, then drain the queue after session creation completes. Add regression coverage for pending new-session text/voice cases and clean up localized README diff wording.
@daocha daocha changed the base branch from main to develop March 31, 2026 13:15
daocha and others added 3 commits March 31, 2026 21:49
Fix 2 — callback_data 64-byte limit for branch source buttons

Fix 3 — Queue delimiter injection corrupts queued messages
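
For context on Fix 2 above: the Telegram Bot API limits an inline button's callback_data to 1-64 bytes, so a long branch name cannot be embedded directly. The PR page does not show how the fix was implemented. The following is only a minimal sketch of one common approach, in which long values are swapped for a short digest and looked up later. The names `make_callback_data`, `resolve_callback_data`, and `registry` are hypothetical, and the prefix is assumed to be short ASCII without a colon.

```python
import hashlib

# Telegram Bot API limit for InlineKeyboardButton.callback_data
MAX_CALLBACK_BYTES = 64


def make_callback_data(prefix, value, registry):
    """Return callback data within the byte limit, storing long values in `registry`."""
    raw = f"{prefix}:v:{value}"  # "v" marks a value carried inline
    if len(raw.encode("utf-8")) <= MAX_CALLBACK_BYTES:
        return raw
    # Too long: send a short digest instead and remember what it stands for.
    token = hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]
    registry[token] = value
    return f"{prefix}:t:{token}"  # "t" marks a token to look up


def resolve_callback_data(data, registry):
    """Recover the original value from either form of callback data."""
    _prefix, kind, payload = data.split(":", 2)
    return payload if kind == "v" else registry[payload]
```

The limit is measured in bytes rather than characters, which is why the sketch checks the UTF-8 encoded length: a branch name with non-ASCII characters can exceed 64 bytes well before it reaches 64 characters.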
@daocha daocha merged commit ebdbd87 into develop Mar 31, 2026
1 check passed
@daocha daocha deleted the feature-stt branch March 31, 2026 14:27
daocha added a commit that referenced this pull request Apr 2, 2026
Fix one-liner install script by @daocha in #32

Fix live agent output format and update README by @daocha in #33

Add git signature support and Fix live output message by @daocha in #34

Add Speech-To-Text Feature by @daocha in #36

Speech-To-Text
Add local Whisper speech-to-text support for Telegram voice/audio messages, including startup prerequisite checks, shared STT installer flow, env configuration, and transcript dispatch into the normal message pipeline.

Add dependency installation script and environment detection

Add OpenAI-Whisper and required dependencies in the setup script and server startup script

Server startup now checks the dependencies when STT config is enabled
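
A startup check of this kind could look roughly like the sketch below. It is not the code from this PR. The function name `missing_stt_prerequisites` and its parameters are hypothetical, and it assumes the two prerequisites are the `whisper` module (installed by the openai-whisper package) and the `ffmpeg` binary that Whisper uses to decode audio.

```python
import importlib.util
import shutil


def missing_stt_prerequisites(
    stt_enabled,
    find_module=importlib.util.find_spec,
    find_binary=shutil.which,
):
    """Return the names of missing STT prerequisites, or [] when STT is disabled."""
    if not stt_enabled:
        return []
    missing = []
    # The openai-whisper package installs an importable module named "whisper".
    if find_module("whisper") is None:
        missing.append("openai-whisper")
    # Whisper relies on the ffmpeg command-line tool to decode audio files.
    if find_binary("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing
```

The lookups are passed in as parameters so the check can be exercised in tests without installing or uninstalling anything. A server would call it once at startup and refuse to start, or point to the installer, if the returned list is non-empty.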

Document the Whisper/STT flow, update localized user-facing strings, and add regression tests for speech-to-text, queue ordering, reply behavior, and installer prerequisite checks.

Add test cases:

Runtime issue fix: harden queue, reply threading, and startup consistency:
Fix pending-action and queue drain ordering, busy/queue race handling, reply threading for working/final output, and ensure install.sh launches with the same Python interpreter used for installation.

Fix queue messages during pending session setup

Queue text and voice transcripts while session prerequisites are unresolved, then drain the queue after session creation completes. Add regression coverage for pending new-session text/voice cases and clean up localized README diff wording.
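
The queue-then-drain behavior described above can be illustrated with a small single-threaded sketch. This is an assumption about the general shape, not the PR's implementation; the class `PendingSessionQueue` and its methods are hypothetical names.

```python
from collections import deque


class PendingSessionQueue:
    """Hold incoming messages until the session is ready, then replay them in order."""

    def __init__(self, dispatch):
        self._dispatch = dispatch  # callable that handles a single message
        self._pending = deque()
        self._ready = False

    def submit(self, message):
        """Dispatch immediately if the session exists, otherwise queue the message."""
        if self._ready:
            self._dispatch(message)
        else:
            self._pending.append(message)

    def mark_ready(self):
        """Drain queued messages in arrival order, then accept new ones directly."""
        while self._pending:
            self._dispatch(self._pending.popleft())
        self._ready = True
```

Text messages and voice transcripts go through the same `submit` call, so both kinds are replayed in the order they arrived once session creation completes.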

Bug fix:
Fix 1 — Double HTML escaping in bold text

Fix 2 — callback_data 64-byte limit for branch source buttons

Fix 3 — Queue delimiter injection corrupts queued messages
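
To illustrate the class of bug named in Fix 3: if queued messages are stored by joining them with a delimiter, a message that itself contains the delimiter is split into extra entries when the queue is read back. The sketch below contrasts that with a structured encoding. It is not the fix from this PR, which is not shown on this page; the delimiter and all function names are hypothetical.

```python
import json

DELIMITER = "\n---\n"  # hypothetical delimiter, not taken from the PR


def naive_encode(messages):
    """Join messages with a delimiter (vulnerable to delimiter injection)."""
    return DELIMITER.join(messages)


def naive_decode(blob):
    return blob.split(DELIMITER)


def encode_queue(messages):
    """Serialize messages as a JSON array so any message content round-trips."""
    return json.dumps(messages, ensure_ascii=False)


def decode_queue(blob):
    return json.loads(blob)


messages = ["first", "a\n---\nb"]  # second message contains the delimiter
corrupted = naive_decode(naive_encode(messages))  # three entries instead of two
restored = decode_queue(encode_queue(messages))   # identical to the input
```

Escaping the delimiter inside each message would also work; a structured format simply avoids having to reason about escaping at all.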

Update demo image in README.md by @daocha in #37

Fix bugs and add code coverage by @daocha in #38

Correct translation for README th and de by @daocha in #39
2 participants