feat(voice): Claude decides when to send voice notes via MCP tool #7
Merged
Replace the post-process word-count heuristic with a proper MCP tool.
Claude now calls `send_voice_reply(text)` explicitly when the context
warrants it — user asks for audio, conversational short reply, etc.
Changes:
- src/mcp/telegram_server.py: add send_voice_reply(text) tool
- src/claude/sdk_integration.py: update system prompt: explains the
  tool and when to use it; removes the old "bot layer handles TTS" lie
- src/config/settings.py: remove voice_reply_mode + voice_reply_max_words
  settings and validator (word-count gate no longer exists)
- src/bot/orchestrator.py:
  * _make_stream_callback: intercept send_voice_reply MCP tool calls,
    collect requested texts in mcp_voice_requests list
  * agentic_text + agentic_voice: deliver voice from mcp_voice_requests
    instead of applying _should_send_voice post-hoc
  * Remove _VOICE_REQUEST_KEYWORDS, _user_wants_voice, _should_send_voice
  * /voice on|off (auto accepted as alias for on); update /voice
    BotCommand description
  * Remove verbose suppression that was coupled to voice mode
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
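The interception described for `_make_stream_callback` could look roughly like the sketch below. Only `send_voice_reply` and `mcp_voice_requests` come from this PR; the event shape (a dict with `type`, `name`, `input`), the `mcp__telegram__` name prefix used in the usage note, and the helper `plan_delivery` are assumptions for illustration, not the SDK's actual message types or the orchestrator's real code.

```python
from __future__ import annotations

from typing import Any, Callable

# Assumption: the tool name in stream events ends with the tool's own name,
# whatever prefix the MCP server registration adds in front of it.
VOICE_TOOL_NAME = "send_voice_reply"


def make_stream_callback(
    mcp_voice_requests: list[str],
) -> Callable[[dict[str, Any]], None]:
    """Return a callback that records the text of every send_voice_reply call."""

    def on_event(event: dict[str, Any]) -> None:
        # Hypothetical event shape: {"type": "tool_use", "name": ..., "input": {...}}
        if event.get("type") != "tool_use":
            return
        if not str(event.get("name", "")).endswith(VOICE_TOOL_NAME):
            return
        text = (event.get("input") or {}).get("text", "")
        if text.strip():
            mcp_voice_requests.append(text)

    return on_event


def plan_delivery(
    mcp_voice_requests: list[str], reply_text: str
) -> list[tuple[str, str]]:
    """Send voice only for texts the model asked for; otherwise send plain text."""
    if mcp_voice_requests:
        return [("voice", text) for text in mcp_voice_requests]
    return [("text", reply_text)]
```

With this shape, a `tool_use` event named `mcp__telegram__send_voice_reply` appends its `text` to the list, and any other event leaves the list untouched, so a turn without a tool call falls back to text.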
…eeded
- config/mcp.json: committed (no secrets), removed from .gitignore
- settings.py: mcp_config_path defaults to config/mcp.json so
  MCP_CONFIG_PATH is no longer required in .env
- Remove the "mcp_config_path required" validator; the default covers it
- Update test_mcp_config_validation accordingly
To enable MCP (and voice-as-tool): only ENABLE_MCP=true needed in .env.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Simpler approach: point the default directly at the existing example file instead of creating a redundant mcp.json. No new files needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The internal telegram server (send_image_to_user, send_voice_reply) is now wired unconditionally: it is part of the bot, not an opt-in feature. ENABLE_MCP=true is now only needed when adding extra external MCP servers via mcp_config_path. Those are merged on top of the telegram server.
Also: mcp_config_path defaults to config/mcp.example.json so no config variable is needed for the base case.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
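One way to read "merged on top of the telegram server" is sketched below. The function name `build_mcp_servers`, the `"telegram"` key, and the top-level `"mcpServers"` key in the config file are assumptions; the PR does not show the actual file layout. How a name collision is resolved is also an assumption: here an external entry with the same name would overwrite the internal one.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any, Optional


def build_mcp_servers(
    internal_server: dict[str, Any],
    enable_mcp: bool,
    mcp_config_path: Optional[Path],
) -> dict[str, dict[str, Any]]:
    """Always include the internal server; add external ones only if enabled."""
    # The internal telegram server is part of the bot, so it is never optional.
    servers: dict[str, dict[str, Any]] = {"telegram": internal_server}

    if enable_mcp and mcp_config_path is not None and mcp_config_path.is_file():
        config = json.loads(mcp_config_path.read_text(encoding="utf-8"))
        # Assumed layout: {"mcpServers": {"<name>": {...}, ...}}
        for name, spec in config.get("mcpServers", {}).items():
            servers[name] = spec

    return servers
```

With `enable_mcp=False` the result contains only the internal server, which matches the base case where no config variable is set.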
TestingConfig defaults voice/scheduler/project_threads to False, so CI was registering only 10 commands instead of 13. Local .env was leaking those flags and masking the failure. Now fixtures are explicit and environment-independent.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
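The 10-versus-13 discrepancy can be illustrated with a small sketch. The class `FeatureFlags`, the placeholder command names, and the assumption that each flag adds exactly one command are hypothetical; only the three flag names and the two counts come from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureFlags:
    # Defaults mirror the testing defaults described above: everything off.
    voice: bool = False
    scheduler: bool = False
    project_threads: bool = False


# Placeholder names standing in for the ten always-registered commands.
BASE_COMMANDS = tuple(f"base_command_{i}" for i in range(10))


def registered_commands(flags: FeatureFlags) -> list[str]:
    """Return the command list for a given flag set (one command per flag)."""
    commands = list(BASE_COMMANDS)
    if flags.voice:
        commands.append("voice")
    if flags.scheduler:
        commands.append("scheduler_command")  # hypothetical name
    if flags.project_threads:
        commands.append("project_threads_command")  # hypothetical name
    return commands


def make_full_feature_flags() -> FeatureFlags:
    """Fixture-style helper: every flag is passed explicitly, so neither the
    class defaults nor a developer's local .env can change the result."""
    return FeatureFlags(voice=True, scheduler=True, project_threads=True)
```

A test that expects 13 commands should build its settings through something like `make_full_feature_flags()` rather than relying on whatever the environment happens to contain.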
…prevent env-var leakage in CI
Summary
Replaces the post-process word-count heuristic with an MCP tool, `send_voice_reply(text)`, which Claude calls explicitly when the context warrants it (the user asks for audio, a short conversational reply, etc.). If Claude does not call it, the bot always replies in text.

The old design was fundamentally wrong: it ignored what the user said and substituted a `len(words) <= 200` check. A user who said "don't reply with voice" still received audio because the reply was short.

Changes
- src/mcp/telegram_server.py: add the `send_voice_reply(text)` tool
- src/claude/sdk_integration.py: update the system prompt to explain the tool and when to use it
- src/config/settings.py: remove `voice_reply_mode` + `voice_reply_max_words` (they no longer exist)
- src/bot/orchestrator.py: intercept the tool calls and collect them in `mcp_voice_requests`; remove `_should_send_voice`, `_user_wants_voice`, `_VOICE_REQUEST_KEYWORDS`

New behavior
- `/voice on` (default): Claude can use the tool and decides based on context
- `/voice off`: tool disabled, always text
- `/voice auto`: alias for `on` (backwards compat)
- `ENABLE_MCP=true` is required for the tool to be available

Test plan
- `/voice off` → never voice, even if you ask for it
- `/voice on` → Claude decides based on context

🤖 Generated with Claude Code
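To make the summary's argument concrete, the sketch below puts a reconstruction of the removed gate next to the new mode handling. `old_should_send_voice` is rebuilt from the `len(words) <= 200` description and is not the deleted code; `normalize_voice_mode` and `voice_tool_enabled` are hypothetical names, not functions from the repository.

```python
def old_should_send_voice(reply: str, max_words: int = 200) -> bool:
    """Reconstruction of the removed gate: voice whenever the reply is short.

    The user's message is not an input at all, which is why a request
    like "don't reply with voice" could not change the outcome.
    """
    return len(reply.split()) <= max_words


def normalize_voice_mode(arg: str) -> str:
    """Map a /voice argument to "on" or "off"; "auto" is kept as an alias for "on"."""
    mode = arg.strip().lower()
    if mode == "auto":
        return "on"
    if mode in ("on", "off"):
        return mode
    raise ValueError(f"unsupported /voice argument: {arg!r}")


def voice_tool_enabled(arg: str) -> bool:
    """With the tool-based design, the mode only decides whether the tool is offered."""
    return normalize_voice_mode(arg) == "on"
```

Under the old gate, a short text reply to "don't reply with voice" still qualifies for audio, because only the reply's length is checked. Under the new design the mode controls whether the tool is available, and the decision to use it stays with Claude.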