Add Doubao Speech TTS provider #57

Merged
calesthio merged 1 commit into calesthio:main from RayJiang4S:doubao-tts-provider
May 7, 2026

@RayJiang4S

Summary

  • Add a Volcengine Doubao Speech 2.0 TTS provider for Mandarin narration.
  • Document the new-console X-Api-Key setup flow and the seed-tts-2.0 resource ID.
  • Add a provider-specific agent skill covering sampling, timestamp metadata, semantic caption grouping, and troubleshooting.
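
To make the "semantic caption grouping" item concrete, here is a minimal sketch of how timed tokens could be merged into subtitle captions. It is not the code from this PR: the token keys (`text`, `start`, `end`), the `Caption` type, the `group_captions` helper, and the `max_chars` limit are all hypothetical, and the real Doubao timestamp schema may differ.

```python
from dataclasses import dataclass

# Punctuation that closes a caption: Chinese and ASCII clause/sentence marks.
BREAK_CHARS = set("。！？；，,.!?;")


@dataclass
class Caption:
    text: str
    start: float  # seconds
    end: float    # seconds


def group_captions(words, max_chars=16):
    """Group timed tokens into captions.

    A caption is closed when a token ends with punctuation or when the
    accumulated text reaches `max_chars`. `words` is a list of dicts with
    the assumed keys "text", "start", and "end".
    """
    captions = []
    buf = []
    for w in words:
        buf.append(w)
        text = "".join(x["text"] for x in buf)
        if w["text"][-1:] in BREAK_CHARS or len(text) >= max_chars:
            captions.append(Caption(text, buf[0]["start"], buf[-1]["end"]))
            buf = []
    if buf:
        # Flush trailing tokens that never hit a break condition.
        text = "".join(x["text"] for x in buf)
        captions.append(Caption(text, buf[0]["start"], buf[-1]["end"]))
    return captions
```

Breaking on punctuation first and falling back to a length cap keeps each caption aligned with a spoken clause where possible, while still bounding line width for unpunctuated runs.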

Why

Doubao Speech provides natural Mandarin narration and async timestamp metadata that is useful for Chinese-language explainer videos and subtitle alignment.

Closes #56.

Test plan

  • python -m py_compile tools/audio/doubao_tts.py
  • Verified doubao_tts is discovered by the tool registry.
  • Verified tts_selector lists doubao_tts as a provider.
  • Ran git diff --check.
  • Secret scan checked that the PR diff does not include real API keys or signed audio URLs.
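
Two of the checks above can be approximated in a few lines of Python. This is only a sketch, not the tooling used for this PR: `compiles` wraps the standard-library `py_compile` module, while `find_secrets` and its regex patterns are hypothetical stand-ins for whatever secret scanner was actually run.

```python
import py_compile
import re

# Hypothetical patterns; a real scan would use the project's own rules.
SECRET_PATTERNS = [
    # An X-Api-Key header or assignment followed by a long token.
    re.compile(r"x-api-key\s*[:=]\s*[A-Za-z0-9_-]{16,}", re.I),
    # A URL carrying a signature-style query parameter.
    re.compile(r"https?://\S+[?&](?:signature|x-amz-signature|sig)=\S+", re.I),
]


def compiles(path):
    """Return True if the file byte-compiles, like `python -m py_compile`."""
    try:
        py_compile.compile(str(path), doraise=True)
        return True
    except py_compile.PyCompileError:
        return False


def find_secrets(diff_text):
    """Return added diff lines that match any of the assumed secret patterns."""
    hits = []
    for line in diff_text.splitlines():
        # Only inspect added lines, skipping the "+++ b/file" header.
        if line.startswith("+") and not line.startswith("+++"):
            if any(p.search(line) for p in SECRET_PATTERNS):
                hits.append(line)
    return hits
```

For example, `compiles("tools/audio/doubao_tts.py")` mirrors the first check, and `find_secrets` can be fed the text produced by `git diff`.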

Manual local validation was also performed with a real Doubao Speech API key: generation through tts_selector produced MP3 audio and timestamp JSON successfully.

Add Volcengine Doubao Speech 2.0 as a TTS provider for Mandarin narration with async timestamp metadata, provider documentation, and setup guidance.
@RayJiang4S requested a review from calesthio as a code owner on May 4, 2026, 13:41
Owner

@calesthio calesthio left a comment

LGTM

@calesthio merged commit 9066dcb into calesthio:main on May 7, 2026
PMartinsj pushed a commit to PMartinsj/OpenMontage that referenced this pull request May 16, 2026

Development

Successfully merging this pull request may close these issues.

Proposal: add Volcengine Doubao Speech 2.0 TTS provider for Mandarin narration

2 participants