forked from wenet-e2e/wetts


WeTTS

Production First and Production Ready End-to-End Text-to-Speech Toolkit

Install

We suggest installing WeTTS with Anaconda or Miniconda. Clone this repo:

git clone https://github.com/wenet-e2e/wetts.git

Create environment:

conda create -n wetts python=3.8 -y

Install MFA:

conda install -n wetts montreal-forced-aligner=2.0.1 -c conda-forge -y

For CUDA 10.2, run:

conda install -n wetts pytorch=1.11 torchaudio cudatoolkit=10.2 -c pytorch -y

For CUDA 11.3, run:

conda install -n wetts pytorch=1.11 torchaudio cudatoolkit=11.3 -c pytorch -y

Install the other dependencies:

conda activate wetts
python -m pip install -r requirements.txt
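
After installation, a quick check that the key packages are importable in the activated environment can save debugging time later. The script below is a minimal sketch and not part of WeTTS. The helper name `check_environment` is hypothetical, and it only assumes that the conda commands above install the `torch` and `torchaudio` packages:

```python
import importlib.util
import sys


def check_environment(packages=("torch", "torchaudio")):
    """Report which of the given packages are importable in this interpreter.

    Hypothetical helper for illustration; not a WeTTS API.
    """
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    result = check_environment()
    for key, value in result.items():
        print(f"{key}: {value}")
    if result["torch"]:
        import torch

        # Shows the installed PyTorch version and whether a CUDA device is usable.
        print("torch version:", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

Run it with `python` inside the `wetts` environment. If `torch` is reported as `False`, the PyTorch install step most likely did not target this environment.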

Roadmap

We mainly focus on production and on-device TTS, and we plan to use:

  • Acoustic model (AM): FastSpeech2
  • Vocoder: HiFi-GAN/MelGAN

We also plan to provide reference solutions for:

  • Prosody
  • Polyphones
  • Text Normalization

Dataset

We plan to support a variety of open source TTS datasets, including but not limited to:

  • BZNSYP, a Chinese Standard Mandarin speech corpus open sourced by Data Baker.
  • AISHELL-3, a large-scale and high-fidelity multi-speaker Mandarin speech corpus.
  • Opencpop, a Mandarin singing voice synthesis (SVS) corpus open sourced by Netease Fuxi.

Runtime

We plan to support a variety of hardware and platforms, including:

  • x86
  • Android
  • Raspberry Pi
  • Other on-device platforms

Acknowledgement

  1. We borrow some code from FastSpeech2 for the FastSpeech2 implementation.
  2. We refer to PaddleSpeech for feature extraction, pinyin lexicon preparation for alignment, and the length regulator in FastSpeech2.
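
For readers unfamiliar with the length regulator mentioned above: it expands a phoneme-level sequence to frame level by repeating each phoneme's hidden state according to its duration in frames. The following is a minimal pure-Python sketch of that idea only. It is not the WeTTS or PaddleSpeech code; the function name `length_regulate` is hypothetical, and plain lists stand in for the tensors a real model would use:

```python
def length_regulate(phoneme_states, durations):
    """Repeat each phoneme-level state `d` times, where `d` is its duration.

    Illustrative sketch only: `phoneme_states` is any list of per-phoneme
    items, and `durations` holds non-negative integer frame counts.
    """
    if len(phoneme_states) != len(durations):
        raise ValueError("phoneme_states and durations must have equal length")
    frames = []
    for state, duration in zip(phoneme_states, durations):
        if duration < 0:
            raise ValueError("durations must be non-negative")
        # A duration of 0 drops the phoneme from the frame-level sequence.
        frames.extend([state] * duration)
    return frames


if __name__ == "__main__":
    # Three phonemes lasting 2, 0 and 3 frames yield 5 frames in total.
    print(length_regulate(["a", "b", "c"], [2, 0, 3]))
```

The output length equals the sum of the durations, which is how a phoneme sequence is aligned with the longer mel-spectrogram frame sequence.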
