This implementation uses NVIDIA's tacotron2 code, with a modified training mode and data loader.
- Download the LJSpeech dataset and unzip it into `data`
- Run `python3 preprocess.py`
- Run `python3 train.py`
- Download the WaveGlow pretrained model into `waveglow/pretrained_model`
- Run `python3 eval.py --step (checkpoint step)`
- Samples here (step: 30000; batch size: 128; vocoder: waveglow)
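
Before running the steps above, it can help to confirm that the two download locations are populated. The snippet below is a minimal sketch and not part of this repository: the helper name `missing_paths` is hypothetical, and treating an empty directory as missing is an assumption. Only the directory names `data` and `waveglow/pretrained_model` come from the steps above.

```python
from pathlib import Path


def missing_paths(root="."):
    """Return the expected directories that are absent or empty under root.

    Hypothetical helper: it only checks the two locations named in the
    steps above and does not inspect what the scripts actually read.
    """
    root = Path(root)
    expected = [root / "data", root / "waveglow" / "pretrained_model"]
    missing = []
    for path in expected:
        # Assumption: an empty directory means the download step was skipped.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(path.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    for name in missing_paths():
        print(f"missing or empty: {name}")
```

Run from the repository root, the script prints one line per location that still needs a download and prints nothing when both are in place.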