Commit 3c465c3

🦋 Update README
1 parent 85f5f2f commit 3c465c3

1 file changed

+21
-26
lines changed

README.md

Lines changed: 21 additions & 26 deletions
@@ -19,12 +19,15 @@
 :zany_face: TensorFlowTTS provides real-time, state-of-the-art speech synthesis architectures such as Tacotron-2, MelGAN, Multiband-MelGAN, FastSpeech, and FastSpeech2, based on TensorFlow 2. With TensorFlow 2, we can speed up training and inference, optimize models further using [quantization-aware training](https://www.tensorflow.org/model_optimization/guide/quantization/training_comprehensive_guide) and [pruning](https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras), and make TTS models run faster than real time and deployable on mobile devices or embedded systems.
 
 ## What's new
-- 2020/12/02 **(NEW!)** Support German TTS with [Thorsten dataset](https://github.com/thorstenMueller/deep-learning-german-tts). See the [Colab](https://colab.research.google.com/drive/1W0nSFpsz32M0OcIkY9uMOiGrLTPKVhTy?usp=sharing). Thanks [thorstenMueller](https://github.com/thorstenMueller) and [monatis](https://github.com/monatis).
-- 2020/11/24 **(NEW!)** Add HiFi-GAN vocoder. See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/hifigan)
-- 2020/11/19 **(NEW!)** Add Multi-GPU gradient accumulator. See [here](https://github.com/TensorSpeech/TensorFlowTTS/pull/377)
-- 2020/08/23 Add Parallel WaveGAN tensorflow implementation. See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/parallel_wavegan)
+- 2021/06/01 (**NEW!**) Integrated with [Huggingface Hub](https://huggingface.co/tensorspeech). See the [PR](https://github.com/TensorSpeech/TensorFlowTTS/pull/555). Thanks [patrickvonplaten](https://github.com/patrickvonplaten) and [osanseviero](https://github.com/osanseviero)
+- 2021/03/18 (**NEW!**) Support iOS for FastSpeech2 and MB MelGAN. Thanks [kewlbear](https://github.com/kewlbear). See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/ios)
+- 2021/01/18 (**NEW!**) Support TFLite C++ inference. Thanks [luan78zaoha](https://github.com/luan78zaoha). See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/cpptflite)
+- 2020/12/02 Support German TTS with [Thorsten dataset](https://github.com/thorstenMueller/deep-learning-german-tts). See the [Colab](https://colab.research.google.com/drive/1W0nSFpsz32M0OcIkY9uMOiGrLTPKVhTy?usp=sharing). Thanks [thorstenMueller](https://github.com/thorstenMueller) and [monatis](https://github.com/monatis)
+- 2020/11/24 Add HiFi-GAN vocoder. See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/hifigan)
+- 2020/11/19 Add Multi-GPU gradient accumulator. See [here](https://github.com/TensorSpeech/TensorFlowTTS/pull/377)
+- 2020/08/23 Add Parallel WaveGAN TensorFlow implementation. See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/parallel_wavegan)
 - 2020/08/23 Add MBMelGAN G + ParallelWaveGAN G example. See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/multiband_pwgan)
-- 2020/08/20 Add C++ inference code. Thanks [@ZDisket](https://github.com/ZDisket). See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/cppwin)
+- 2020/08/20 Add C++ inference code. Thanks [@ZDisket](https://github.com/ZDisket). See [here](https://github.com/TensorSpeech/TensorFlowTTS/tree/master/examples/cppwin)
 - 2020/08/18 Update [new base processor](https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/processor/base_processor.py). Add [AutoProcessor](https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/inference/auto_processor.py) and [pretrained processor](https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/processor/pretrained/) json file
 - 2020/08/14 Support Chinese TTS. Please see the [colab](https://colab.research.google.com/drive/1YpSHRBRPBI7cnTkQn1UcVTWEQVbsUm1S?usp=sharing). Thanks [@azraelkuan](https://github.com/azraelkuan)
 - 2020/08/05 Support Korean TTS. Please see the [colab](https://colab.research.google.com/drive/1ybWwOS5tipgPFttNulp77P6DAB5MtiuN?usp=sharing). Thanks [@crux153](https://github.com/crux153)
@@ -261,42 +264,34 @@ import yaml
 
 import tensorflow as tf
 
-from tensorflow_tts.inference import AutoConfig
 from tensorflow_tts.inference import TFAutoModel
 from tensorflow_tts.inference import AutoProcessor
 
-# initialize fastspeech model.
-fs_config = AutoConfig.from_pretrained('./examples/fastspeech/conf/fastspeech.v1.yaml')
-fastspeech = TFAutoModel.from_pretrained(
-    config=fs_config,
-    pretrained_path="./examples/fastspeech/pretrained/model-195000.h5"
-)
+# initialize fastspeech2 model.
+fastspeech2 = TFAutoModel.from_pretrained("tensorspeech/tts-fastspeech2-ljspeech-en")
 
 
-# initialize melgan model
-melgan_config = AutoConfig.from_pretrained('./examples/melgan/conf/melgan.v1.yaml')
-melgan = TFAutoModel.from_pretrained(
-    config=melgan_config,
-    pretrained_path="./examples/melgan/checkpoint/generator-1500000.h5"
-)
+# initialize mb_melgan model
+mb_melgan = TFAutoModel.from_pretrained("tensorspeech/tts-mb_melgan-ljspeech-en")
 
 
 # inference
-processor = AutoProcessor.from_pretrained(pretrained_path="./test/files/ljspeech_mapper.json")
+processor = AutoProcessor.from_pretrained("tensorspeech/tts-fastspeech2-ljspeech-en")
 
 ids = processor.text_to_sequence("Recent research at Harvard has shown meditating for as little as 8 weeks, can actually increase the grey matter in the parts of the brain responsible for emotional regulation, and learning.")
-ids = tf.expand_dims(ids, 0)
 # fastspeech inference
 
-masked_mel_before, masked_mel_after, duration_outputs = fastspeech.inference(
-    ids,
-    speaker_ids=tf.zeros(shape=[tf.shape(ids)[0]], dtype=tf.int32),
-    speed_ratios=tf.constant([1.0], dtype=tf.float32)
+mel_before, mel_after, duration_outputs, _, _ = fastspeech2.inference(
+    input_ids=tf.expand_dims(tf.convert_to_tensor(ids, dtype=tf.int32), 0),
+    speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
+    speed_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
+    f0_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
+    energy_ratios=tf.convert_to_tensor([1.0], dtype=tf.float32),
 )
 
 # melgan inference
-audio_before = melgan.inference(masked_mel_before)[0, :, 0]
-audio_after = melgan.inference(masked_mel_after)[0, :, 0]
+audio_before = mb_melgan.inference(mel_before)[0, :, 0]
+audio_after = mb_melgan.inference(mel_after)[0, :, 0]
 
 # save to file
 sf.write('./audio_before.wav', audio_before, 22050, "PCM_16")
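
The last line of the README snippet saves the vocoder output as 16-bit PCM at 22050 Hz. As a rough illustration of what that step involves, here is a minimal standard-library sketch. It is not how `soundfile` implements `"PCM_16"` (its exact scaling and clipping may differ). The helper names `float_to_pcm16` and `write_wav` are hypothetical, and a synthetic sine tone stands in for the model's audio, which is assumed here to be mono floats in [-1.0, 1.0].

```python
import math
import os
import struct
import tempfile
import wave


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to little-endian 16-bit PCM bytes.

    Values outside the range are clipped before scaling (an assumption of
    this sketch, not a statement about soundfile's behaviour).
    """
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)


def write_wav(path, samples, sample_rate=22050):
    """Write a mono 16-bit WAV file and return its duration in seconds."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono, like the [0, :, 0] slice above
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)  # 22050 Hz matches the README call
        wav.writeframes(float_to_pcm16(samples))
    return len(samples) / sample_rate


# Stand-in for a vocoder output: one second of a 440 Hz sine wave.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 22050) for n in range(22050)]
path = os.path.join(tempfile.gettempdir(), "tone.wav")
duration = write_wav(path, tone)
```

With real model output, `tone` would be replaced by the waveform converted to a Python list or iterated from a NumPy array. The returned duration is also what a real-time-factor measurement would divide the synthesis time by.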
