TensorflowTTS: State-of-the-Art Real-Time Speech Synthesis Implemented in TensorFlow 2

Latest recommended article published on 2024-05-31 10:08:17

m0_48452814

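TensorflowTTS is a real-time speech synthesis framework built on TensorFlow 2 that supports several state-of-the-art models, including Tacotron-2 and MelGAN. The library offers a fast, scalable, and reliable speech synthesis solution suited to deployment on mobile devices and embedded systems. It is simple to install, works with multiple model architectures, and provides complete tutorials covering data preprocessing, model training, and inference.

Summary generated automatically by CSDN.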

TensorflowTTS provides real-time, state-of-the-art speech synthesis architectures such as Tacotron-2, MelGAN, Multi-band MelGAN, and FastSpeech, all based on TensorFlow 2. With TensorFlow 2, we can speed up training and inference, optimize models further with quantization-aware training (fake quantization) and pruning, and make TTS models run faster than real time so that they can be deployed on mobile devices or embedded systems.
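
"Faster than real time" is commonly quantified with the real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced, so an RTF below 1 means synthesis outpaces playback. The sketch below only illustrates that measurement. The names `real_time_factor`, `measure_rtf`, and `dummy_synthesize` are hypothetical and not part of TensorflowTTS, and the 22050 Hz sample rate is an assumed value.

```python
import time
from typing import Callable, Sequence

SAMPLE_RATE = 22050  # assumed sample rate in Hz, for illustration only


def real_time_factor(synthesis_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return synthesis time divided by the duration of the generated audio."""
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[str], Sequence[float]], text: str, sample_rate: int) -> float:
    """Time one call to `synthesize` and convert the result to an RTF."""
    start = time.perf_counter()
    waveform = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(waveform), sample_rate)


def dummy_synthesize(text: str) -> Sequence[float]:
    """Hypothetical stand-in for a TTS model: returns two seconds of silence."""
    return [0.0] * (SAMPLE_RATE * 2)


if __name__ == "__main__":
    rtf = measure_rtf(dummy_synthesize, "hello world", SAMPLE_RATE)
    print(f"RTF = {rtf:.6f} (below 1.0 means faster than real time)")
```

To measure a real model, replace `dummy_synthesize` with a function that runs text through the acoustic model and vocoder and returns the waveform samples.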

What’s new
2020/06/07 (New!) Multi-band MelGAN (MB MelGAN) implementation with Tensorflow is supported.
Features
High performance on Speech Synthesis.
Can be fine-tuned on other languages.
Fast, Scalable and Reliable.
Suitable for deployment.
Easy to implement new models based on abstract classes.
Mixed precision to speed up training where possible.
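
Mixed precision runs most computation in 16-bit floats, which is faster on supporting hardware but can make very small gradient values underflow to zero. Loss scaling is the usual remedy. The sketch below is a plain-Python illustration of that effect using the standard library's half-precision packing. It is not taken from this repository, and the scale factor `LOSS_SCALE` and the gradient value are assumptions chosen only for the demonstration.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


LOSS_SCALE = 2.0 ** 15  # assumed scale factor, for illustration only
tiny_gradient = 1e-8    # assumed gradient magnitude, too small for float16

# Stored directly in float16, the gradient underflows to zero.
unscaled = to_float16(tiny_gradient)

# Scaling before the float16 round trip keeps the value representable;
# dividing afterwards in full precision recovers the original magnitude.
scaled = to_float16(tiny_gradient * LOSS_SCALE)
recovered = scaled / LOSS_SCALE

print("float16 without scaling:", unscaled)
print("float16 with scaling, then unscaled:", recovered)
```
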
Requirements
This repository is tested on Ubuntu 18.04 with:

Python 3.6+
CUDA 10.1
cuDNN 7.6.5
TensorFlow 2.2
TensorFlow Addons 0.9.1
Other TensorFlow versions should work but have not been tested yet. This repo aims to stay compatible with the latest stable TensorFlow release.
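
Since only the versions listed above were tested, it can help to compare an environment against them before training. The following is a minimal sketch, not a script shipped with the repository. The helper names are hypothetical, the distribution names `tensorflow` and `tensorflow-addons` are the PyPI package names, and `importlib.metadata` needs Python 3.8 or newer (on older interpreters the lookup simply reports the package as not found).

```python
import re
from typing import Dict, Optional, Tuple

# Versions the requirements list reports as tested.
TESTED = {"tensorflow": "2.2", "tensorflow-addons": "0.9.1"}


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract up to three leading numeric components, e.g. '2.2.0rc1' -> (2, 2, 0)."""
    public = version.split("+")[0]  # drop any local version suffix
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])


def matches_tested(installed: str, tested: str) -> bool:
    """True when `installed` starts with the same components as `tested`."""
    want = parse_version(tested)
    return parse_version(installed)[: len(want)] == want


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if unavailable."""
    try:
        from importlib import metadata  # Python 3.8+
    except ImportError:
        return None
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def report() -> Dict[str, str]:
    """Summarize how each tested package compares with the environment."""
    result = {}
    for dist, tested in TESTED.items():
        found = installed_version(dist)
        if found is None:
            result[dist] = "not installed"
        elif matches_tested(found, tested):
            result[dist] = f"matches tested {tested}"
        else:
            result[dist] = f"untested (found {found})"
    return result


if __name__ == "__main__":
    for dist, status in report().items():
        print(f"{dist}: {status}")
```

An "untested" result is not an error; it only signals that the combination falls outside what the requirements list covers.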

Installation
$ git clone https://github.com/dathudeptrai/TensorflowTTS.git
$ cd TensorflowTTS
$ pip install .
If you want to upgrade the repository and its dependencies:

$ git pull
$ pip install --upgrade .
Supported Model Architectures
TensorflowTTS currently provides the following architectures:

MelGAN released with the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis by Kundan Kumar, Rithesh Kumar, Thibault de Boissiere, Lucas Gestin, Wei Zhen Teoh, Jose Sotelo, Alexandre de Brebisson, Yoshua Bengio, Aaron Courville.
Tacotron-2 released with the paper Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions by Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu.
FastSpeech released with the paper FastSpeech: Fast, Robust and Controllable Text to Speech by Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu.
Multi-band MelGAN released with the paper Multi-band MelGAN:
