WaveRNN Open-Source Project Tutorial

Article link: https://blog.csdn.net/gitblog_00620/article/details/141083532

WaveRNN (WaveRNN Vocoder + TTS) project address: https://gitcode.com/gh_mirrors/wa/WaveRNN

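Project Introduction

WaveRNN is an open-source PyTorch project that implements DeepMind's WaveRNN model from the paper "Efficient Neural Audio Synthesis". WaveRNN is an efficient neural audio synthesis model that generates high-quality audio samples; in this project it serves as the vocoder alongside a Tacotron text-to-speech model. The project ships two pretrained models: WaveRNN (mixture of logistics output), trained for 800k steps, and Tacotron, trained for 180k steps.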

Quick Start

Environment Setup

Clone the project repository:

git clone https://github.com/fatchord/WaveRNN.git
cd WaveRNN

Install the dependencies:
```
pip install -r requirements.txt
```
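
Once the install finishes, a quick import check can catch a missing package before the model code fails with a less obvious error. The sketch below is only an illustration: the package list is an assumption and is not taken from requirements.txt, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; edit it to mirror requirements.txt in your checkout.
    required = ["torch", "numpy", "librosa"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

If anything is reported missing, install it with pip and rerun the check.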

Quick Start Example

The following is a simple example showing how to generate audio with the pretrained model:

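from pathlib import Path

import torch
from utils import hparams as hp
from models.fatchord_version import WaveRNN
from gen_wavernn import gen_from_file

# Load the hyperparameters. The hp.voc_* names and the fatchord_version module
# used below follow the upstream repository; verify them against your checkout.
hp.configure('hparams.py')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Build the vocoder
model = WaveRNN(rnn_dims=hp.voc_rnn_dims, fc_dims=hp.voc_fc_dims, bits=hp.bits, pad=hp.voc_pad,
                upsample_factors=hp.voc_upsample_factors, feat_dims=hp.num_mels,
                compute_dims=hp.voc_compute_dims, res_out_dims=hp.voc_res_out_dims,
                res_blocks=hp.voc_res_blocks, hop_length=hp.hop_length,
                sample_rate=hp.sample_rate, mode=hp.voc_mode).to(device)

# Load the pretrained weights. Upstream ships them as zip archives under
# pretrained/, so unpack the vocoder archive first and adjust this path.
model.load('pretrained/latest_weights.pyt')

# Generate audio. The vocoder consumes audio features rather than text, so
# 'input.wav' is a placeholder input; for text input see quick_start.py.
out_dir = Path('output')
out_dir.mkdir(exist_ok=True)
gen_from_file(model, Path('input.wav'), out_dir,
              hp.voc_gen_batched, hp.voc_target, hp.voc_overlap)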
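
The batched, target, and overlap settings passed to the generation call above relate to batched generation. As far as the upstream code suggests, a long utterance is cut into segments of roughly `target` samples that are generated in parallel, and neighbouring segments share `overlap` samples that are crossfaded when the pieces are stitched back together. The following is a simplified pure-Python sketch of that idea under stated assumptions, not the repository's implementation: the function names `split_with_overlap` and `stitch_with_crossfade` are hypothetical, it works on plain lists rather than tensors, and the linear fade is chosen for simplicity (the project may use a different fade curve).

```python
import math


def split_with_overlap(samples, target, overlap):
    """Cut a 1-D sequence into segments of length target + 2 * overlap.

    Neighbouring segments share `overlap` samples. The input is zero-padded
    at the end so that the last segment has full length.
    """
    seg_len = target + 2 * overlap
    stride = target + overlap
    num_segments = max(1, math.ceil((len(samples) - overlap) / stride))
    padded_len = num_segments * stride + overlap
    padded = list(samples) + [0.0] * (padded_len - len(samples))
    return [padded[i * stride:i * stride + seg_len] for i in range(num_segments)]


def stitch_with_crossfade(segments, target, overlap):
    """Recombine segments, linearly crossfading each shared overlap region."""
    seg_len = target + 2 * overlap
    stride = target + overlap
    out = [0.0] * (len(segments) * stride + overlap)
    last = len(segments) - 1
    for i, segment in enumerate(segments):
        for j, value in enumerate(segment):
            weight = 1.0
            if i > 0 and j < overlap:
                # Fade in over the samples shared with the previous segment.
                weight = (j + 1) / (overlap + 1)
            elif i < last and j >= seg_len - overlap:
                # Fade out over the samples shared with the next segment.
                weight = (seg_len - j) / (overlap + 1)
            out[i * stride + j] += weight * value
    return out


if __name__ == "__main__":
    signal = [float(n) for n in range(10)]
    segments = split_with_overlap(signal, target=3, overlap=1)
    restored = stitch_with_crossfade(segments, target=3, overlap=1)
    print(len(segments), [len(s) for s in segments])
    print(restored[:len(signal)])
```

In this sketch the fade-in and fade-out weights sum to one in every shared region, so splitting and stitching an unmodified signal returns the original samples. A larger `overlap` gives smoother seams at the cost of generating more redundant samples.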