PaddleSpeech & MFA: Cloning Amiya's Chinese Voice

Article link: https://blog.csdn.net/class4715/article/details/139522865

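This project reproduces iterhui's project "[PaddleSpeech 原神] 音色克隆之胡桃" (Genshin Impact voice cloning: Hu Tao). It uses a newer version of PaddleSpeech, updates many configuration settings to match that version, and adds my own modules for aligning speech, configuring the pipeline, and training any other voice.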

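The goal is to use the PaddleSpeech framework for voice cloning: reproducing and generating the Chinese voice of the operator Amiya (阿米娅) from the game Arknights (《明日方舟》).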

1. Set up the PaddleSpeech development environment

Install PaddleSpeech, configure the tools under PaddleSpeech/examples/other/tts_finetune/tts3, and download the pretrained models.

In [ ]

# Set up the PaddleSpeech development environment
# Change to the home directory first so the repo is cloned where the later cells expect it
%cd /home/aistudio/
!git clone https://gitee.com/paddlepaddle/PaddleSpeech.git
%cd PaddleSpeech
!pip install . --user -i https://mirror.baidu.com/pypi/simple
# Download NLTK data (left commented out)
# %cd /home/aistudio
# !wget -P data https://paddlespeech.bj.bcebos.com/Parakeet/tools/nltk_data.tar.gz
# !tar zxvf data/nltk_data.tar.gz

In [ ]

# Check that paddlespeech installed correctly; if it is not listed, rerun the previous cell.
!pip show paddlespeech

In [ ]

# Install the required libraries
!pip install prettytable
!pip install soundfile
!pip install librosa
!pip install paddleaudio==1.0.1
!pip install h5py
!pip install loguru
!pip install python_speech_features
!pip install jsonlines
!pip install kaldiio
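
After the install cells have run, one way to confirm that nothing was skipped is to ask Python which of the packages it can actually import. The following is a minimal sketch, not part of the original notebook; the helper name `missing_modules` is hypothetical, and it assumes each package's import name matches its pip name, which appears to hold for the packages listed above.

```python
from importlib.util import find_spec

# Packages installed in the cells above (import name assumed equal to pip name).
REQUIRED = [
    "paddlespeech", "prettytable", "soundfile", "librosa", "paddleaudio",
    "h5py", "loguru", "python_speech_features", "jsonlines", "kaldiio",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found on this interpreter."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(REQUIRED)` in a notebook cell returns an empty list when everything is importable; any names it returns are candidates for reinstalling. Note that this only checks that a module can be located, not that its version matches the pinned one.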

In [7]

# Delete broken symlinks
# AI Studio reports an error because the PaddleSpeech repo contains broken symlinks
# Run the command below!
!find -L /home/aistudio -type l -delete
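
Because `-delete` is irreversible, it can be worth listing the broken links before removing them. The sketch below is an illustrative Python equivalent, not the author's method; `find_broken_symlinks` is a hypothetical helper name. Unlike `find -L`, it does not descend into directories reached through symlinks.

```python
import os
from pathlib import Path


def find_broken_symlinks(root):
    """Return symlinks under root whose targets no longer exist."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories appear in dirnames; broken ones appear in filenames.
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            # exists() follows the link, so it is False when the target is gone.
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)
```

Running `find_broken_symlinks("/home/aistudio")` before the `find` command shows what is about to be deleted; running it again afterwards should return an empty list.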

In [ ]

# Configure MFA and download the pretrained models
%cd /home/aistudio
!bash env.sh

In [ ]

# Configure MFA and download the alignment model and lexicon
!mkdir -p tools
%cd tools
# mfa tool
!wget https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.1/montreal-forced-aligner_linux.tar.gz
!tar xvf montreal-forced-aligner_linux.tar.gz
!cp montreal-forced-aligner/lib/libpython3.6m.so.1.0 montreal-forced-aligner/lib/libpython3.6m.so
# pretrained MFA model (prebuilt alignment model and lexicon)
!mkdir -p aligner
%cd aligner
!wget https://paddlespeech.bj.bcebos.com/MFA/ernie_sat/aishell3_model.zip
!unzip aishell3_model.zip
!wget https://paddlespeech.bj.bcebos.com/MFA/AISHELL-3/with_tone/simple.lexicon
%cd ../../
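
A failed `wget` is easy to miss in a long cell output, so a quick existence check on the files this cell is supposed to produce can save time later. This is a small sketch under the assumption that the cell ran from /home/aistudio, so the tools directory is /home/aistudio/tools; `missing_paths` is a hypothetical helper, and the expected paths are taken from the commands above.

```python
from pathlib import Path

# Files the download cell should leave behind, relative to the tools directory.
EXPECTED = [
    "montreal-forced-aligner/lib/libpython3.6m.so",
    "aligner/aishell3_model.zip",
    "aligner/simple.lexicon",
]


def missing_paths(root, relative_paths):
    """Return the relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in relative_paths if not (root / rel).exists()]
```

For example, `missing_paths("/home/aistudio/tools", EXPECTED)` returns an empty list when all three files are in place.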

In [ ]

# Copy the MFA G2P model archive (used to rebuild the lexicon) to the target directory
!cp /home/aistudio/data/data260888/mandarin_pinyin_g2p.zip /home/aistudio/tools/montreal-forced-aligner/pretrained_models/mandarin_pinyin_g2p.zip

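2. Dataset configuration

The dataset for this project includes the complete wav files, labels, and MFA alignment annotation files.

If you want to run the alignment yourself, see the PaddleSpeech documentation or the example later in this article.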

Finetune your own AM based on FastSpeech2 with multi-speakers dataset.

The extracted archive contains:

Audio

work/dataset/阿米娅/wav/xx.wav

Labels

work/dataset/阿米娅/wav/labels.txt

Aligned TextGrid files

work/dataset/阿米娅/textgrid/newdir/xx.TextGrid

This project uses Amiya's voice.

2.1 Extract the existing Amiya voice dataset

In [ ]

%cd /home/aistudio/
!unzip /home/aistudio/data/data260882/dataset.zip -d work/
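
Once the archive is extracted, the layout described above can be sanity-checked by confirming that every wav file has a matching TextGrid. The following is a minimal sketch, assuming files are paired by base name (xx.wav with xx.TextGrid), as the paths above suggest; `wavs_without_textgrid` is a hypothetical helper and not part of PaddleSpeech.

```python
from pathlib import Path


def wavs_without_textgrid(wav_dir, textgrid_dir):
    """Return base names of wav files that have no TextGrid with the same name."""
    wav_stems = {p.stem for p in Path(wav_dir).glob("*.wav")}
    textgrid_stems = {p.stem for p in Path(textgrid_dir).glob("*.TextGrid")}
    return sorted(wav_stems - textgrid_stems)
```

With the dataset in this project, the call would be `wavs_without_textgrid("work/dataset/阿米娅/wav", "work/dataset/阿米娅/textgrid/newdir")`; any names it returns are utterances that still need alignment.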

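2.2 Building a dataset for a new voice

Create the MFA alignment annotation files

What to prepare in advance if you want to clone a voice you have sourced yourself:

Prepare the wav audio files (more than 30 files recommended, each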