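Fine-tuning PaddleSpeech on the demo dataset and the "This dataset has no examples" error
Background
On CentOS 7 (CentOS Linux release 7.6.1810 (Core)), I cloned the PaddleSpeech repository with git, checked out r1.4.1, set up the environment for Chinese fine-tuning, and ran the fine-tuning.
This post records the "This dataset has no examples" error I hit during fine-tuning and how I resolved it.
Installation
Conda environment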
# Create a conda environment named "speech" and activate it
conda create -n speech python=3.10
conda activate speech
# Install the packages
pip install -r requirements.txt -i https://mirror.baidu.com/pypi/simple/ --trusted-host mirror.baidu.com
requirements.txt
numpy==1.23.5
paddlespeech_ctcdecoders
paddlepaddle==2.4.2
pytest-runner
paddlespeech
ipykernel
transformers
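Before moving on, it can help to confirm that the pinned versions really ended up in the environment. The sketch below is not part of PaddleSpeech: `parse_pins` and `check_pins` are hypothetical helper names, and the code only assumes that pinned entries use the `name==version` form shown above.

```python
from importlib import metadata


def parse_pins(lines):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # unpinned entries such as 'paddlespeech' are skipped
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins):
    """Return {package: (wanted, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Calling `check_pins(parse_pins(open("requirements.txt")))` inside the activated environment returns an empty dict when every pinned package matches.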
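Fine-tuning environment
Set up the environment by following the README.md of the fine-tuning example in the cloned project (paddlespeech/examples/other/tts_finetune/tts3/README.md). In detail:
Note: all commands below are run from paddlespeech/examples/other/tts_finetune/tts3/
Download the pretrained models (the ones below are for Chinese synthesis)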
mkdir -p pretrained_models && cd pretrained_models
Pretrained FastSpeech2 model
wget https://paddlespeech.bj.bcebos.com/Parakeet/released_models/fastspeech2/fastspeech2_aishell3_ckpt_1.1.0.zip
unzip fastspeech2_aishell3_ckpt_1.1.0.zip
Pretrained HiFiGAN model
wget https://paddlespeech.bj.bcebos.com/Parakeet/released_models/hifigan/hifigan_aishell3_ckpt_0.2.0.zip
unzip hifigan_aishell3_ckpt_0.2.0.zip
cd ../
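Prepare the data (placed under input; this uses a 200-utterance subset of CSMSC)
After downloading and extracting the archive, you get 200 wav files and a labels.txt file.
The label file format is: utt_id|pronunciation
For example: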
000001|ka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1
mkdir -p input && cd input
wget https://paddlespeech.bj.bcebos.com/datasets/csmsc_mini.zip
unzip csmsc_mini.zip
cd ../
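To catch malformed label lines before the aligner sees them, the `utt_id|pronunciation` format described above can be checked with a few lines of Python. This is a minimal sketch, not PaddleSpeech code: `parse_label_line` and `load_labels` are hypothetical names, and the only assumption is that syllables are separated by whitespace, as in the example line.

```python
def parse_label_line(line):
    """Split 'utt_id|pronunciation' into (utt_id, [syllables])."""
    utt_id, sep, pronunciation = line.strip().partition("|")
    if not sep or not utt_id or not pronunciation.strip():
        raise ValueError(f"malformed label line: {line!r}")
    return utt_id, pronunciation.split()


def load_labels(path):
    """Read a label file and return {utt_id: [syllables]}."""
    labels = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # ignore blank lines
                utt_id, syllables = parse_label_line(line)
                labels[utt_id] = syllables
    return labels
```

Comparing `len(load_labels(...))` with the number of wav files is a quick way to spot utterances that have audio but no label. The exact location of labels.txt inside the extracted archive is not shown here, so pass whatever path it has on your machine.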
Download MFA (Montreal Forced Aligner), a speech alignment tool, and its model
Tool
mkdir -p tools && cd tools
wget https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.1/montreal-forced-aligner_linux.tar.gz
tar xvf montreal-forced-aligner_linux.tar.gz
cp montreal-forced-aligner/lib/libpython3.6m.so.1.0 montreal-forced-aligner/lib/libpython3.6m.so
Model
mkdir -p aligner && cd aligner
wget https://paddlespeech.bj.bcebos.com/MFA/ernie_sat/aishell3_model.zip
unzip aishell3_model.zip
wget https://paddlespeech.bj.bcebos.com/MFA/AISHELL-3/with_tone/simple.lexicon
cd ../../
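Once everything above is in place, the working directory contains pretrained_models/, input/ and tools/ (with montreal-forced-aligner/ and aligner/ inside tools/), alongside the example's own files.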
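The layout can be verified before launching the run. The sketch below is an illustration only: `missing_paths` is a hypothetical helper, and the list of paths is inferred from the mkdir, wget and tar commands above plus the conf/finetune.yaml and run.sh files used in the next step. The folder names produced by unzip are left out because they are not shown in this post.

```python
from pathlib import Path

# Relative paths implied by the commands in this post (an assumption, not an
# official checklist). Unzipped archive folder names are deliberately omitted.
EXPECTED = [
    "pretrained_models",
    "input",
    "tools/montreal-forced-aligner",
    "tools/aligner/simple.lexicon",
    "conf/finetune.yaml",
    "run.sh",
]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).exists()]
```

Running `missing_paths(".")` from the tts3 directory should return an empty list when every step above completed.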
Run the fine-tuning
Adjust conf/finetune.yaml as needed, then run:
./run.sh
FAQ
Problem
Running run.sh as described above fails with "This dataset has no examples":
(speech) [datatech@join71 tts3]$ ./run.sh
check oov
get mfa result
align.py:60: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
Setting up corpus information...
Number of speakers in corpus: 1, average number of utterances per speaker: 198.0
/home/datatech/proj/paddlespeech/examples/other/tts_finetune/tts3/tools/montreal-forced-aligner/lib/aligner/models.py:87: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
Creating dictionary information...
Using previous MFCCs
Number of speakers in corpus: 1, average number of utterances per speaker: 198.0
Done with setup.
100%|###########################################################################################################################| 2/2 [00:01<00:00, 1.05it/s]
Done! Everything took 2.914034843444824 seconds
generate durations.txt
extract feature
/home/datatech/anaconda3/envs/speech/lib/python3.10/site-packages/setuptools/sandbox.py:13: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
/home/datatech/anaconda3/envs/speech/lib/python3.10/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('mpl_toolkits')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/home/datatech/anaconda3/envs/speech/lib/python3.10/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/home/datatech/anaconda3/envs/speech/lib/python3.10/site-packages/librosa/core/constantq.py:1059: DeprecationWarning: `np.complex` is a deprecated alias for the builtin `complex`. To silence this warning, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype=np.complex,
196 1
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 196/196 [00:00<00:00, 25790.86it/s]
Done
Traceback (most recent call last):
  File "/home/datatech/proj/paddlespeech/examples/other/tts_finetune/tts3/local/extract_feature.py", line 346, in <module>
    extract_feature(
  File "/home/datatech/proj/paddlespeech/examples/other/tts_finetune/tts3/local/extract_feature.py", line 266, in extract_feature
    normalize(speech_scaler, pitch_scaler, energy_scaler, vocab_phones,
  File "/home/datatech/proj/paddlespeech/examples/other/tts_finetune/tts3/local/extract_feature.py", line 155, in normalize
    dataset = DataTable(
  File "/home/datatech/proj/paddlespeech/paddlespeech/t2s/datasets/data_table.py", line 45, in __init__
    assert len(data) > 0, "This dataset has no examples"
AssertionError: This dataset has no examples
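The assertion in the last frame fires whenever the list of records handed to DataTable is empty, so the real failure is upstream: the earlier steps produced nothing usable. One way to confirm this is to count the records in the intermediate metadata file. The sketch below assumes that file is in JSON-lines format (one JSON object per line); that format, its path, and the helper names `read_records` and `describe` are assumptions for illustration, not taken from the PaddleSpeech sources.

```python
import json


def read_records(path):
    """Load one JSON object per non-empty line of a JSON-lines file."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                records.append(json.loads(line))
    return records


def describe(records):
    """Mirror the guard seen in the traceback: empty data is the failure case."""
    if len(records) == 0:
        return "no examples: the extraction step produced nothing"
    return f"{len(records)} examples"
```

If `describe(read_records(path))` reports no examples, look at the alignment and feature extraction output rather than at DataTable itself.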
Solution
Delete the following .so file bundled with MFA, or rename it with mv:
paddlespeech/examples/other/tts_finetune/tts3/tools/montreal-forced-aligner/lib/thirdparty/bin/libopenblas.so.0
Then install OpenBLAS with yum:
yum install openblas
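Renaming is safer than deleting, because the bundled library can be restored if the change does not help. The rename step can be scripted as in the sketch below; `disable_bundled_lib` and the `.bak` suffix are hypothetical choices for illustration, not something MFA or PaddleSpeech provides.

```python
from pathlib import Path


def disable_bundled_lib(lib_path, suffix=".bak"):
    """Rename lib_path to lib_path + suffix.

    Returns the new path, or None if lib_path does not exist
    (for example because it was already renamed).
    """
    src = Path(lib_path)
    if not src.exists():
        return None
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        raise FileExistsError(f"backup already exists: {dst}")
    src.rename(dst)
    return dst
```

From the tts3 directory, the call would be `disable_bundled_lib("tools/montreal-forced-aligner/lib/thirdparty/bin/libopenblas.so.0")`. Renaming the `.bak` file back restores the original state.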