1. Data preparation: starting from the wav files, generate three files: wav.scp, spk2utt, and utt2spk
find /*/16kwav -name '*.wav' | awk -F '/' '{print $NF " " $0}' > ./data/wav.scp
find /*/16kwav -name '*.wav' | awk -F '/' '{print $NF " " $NF}' > ./data/spk2utt
find /*/16kwav -name '*.wav' | awk -F '/' '{print $NF " " $NF}' > ./data/utt2spk
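The three commands above can also be written as a single pass. The following is a minimal sketch, not the original script: it assumes the utterance ID should be the file name without the `.wav` extension, keeps the same convention as the commands above of treating every utterance as its own speaker, and sorts the output under `LC_ALL=C`, the ordering Kaldi's data-directory checks expect. The function name `make_data_files` and the demo directory are hypothetical.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: make_data_files <wav_dir> <data_dir>
make_data_files() {
  mkdir -p "$2"
  # wav.scp: "<utt-id> <path>", with utt-id = file name minus ".wav"
  find "$1" -name '*.wav' \
    | awk -F '/' '{id = $NF; sub(/\.wav$/, "", id); print id " " $0}' \
    | LC_ALL=C sort > "$2/wav.scp"
  # Assumption: no speaker labels, so each utterance is its own speaker.
  awk '{print $1 " " $1}' "$2/wav.scp" > "$2/utt2spk"
  # With one utterance per speaker, spk2utt has identical content.
  cp "$2/utt2spk" "$2/spk2utt"
}

# Demo on throwaway empty files (illustration only)
demo=$(mktemp -d)
mkdir -p "$demo/16kwav"
touch "$demo/16kwav/b.wav" "$demo/16kwav/a.wav"
make_data_files "$demo/16kwav" "$demo/data"
cat "$demo/data/utt2spk"
```

If real speaker labels are available, only the `utt2spk` line needs to change; `spk2utt` can then be derived with Kaldi's `utils/utt2spk_to_spk2utt.pl`.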
2. Feature extraction
First, change the parameters in conf/mfcc.conf as follows:
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false        # use average of log energy, not energy.
--sample-frequency=16000  # AISHELL-2 is sampled at 16kHz
--num-mel-bins=40         # similar to Google's setup.
--num-ceps=40             # there is no dimensionality reduction.
--low-freq=20             # low cutoff frequency for mel bins
--high-freq=-400          # high cutoff frequency, relative to Nyquist of 8000 (=7600)
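As the last comment in the config notes, a non-positive `--high-freq` is interpreted as an offset from the Nyquist frequency rather than as an absolute value. The arithmetic is sketched below; the variable names are illustrative and are not read by any Kaldi tool.

```shell
#!/bin/sh
sample_frequency=16000   # value of --sample-frequency
high_freq=-400           # value of --high-freq

nyquist=$((sample_frequency / 2))
if [ "$high_freq" -le 0 ]; then
  # Non-positive value: offset from Nyquist
  effective_high=$((nyquist + high_freq))
else
  # Positive value: used as an absolute frequency
  effective_high=$high_freq
fi
echo "Nyquist: ${nyquist} Hz, effective high cutoff: ${effective_high} Hz"
# prints "Nyquist: 8000 Hz, effective high cutoff: 7600 Hz"
```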
Next, run the following commands:
utils/fix_data_dir.sh /*/data
./steps/make_mfcc.sh /*/data ./ts_log /*/data/mfcc
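After extraction, it can be worth confirming that every utterance actually received features. `steps/make_mfcc.sh` writes a `feats.scp` into the data directory; the sketch below lists utterance IDs that appear in `wav.scp` but not in `feats.scp`. The function name `check_feats` is hypothetical, and the demo fabricates two tiny placeholder files purely to exercise it.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: check_feats <data_dir>
# Prints utterance IDs present in wav.scp but missing from feats.scp.
check_feats() {
  awk -v feats="$1/feats.scp" '
    BEGIN {
      # Load the utterance IDs that already have features.
      while ((getline line < feats) > 0) {
        split(line, parts, " ")
        have[parts[1]] = 1
      }
    }
    !($1 in have) { print $1 }
  ' "$1/wav.scp"
}

# Demo with placeholder files (illustration only, not real Kaldi output)
demo_dir=$(mktemp -d)
printf 'a /x/a.wav\nb /x/b.wav\n' > "$demo_dir/wav.scp"
printf 'a /x/mfcc/raw_mfcc.1.ark:2\n' > "$demo_dir/feats.scp"
missing=$(check_feats "$demo_dir")
echo "missing features for: ${missing:-none}"
# prints "missing features for: b"
```

An empty result means the two files cover the same utterances; any IDs that are printed are worth looking up in the log directory passed to `make_mfcc.sh`.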