1. Environment
TensorFlow 2.1.0 (GPU)
2. Project
(https://gitee.com/deng-xinxin9/RNNoise_Wrapper)
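3. Procedure
0) Prepare the datasets
Speech: Edinburgh dataset, https://datashare.ed.ac.uk/handle/10283/1942 (48 kHz, about 5 hours in total)
Noise: NoiseX92 white noise (3 min 55 s, 19980 Hz). At the moment the model cannot even handle white noise well.
Put the speech and noise files into their own folders, datasets/training_set/clean and datasets/training_set/noise.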
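Since the speech (48 kHz) and the noise (19980 Hz) use different sample rates, it can help to confirm what is actually in the two folders before running the later steps. The following is a minimal sketch, not part of the project: `wav_summary` is a hypothetical helper, and it assumes the files are plain PCM .wav files that Python's standard `wave` module can open.

```python
import wave
from pathlib import Path


def wav_summary(folder):
    """Return {filename: (sample_rate_hz, duration_s)} for every .wav file in folder."""
    summary = {}
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wf:
            rate = wf.getframerate()
            summary[path.name] = (rate, wf.getnframes() / rate)
    return summary


if __name__ == "__main__":
    # Folder names are the ones used in this post.
    for sub in ("clean", "noise"):
        folder = Path("datasets/training_set") / sub
        for name, (rate, seconds) in wav_summary(folder).items():
            print(f"{sub}/{name}: {rate} Hz, {seconds:.1f} s")
```

Files in other containers or encodings would need a different reader; the sketch only reports what it finds and changes nothing.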
1) Build the project
- unzip rnnoise_master_20.11.2020.zip
- cd rnnoise-master/src
- ./compile.sh
2) Resample the audio, convert it to PCM, and concatenate it
- cd ../../
- python3 training_utils/prepare_dataset_for_training.py -cf datasets/training_set/clean/ -nf datasets/training_set/noise/ -bca datasets/training_set/all_clean.raw -bna datasets/training_set/all_noise.raw
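A quick way to sanity-check the two concatenated files is to convert their size on disk into a duration. This is a sketch under stated assumptions, not part of the project: it assumes the .raw files are headerless 16-bit mono PCM at 48 kHz (the format RNNoise works with), and `raw_duration_hours` is a hypothetical helper name.

```python
import os

SAMPLE_RATE_HZ = 48_000  # assumption: the script resamples everything to 48 kHz
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono PCM, no header


def raw_duration_hours(num_bytes):
    """Duration in hours of a headerless PCM file of the given size."""
    return num_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE) / 3600


if __name__ == "__main__":
    for name in ("all_clean.raw", "all_noise.raw"):
        path = os.path.join("datasets/training_set", name)
        if os.path.exists(path):
            hours = raw_duration_hours(os.path.getsize(path))
            print(f"{name}: {hours:.2f} h")
```

If the assumptions hold, all_clean.raw should come out close to the roughly 5 hours of speech listed in step 0.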
3) Feature extraction
- rnnoise-master/src/denoise_training datasets/training_set/all_clean.raw datasets/training_set/all_noise.raw 6400000 > train_logs/MS_SNSD/training_5000k.f32
- python3 rnnoise-master/training/bin2hdf5.py train_logs/MS_SNSD/training_5000k.f32 6400000 87 train_logs/MS_SNSD/training_5000k.h5
(This is expected to generate about 17 hours of noisy speech.)
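The 17-hour figure can be checked from the frame count passed to denoise_training. The sketch below assumes that 6400000 is the number of frames written, that each frame is 480 samples at 48 kHz (10 ms, the RNNoise frame size), and that each frame is stored as 87 float32 values (the 87 passed to bin2hdf5.py). The function names are hypothetical.

```python
FRAME_SIZE = 480          # assumption: samples per RNNoise frame (10 ms at 48 kHz)
SAMPLE_RATE_HZ = 48_000
FLOATS_PER_FRAME = 87     # column count passed to bin2hdf5.py in the command above
BYTES_PER_FLOAT = 4       # float32


def training_hours(num_frames):
    """Hours of audio represented by num_frames feature frames."""
    return num_frames * FRAME_SIZE / SAMPLE_RATE_HZ / 3600


def f32_size_bytes(num_frames):
    """Expected size of the .f32 feature file in bytes."""
    return num_frames * FLOATS_PER_FRAME * BYTES_PER_FLOAT


if __name__ == "__main__":
    frames = 6_400_000
    print(f"{training_hours(frames):.1f} h of audio")
    print(f"{f32_size_bytes(frames) / 1e9:.2f} GB feature file")
```

Under these assumptions the run covers a little under 18 hours of audio, and comparing the expected size with the actual size of the .f32 file is a cheap way to detect a truncated feature-extraction run.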
4) Train the model
run rnn_train_mod.py
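The original code was written for TensorFlow 1.0 and I am using TensorFlow 2.1.0, so I changed the import statements in the original code. In my tests the results are about the same as with the original.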
import tensorflow as tf
import tensorflow.keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.callbacks import ModelCheckpoint,EarlyStopping,TensorBoard,ReduceLROnPlateau,LearningRateScheduler
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import GRU
from tensorflow.keras.layers import SimpleRNN
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import concatenate
from tensorflow.keras import losses
from tensorflow.keras import regularizers
from tensorflow.keras.constraints import min_max_norm
import h5py
# import keras.backend.tensorflow_backend as KTF
from tensorflow.keras.constraints import Constraint
from tensorflow.keras import backend as K
import numpy as np
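In other words, the only change is prefixing each keras import with tensorflow.
With TF 2.1, the parameter count of the final model may differ from the original: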
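One likely source of the difference is the GRU layers. In tf.keras 2.x, GRU defaults to reset_after=True, which keeps separate input and recurrent bias vectors. Assuming the original script ran on a Keras version whose GRU used a single bias vector (reset_after=False), each GRU layer gains 3 × units parameters. The sketch below reproduces the counts with plain arithmetic; `dense_params` and `gru_params` are hypothetical helpers, not Keras functions.

```python
def dense_params(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units


def gru_params(input_dim, units, reset_after):
    """Parameter count of a GRU layer with three gates.

    reset_after=True  -> two bias vectors per gate (tf.keras 2.x default)
    reset_after=False -> one bias vector per gate
    """
    bias_vectors = 2 if reset_after else 1
    return 3 * (units * (input_dim + units) + bias_vectors * units)


if __name__ == "__main__":
    print("input_dense:", dense_params(42, 24))
    print("vad_gru, reset_after=True: ", gru_params(24, 24, True))
    print("vad_gru, reset_after=False:", gru_params(24, 24, False))
```

With reset_after=True the formula gives the 3600 shown for vad_gru in the TF 2.1 summary. If matching the older parameter count matters, passing reset_after=False to the GRU layers is one option to try.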
tf2.1:
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
main_input (InputLayer)         [(None, None, 42)]   0
__________________________________________________________________________________________________
input_dense (Dense)             (None, None, 24)     1032        main_input[0][0]
__________________________________________________________________________________________________
vad_gru (GRU)                   (None, None, 24)     3600        input_dense[0]