Common speech augmentation libraries
audiomentations
AugLy
GitHub - facebookresearch/AugLy: A data augmentations library for audio, image, text, and video.
nlpaug
GitHub - makcedward/nlpaug: Data augmentation for NLP
rubberband
Three common augmentation types and the recommended library for each
Volume change
Recommended: AugLy
import soundfile as sf
import augly.audio as audaugs
import numpy as np
file_path = "D:/3513-163606-0019.flac"
output_path = 'D:/new.wav'
volume_change = np.random.uniform(5, 8)  # volume change in dB; positive values make the audio louder
augmented_data, sr = audaugs.change_volume(audio=file_path, volume_db=volume_change)
sf.write(output_path, augmented_data, sr)  # save the augmented audio to disk
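To get a feel for what a value of 5 to 8 dB means, the decibel figure can be converted into a linear amplitude factor. The snippet below is only a minimal sketch of the standard decibel-to-amplitude relationship, not AugLy's internal implementation; `db_to_gain` and `apply_gain` are hypothetical helper names introduced here for illustration.

```python
import math


def db_to_gain(volume_db):
    """Convert a volume change in dB to a linear amplitude factor (hypothetical helper)."""
    return 10.0 ** (volume_db / 20.0)


def apply_gain(samples, volume_db):
    """Scale every sample by the gain that corresponds to volume_db."""
    gain = db_to_gain(volume_db)
    return [s * gain for s in samples]


# Illustrative values only: three made-up samples, raised by 6 dB.
louder = apply_gain([0.1, -0.2, 0.05], 6.0)
```

A change of about 6 dB roughly doubles the amplitude, so the 5 to 8 dB range used above can push already loud recordings close to or beyond full scale; it may be worth checking the peak value of the output.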
Pitch shift and speed change
Recommended: rubberband (I downloaded the rubberband command-line program)
import subprocess
import librosa
output_file_path = 'D:/new.wav'
file_path = "D:/3513-163606-0019.flac"
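rubberband_command = [
    'D:/rubberband/rubberband.exe',
    '-t', '1',   # time-stretch ratio: output duration = ratio x original duration; 1 keeps the original speed
    '-p', '-1',  # pitch shift in semitones; 0 keeps the original pitch
    file_path,
    output_file_path
]
# Run the rubberband command; check=True raises an error if rubberband fails
subprocess.run(rubberband_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True)
# Note: librosa.load resamples to 22050 Hz by default; pass sr=None to keep the native sample rate
data, sr = librosa.load(file_path)
augmented_data, augmented_sr = librosa.load(output_file_path)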
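For augmenting many files, the fixed `-t` and `-p` values can be replaced with randomly sampled ones. The following is a rough sketch under a few assumptions: the helper names (`build_rubberband_command`, `sample_augmentation_params`, `augment_file`) are hypothetical, and the sampling ranges (0.9 to 1.1 for the time ratio, -2 to 2 semitones for the pitch) are illustrative choices rather than recommended settings.

```python
import random
import subprocess


def build_rubberband_command(exe_path, input_path, output_path,
                             time_ratio=1.0, pitch_semitones=0.0):
    """Build the argument list for the rubberband CLI (hypothetical helper)."""
    return [
        exe_path,
        "-t", str(time_ratio),       # output duration as a multiple of the original
        "-p", str(pitch_semitones),  # pitch shift in semitones
        input_path,
        output_path,
    ]


def sample_augmentation_params(rng):
    """Draw a random (time_ratio, pitch_semitones) pair; the ranges are assumptions."""
    time_ratio = rng.uniform(0.9, 1.1)
    pitch_semitones = rng.uniform(-2.0, 2.0)
    return time_ratio, pitch_semitones


def augment_file(exe_path, input_path, output_path, rng):
    """Run rubberband once with randomly sampled parameters and return them."""
    time_ratio, pitch_semitones = sample_augmentation_params(rng)
    command = build_rubberband_command(
        exe_path, input_path, output_path, time_ratio, pitch_semitones
    )
    # check=True raises CalledProcessError if rubberband exits with an error
    subprocess.run(command, check=True, capture_output=True)
    return time_ratio, pitch_semitones
```

With the paths used above, a call could look like `augment_file('D:/rubberband/rubberband.exe', 'D:/3513-163606-0019.flac', 'D:/new.wav', random.Random(0))`. Passing a seeded `random.Random` keeps the sampled parameters reproducible between runs.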