Speech Quality Evaluation Metrics (A Detailed Guide)

NosONE

Last modified: 2023-09-20 17:42:47


First published: 2023-09-19 16:53:37

Article link: https://blog.csdn.net/amateurSU/article/details/133035599


Speech_Metrics User’s Guide


The speech_metrics directory contains four sets of test metrics: AECMOS, DNSMOS, pysepm_test, and aec_metrics. It also contains speech_metrics_environment.yaml, the file used to set up the environment.

Setting up the speech_metrics environment

The AECMOS, DNSMOS, and pysepm_test tests all need to run in this environment. aec_metrics depends only on Octave.

Step 1: Install Anaconda. Reference: https://blog.csdn.net/wyf2017/article/details/118676765

Version: [Anaconda3-2023.03-1-Linux-x86_64.sh]

Step 2: Create the pytorch environment.

source ~/.bashrc

conda env create -f speech_metrics_environment.yaml

Step 3: Activate the environment.

source ~/.bashrc

conda activate pytorch1.11
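Before running any of the Python-based tests, it can help to confirm that the expected environment is actually active. The following is a minimal sketch, assuming that conda exposes the name of the active environment in the `CONDA_DEFAULT_ENV` variable. The helper name `in_expected_env` is hypothetical and is not part of speech_metrics.

```python
import os


def in_expected_env(expected="pytorch1.11", environ=None):
    """Return True if the active conda environment is named `expected`."""
    # `environ` can be injected for testing; by default the real process
    # environment is inspected.
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") == expected


if __name__ == "__main__":
    if in_expected_env():
        print("pytorch1.11 is active")
    else:
        print("Run 'conda activate pytorch1.11' first")
```

Passing the environment as an argument keeps the check easy to test without touching the real shell session.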

aec_metrics

Reference: https://www.cnblogs.com/joffrey/p/16588345.html

Test environment: Octave

Description: test metrics for echo cancellation.

Directory layout:

.aec_metrics
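├── batch_process_wav_files.m # main function: processes files in batch
├── compute_cohde_cohxe.m # computes cohde (spectral correlation between the output signal e(n) and the microphone signal d(n)) and cohxe (spectral correlation between the output signal e(n) and the far-end reference signal x(n))
├── compute_erle_mse_sf.m # computes erle (echo return loss enhancement), SuppFactor (energy attenuation factor), and MSE (mean squared error)
├── data
│ ├── file1_mic.wav # file names follow this template: xxx_mic.wav, xxx_ref.wav, xxx_res.wav
│ ├── file1_ref.wav # these are the microphone signal, the far-end reference signal, and the output signal, respectively
│ ├── file1_res.wav
│ ├── file_mic.wav
│ ├── file_ref.wav
│ └── file_res.wav
├── pic # time-domain plots of all metrics for each group of audio files
├── readme.txt
└── results.mat # average values of all metrics for each audio group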
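Because the batch script relies on the xxx_mic.wav / xxx_ref.wav / xxx_res.wav naming template, it may be useful to check a data directory before running it. The sketch below groups file names by prefix and reports groups that are missing one of the three files. The function names are hypothetical helpers written for illustration; they are not part of aec_metrics.

```python
from collections import defaultdict
from pathlib import Path

# The three suffixes taken from the naming template in the tree above.
SUFFIXES = ("_mic.wav", "_ref.wav", "_res.wav")


def group_by_prefix(filenames):
    """Map each prefix (the 'xxx' part) to the set of suffixes found for it."""
    groups = defaultdict(set)
    for name in filenames:
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                groups[name[: -len(suffix)]].add(suffix)
                break
    return dict(groups)


def incomplete_groups(filenames):
    """Return the sorted prefixes that lack at least one of the three files."""
    groups = group_by_prefix(filenames)
    return sorted(p for p, found in groups.items() if found != set(SUFFIXES))


def check_data_dir(data_dir="./data"):
    """Scan a directory on disk and list its incomplete groups."""
    names = [p.name for p in Path(data_dir).glob("*.wav")]
    return incomplete_groups(names)
```

An empty list from `check_data_dir('./data')` would indicate that every prefix has its mic, ref, and res file.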

how to run:

batch_process_wav_files('./data')

View the results: load('results.mat')
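To make the ERLE metric more concrete, here is a minimal Python sketch of the commonly used whole-signal definition, ERLE = 10·log10(E[d²(n)] / E[e²(n)]), where d(n) is the microphone signal and e(n) is the output signal. This is an illustration under that assumption only: compute_erle_mse_sf.m may compute the metric differently (for example frame by frame or with smoothing), and the function names here are hypothetical.

```python
import math


def mean_power(samples):
    """Average of the squared samples."""
    return sum(s * s for s in samples) / len(samples)


def erle_db(mic, res, eps=1e-12):
    """Whole-signal ERLE in dB: 10*log10(power of d(n) / power of e(n)).

    `eps` only guards against division by zero and log of zero.
    """
    return 10.0 * math.log10((mean_power(mic) + eps) / (mean_power(res) + eps))


if __name__ == "__main__":
    mic = [1.0, -1.0, 1.0, -1.0]   # toy microphone signal d(n)
    res = [0.1, -0.1, 0.1, -0.1]   # toy output signal e(n), 10x smaller
    print(erle_db(mic, res))
```

In the toy example the output amplitude is ten times smaller than the microphone amplitude, so the power ratio is 100 and the result is about 20 dB.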

AECMOS

References: https://arxiv.org/abs/2110.03010
https://github.com/microsoft/AEC-Challenge/tree/main/AECMOS/AECMOS_local

Test environment: Linux server, pytorch1.11 environment

Description: echo cancellation metric (MOS score)

Directory layout:

AECMOS_local:
├── AECMOS_local
│ ├── README.md
│ ├── Run_1663829550_Stage_0.onnx # model selection; this model is the default
│ ├── Run_1663915512_Stage_0.onnx
│ ├── Run_1668423760_Stage_0.onnx
│ ├── __pycache__
│ │ ├── aecmos.cpython-37.pyc
│ │ └── aecmos.cpython-38.pyc
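│ ├── aecmos.py # computes the MOS score for a single group of audio files
│ ├── dataset # the sub-directories and file names must follow this layout; otherwise batch runs will fail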
│ │ ├── enhance_speech
│ │ │ ├── enhance_speech_fileid_0_enh.wav
│ │ │ ├── enhance_speech_fileid_1_enh.wav
│ │ ├── farend_speech
│ │ │ ├── farend_speech_fileid_0_lpb.wav
│ │ │ ├── farend_speech_fileid_1_lpb.wav
│ │ ├── nearend_mic_signal
│ │ │ ├── nearend_mic_fileid_0_mic.wav
│ │ │ ├── nearend_mic_fileid_1_mic.wav
│ │ └── nearend_speech
│ ├── file_process.py # batch file renaming
│ ├── out
│ │ ├── result.backup.xlsx # backup of the previous result.xlsx
│ │ └── result.xlsx # stores the MOS score results
│ └── run.py # computes MOS scores in batch for every group of audio files under dataset

how to run:

python run.py --model_path=Run_1663829550_Stage_0.onnx --data_path=./dataset/ --output_file=./out/result.xlsx

Note: --data_path is required.
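Since the batch run depends on the dataset layout, a quick pre-check of which file IDs are present in all three sub-directories can save a failed run. The sketch below is an illustration based only on the file names shown in the tree (`..._fileid_N_enh.wav`, `..._fileid_N_lpb.wav`, `..._fileid_N_mic.wav`). It does not reproduce what run.py does internally, and all names in it are hypothetical. The nearend_speech directory is left out because the tree lists no files for it.

```python
import re
from pathlib import Path

# Sub-directory name -> expected file suffix, as listed in the tree.
LAYOUT = {
    "enhance_speech": "_enh.wav",
    "farend_speech": "_lpb.wav",
    "nearend_mic_signal": "_mic.wav",
}
FILEID = re.compile(r"fileid_(\d+)_")


def file_ids(filenames, suffix):
    """Collect the numeric file IDs of names that end with `suffix`."""
    ids = set()
    for name in filenames:
        match = FILEID.search(name)
        if match and name.endswith(suffix):
            ids.add(int(match.group(1)))
    return ids


def complete_ids(listing):
    """`listing` maps sub-directory -> file names; return IDs present in all."""
    per_dir = [file_ids(listing.get(d, []), s) for d, s in LAYOUT.items()]
    return sorted(set.intersection(*per_dir))


def scan(data_path="./dataset"):
    """Build the listing from disk and return the complete file IDs."""
    root = Path(data_path)
    listing = {d: [p.name for p in (root / d).glob("*.wav")] for d in LAYOUT}
    return complete_ids(listing)
```

Any ID that is missing from the returned list has at least one file absent or misnamed.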

DNSMOS

References: https://github.com/microsoft/DNS-Challenge/tree/master/DNSMOS

https://arxiv.org/pdf/2010.15258.pdf
Test environment: Linux server, pytorch1.11 environment

Description: speech quality metric for noise-suppressed speech (MOS score)

Directory layout:
├── DNSMOS
│ ├── bak_ovr.onnx # model selection (the default is already set)
│ ├── model_v8.onnx
│ ├── sig.onnx
│ └── sig_bak_ovr.onnx
├── README.md
├── datasets
│ ├── clean
│ │ ├── book_00000_chp_0009_reader_06709_0.wav
│ │ ├── book_00000_chp_0009_reader_06709_1.wav
│ │ ├── book_00000_chp_0009_reader_06709_2.wav

│ └── noise
│ ├── noise_fileid_0.wav
│ ├── noise_fileid_1.wav
│ ├── noise_fileid_2.wav
│ ├── noise_fileid_3.wav
├── dnsmos.py
├── dnsmos_local.py # run this script to compute DNSMOS
├── pDNSMOS
│ └── sig_bak_ovr.onnx
└── result.csv # the scores are saved to result.csv

how to run:

python dnsmos_local.py --testset_dir=./datasets/clean/ --csv_path=./result.csv
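Once result.csv has been written, a per-column average is often the first thing to look at. The following sketch averages every column whose values all parse as numbers, using only the standard library. The column names in the sample (`score_a`, `score_b`) are placeholders, not the actual headers written by dnsmos_local.py, and `column_means` is a hypothetical helper.

```python
import csv
import io


def column_means(csv_file):
    """Average every column whose values all parse as floats."""
    reader = csv.DictReader(csv_file)
    sums, counts, skipped = {}, {}, set()
    for row in reader:
        for key, value in row.items():
            if key in skipped:
                continue
            try:
                number = float(value)
            except (TypeError, ValueError):
                # Non-numeric column (e.g. a file name): drop it entirely.
                skipped.add(key)
                sums.pop(key, None)
                counts.pop(key, None)
                continue
            sums[key] = sums.get(key, 0.0) + number
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Demo with in-memory data; the header names are placeholders.
sample = io.StringIO("filename,score_a,score_b\na.wav,3.0,4.0\nb.wav,4.0,2.0\n")
print(column_means(sample))
```

For the real file, open it with `open('./result.csv', newline='')` and pass the file object to `column_means`.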

pysepm_test

Reference: https://github.com/schmiph2/pysepm
Test environment: Linux server, pytorch1.11 environment

Description: Python Speech Enhancement Performance Measures.

Directory layout:

pysepm_test
├── data # the sub-directories and file names must follow this layout; otherwise the test will not run
│ ├── clean
│ │ ├── clean_speech_fileid_0.wav
│ │ └── clean_speech_fileid_1.wav
│ ├── enhan
│ │ ├── enhance_speech_fileid_0.wav
│ │ └── enhance_speech_fileid_1.wav
│ └── noisy
│ ├── noisy_speech_fileid_0.wav
│ └── noisy_speech_fileid_1.wav
├── file_process.py # batch file renaming
├── out
│ ├── score.backup.xlsx # backup of the previous score.xlsx
│ └── score.xlsx # the scores are saved to score.xlsx
├── pysepm
│ ├── __init__.py
│ ├── __pycache__
│ ├── intelligibilityMeasures.py
│ ├── qualityMeasures.py
│ ├── reverberationMeasures.py
│ └── util.py
└── run.py # script that runs the batch processing
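The out directory keeps a score.backup.xlsx next to score.xlsx, which suggests that the previous results are preserved before new ones are written. How run.py actually does this is not shown here, so the following is only a sketch of one way to keep such a backup, using the standard library. The function name `backup_previous` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def backup_previous(result_path, backup_path):
    """Copy an existing result file to `backup_path`; return True if copied."""
    result_path, backup_path = Path(result_path), Path(backup_path)
    if not result_path.exists():
        # Nothing to preserve on the first run.
        return False
    backup_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(result_path, backup_path)  # copy2 also keeps timestamps
    return True


if __name__ == "__main__":
    # Demo in a temporary directory so no real results are touched.
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp)
        (out / "score.xlsx").write_bytes(b"previous scores")
        print(backup_previous(out / "score.xlsx", out / "score.backup.xlsx"))
```

Calling `backup_previous('./out/score.xlsx', './out/score.backup.xlsx')` before a new batch run would keep one generation of history. The same pattern would fit result.xlsx and result.backup.xlsx in AECMOS_local.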