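Tongyi Lab at Alibaba's DAMO Academy recently open-sourced ClearerVoice-Studio, a speech processing toolkit aimed at improving speech quality and intelligibility. As speech technology becomes more widely used, speech quality is drawing more attention, and the need for speech processing is growing, especially where environmental noise, reverberation, and device pickup limitations degrade the signal.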
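ClearerVoice-Studio integrates speech enhancement, speech separation, and audio-visual target speaker extraction. By incorporating complex-domain deep learning algorithms, it substantially improves denoising and separation performance. The technology removes as much background noise as possible while preserving speech clarity and keeping speech distortion to a minimum.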
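The core models and algorithms in ClearerVoice-Studio include the FRCRN model, which ranked second overall in the 2022 IEEE/INTERSPEECH DNS Challenge, and the MossFormer series of models, which perform strongly on speech separation tasks. The MossFormer2-based 48 kHz speech enhancement model suppresses noise effectively while greatly reducing speech distortion.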
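Because the MossFormer2-based enhancement model is described as a 48 kHz model, it can help to know an input file's sampling rate before processing it. The sketch below is not part of ClearerVoice-Studio: it uses only Python's standard `wave` module (which reads PCM WAV files only), the function names are hypothetical, and whether the toolkit resamples inputs on its own is not covered here.

```python
import wave


def wav_sample_rate(path):
    """Return the sampling rate in Hz of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(path, model_rate_hz=48000):
    """Return True if the file's sampling rate equals the model's nominal rate.

    model_rate_hz defaults to 48000 for the 48 kHz model named in the text;
    pass a different value for other models (hypothetical helper).
    """
    return wav_sample_rate(path) == model_rate_hz
```

For example, `matches_model_rate('samples/input.wav')` would report whether the sample file already matches the 48 kHz rate.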
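Alibaba's Tongyi Lab hopes that ClearerVoice-Studio will give developers, researchers, and companies a powerful set of speech processing tools and help innovative applications reach production. To try the online demo, prepare a noisy speech file, upload it to the demo page, process it with one click, and then listen to or download the result to hear the cleaner audio and the denoising effect right away.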
GitHub repository: https://github.com/modelscope/ClearerVoice-Studio
Online demo: https://huggingface.co/spaces/alibabasglab/ClearVoice
Colab demo
!git clone https://github.com/modelscope/ClearerVoice-Studio.git
Cloning into 'ClearerVoice-Studio'...
remote: Enumerating objects: 1033, done.
remote: Counting objects: 100% (249/249), done.
remote: Compressing objects: 100% (153/153), done.
remote: Total 1033 (delta 171), reused 139 (delta 95), pack-reused 784 (from 1)
Receiving objects: 100% (1033/1033), 203.85 MiB | 28.80 MiB/s, done.
Resolving deltas: 100% (512/512), done.
Updating files: 100% (284/284), done.
!pip install -q condacolab
import condacolab
condacolab.install()
⏬ Downloading https://github.com/conda-forge/miniforge/releases/download/23.11.0-0/Mambaforge-23.11.0-0-Linux-x86_64.sh…
📦 Installing…
📌 Adjusting configuration…
🩹 Patching environment…
⏲ Done in 0:00:12
🔁 Restarting kernel…
%%bash
cd ClearerVoice-Studio
conda create -n ClearerVoice-Studio python=3.8
Channels:
- conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /usr/local/envs/ClearerVoice-Studio
added / updated specs:
- python=3.8
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-2_gnu
bzip2 conda-forge/linux-64::bzip2-1.0.8-h4bc722e_7
ca-certificates conda-forge/linux-64::ca-certificates-2024.8.30-hbcca054_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.43-h712a8e2_2
libffi conda-forge/linux-64::libffi-3.4.2-h7f98852_5
libgcc conda-forge/linux-64::libgcc-14.2.0-h77fa898_1
libgcc-ng conda-forge/linux-64::libgcc-ng-14.2.0-h69a702a_1
libgomp conda-forge/linux-64::libgomp-14.2.0-h77fa898_1
liblzma conda-forge/linux-64::liblzma-5.6.3-hb9d3cd8_1
liblzma-devel conda-forge/linux-64::liblzma-devel-5.6.3-hb9d3cd8_1
libnsl conda-forge/linux-64::libnsl-2.0.1-hd590300_0
libsqlite conda-forge/linux-64::libsqlite-3.47.2-hee588c1_0
libuuid conda-forge/linux-64::libuuid-2.38.1-h0b41bf4_0
libxcrypt conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1
libzlib conda-forge/linux-64::libzlib-1.3.1-hb9d3cd8_2
ncurses conda-forge/linux-64::ncurses-6.5-he02047a_1
openssl conda-forge/linux-64::openssl-3.4.0-hb9d3cd8_0
pip conda-forge/noarch::pip-24.3.1-pyh8b19718_0
python conda-forge/linux-64::python-3.8.20-h4a871b0_2_cpython
readline conda-forge/linux-64::readline-8.2-h8228510_1
setuptools conda-forge/noarch::setuptools-75.3.0-pyhd8ed1ab_0
tk conda-forge/linux-64::tk-8.6.13-noxft_h4845f30_101
wheel conda-forge/noarch::wheel-0.45.1-pyhd8ed1ab_0
xz conda-forge/linux-64::xz-5.6.3-hbcc6ac9_1
xz-gpl-tools conda-forge/linux-64::xz-gpl-tools-5.6.3-hbcc6ac9_1
xz-tools conda-forge/linux-64::xz-tools-5.6.3-hb9d3cd8_1
Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
# $ conda activate ClearerVoice-Studio
#
# To deactivate an active environment, use
#
# $ conda deactivate
==> WARNING: A newer version of conda exists. <==
current version: 23.11.0
latest version: 24.11.0
Please update conda by running
$ conda update -n base -c conda-forge conda
%%bash
cd ClearerVoice-Studio
conda init
source activate ClearerVoice-Studio
pip install -r requirements.txt
%%bash
source activate ClearerVoice-Studio
cd ClearerVoice-Studio
cd clearvoice
python demo.py
%%bash
source activate ClearerVoice-Studio
cd ClearerVoice-Studio
cd clearvoice
python demo_with_more_comments.py
%%bash
source activate ClearerVoice-Studio
cd ClearerVoice-Studio
cd clearvoice
# Run the Python snippet through a heredoc instead of a bare `python` line
python - <<'EOF'
from clearvoice import ClearVoice
# Load the 48 kHz MossFormer2 speech enhancement model
myClearVoice = ClearVoice(task='speech_enhancement', model_names=['MossFormer2_SE_48K'])
# Process a single wave file, then write the result explicitly
output_wav = myClearVoice(input_path='samples/input.wav', online_write=False)
myClearVoice.write(output_wav, output_path='samples/output_MossFormer2_SE_48K.wav')
# Process every wave file in a directory, writing outputs to output_path
myClearVoice(input_path='samples/path_to_input_wavs', online_write=True, output_path='samples/path_to_output_wavs')
# Process the wave files listed in a .scp list file
myClearVoice(input_path='samples/scp/audio_samples.scp', online_write=True, output_path='samples/path_to_output_wavs_scp')
EOF
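The last call above reads its inputs from a list file. A minimal sketch for generating such a file follows, assuming the `.scp` file is plain text with one audio path per line. That format is an assumption and should be checked against `samples/scp/audio_samples.scp` in the repository. `write_scp` is a hypothetical helper, not part of the `clearvoice` package.

```python
from pathlib import Path


def write_scp(wav_dir, scp_path):
    """Write the .wav files found directly in wav_dir to scp_path.

    Assumption: the list file is plain text with one audio path per line.
    Returns the list of paths that were written, in sorted order.
    """
    wav_paths = sorted(str(p) for p in Path(wav_dir).glob("*.wav"))
    Path(scp_path).write_text(
        "".join(f"{p}\n" for p in wav_paths), encoding="utf-8"
    )
    return wav_paths
```

For example, `write_scp('samples/path_to_input_wavs', 'samples/scp/my_list.scp')` would produce a list that could then be passed as `input_path`, provided the format assumption holds.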