Removing Background Noise: Alibaba's Tongyi Lab Open-Sources the Speech Processing Technology ClearerVoice-Studio

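Tongyi Lab, part of Alibaba's DAMO Academy, recently announced that it is open-sourcing ClearerVoice-Studio, a speech processing technology aimed at improving speech quality and intelligibility. As speech technology becomes widely deployed, speech quality is drawing more attention, and demand for speech processing is growing, especially where environmental noise, reverberation, or the recording device degrades the signal.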

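ClearerVoice-Studio integrates speech enhancement, speech separation, and audio-visual target speaker extraction. By incorporating complex-domain deep learning algorithms, it substantially improves denoising and separation performance. The technology aims to remove as much background noise as possible while preserving speech clarity and keeping speech distortion to a minimum.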
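To make "complex-domain" more concrete, here is a minimal sketch on a single time-frequency bin. It is not the FRCRN or MossFormer2 architecture. It only contrasts two idealized masks, computed from a known clean value, to show why operating on complex spectra can help: a real-valued magnitude mask fixes the magnitude but keeps the noisy phase, while a complex-valued mask can correct both. The values of `clean` and `noise` are made up for illustration.

```python
import cmath


def magnitude_mask(clean, noisy):
    """Ideal real-valued mask: rescales the magnitude, leaves the phase alone."""
    return abs(clean) / abs(noisy)


def complex_mask(clean, noisy):
    """Ideal complex-valued mask: rescales the magnitude and rotates the phase."""
    return clean / noisy


# One spectrogram bin, written as magnitude and phase (illustrative values).
clean = cmath.rect(1.0, 0.3)
noise = cmath.rect(0.8, 2.0)
noisy = clean + noise

# Apply each mask to the noisy bin.
est_mag = magnitude_mask(clean, noisy) * noisy
est_cplx = complex_mask(clean, noisy) * noisy

print("magnitude-mask error:", abs(est_mag - clean))
print("complex-mask error:  ", abs(est_cplx - clean))
```

In a real system the mask is predicted by a network from the noisy input alone, so neither estimate would be exact. The sketch only shows the upper bound each mask type can reach.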

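ClearerVoice-Studio's core models and algorithms include FRCRN, which placed second overall in the 2022 IEEE/INTER Speech DNS Challenge, and the MossFormer series, which performs strongly on speech separation tasks. The 48 kHz speech enhancement model based on MossFormer2 suppresses noise effectively while substantially reducing speech distortion.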

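Through ClearerVoice-Studio, Alibaba's Tongyi Lab hopes to give developers, researchers, and companies a capable speech processing toolkit and to help new applications reach production. To try the online demo, prepare a speech file that contains noise, upload it to the demo page, run the one-click processing, and then listen to or download the denoised result.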
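If no noisy recording is at hand, a synthetic test file can stand in for a first try. The sketch below uses only the Python standard library to write a mono 16-bit WAV that mixes a sine tone with white noise. The function name `write_noisy_tone` and all parameter defaults are illustrative choices, and a tone is only a rough substitute for speech, so a real noisy recording is a better judge of quality.

```python
import math
import random
import struct
import wave


def write_noisy_tone(path, seconds=3.0, sample_rate=48000,
                     tone_hz=220.0, noise_level=0.2, seed=0):
    """Write a mono 16-bit WAV containing a sine tone plus white noise."""
    rng = random.Random(seed)
    n_samples = int(seconds * sample_rate)
    frames = bytearray()
    for n in range(n_samples):
        tone = 0.5 * math.sin(2.0 * math.pi * tone_hz * n / sample_rate)
        noise = noise_level * rng.uniform(-1.0, 1.0)
        # Clip to [-1, 1] before converting to a 16-bit integer sample.
        sample = max(-1.0, min(1.0, tone + noise))
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_samples
```

Calling `write_noisy_tone("noisy_test.wav")` produces a three-second file that can be uploaded to the demo page.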

GitHub repository: https://github.com/modelscope/ClearerVoice-Studio

Online demo: https://huggingface.co/spaces/alibabasglab/ClearVoice

Colab demo

!git clone https://github.com/modelscope/ClearerVoice-Studio.git

Cloning into 'ClearerVoice-Studio'...
remote: Enumerating objects: 1033, done.
remote: Counting objects: 100% (249/249), done.
remote: Compressing objects: 100% (153/153), done.
remote: Total 1033 (delta 171), reused 139 (delta 95), pack-reused 784 (from 1)
Receiving objects: 100% (1033/1033), 203.85 MiB | 28.80 MiB/s, done.
Resolving deltas: 100% (512/512), done.
Updating files: 100% (284/284), done.

!pip install -q condacolab
import condacolab
condacolab.install()

⏬ Downloading https://github.com/conda-forge/miniforge/releases/download/23.11.0-0/Mambaforge-23.11.0-0-Linux-x86_64.sh…
📦 Installing…
📌 Adjusting configuration…
🩹 Patching environment…
⏲ Done in 0:00:12
🔁 Restarting kernel…

%%bash

cd ClearerVoice-Studio
conda create -y -n ClearerVoice-Studio python=3.8

Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /usr/local/envs/ClearerVoice-Studio

  added / updated specs:
    - python=3.8


The following NEW packages will be INSTALLED:

  _libgcc_mutex      conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge 
  _openmp_mutex      conda-forge/linux-64::_openmp_mutex-4.5-2_gnu 
  bzip2              conda-forge/linux-64::bzip2-1.0.8-h4bc722e_7 
  ca-certificates    conda-forge/linux-64::ca-certificates-2024.8.30-hbcca054_0 
  ld_impl_linux-64   conda-forge/linux-64::ld_impl_linux-64-2.43-h712a8e2_2 
  libffi             conda-forge/linux-64::libffi-3.4.2-h7f98852_5 
  libgcc             conda-forge/linux-64::libgcc-14.2.0-h77fa898_1 
  libgcc-ng          conda-forge/linux-64::libgcc-ng-14.2.0-h69a702a_1 
  libgomp            conda-forge/linux-64::libgomp-14.2.0-h77fa898_1 
  liblzma            conda-forge/linux-64::liblzma-5.6.3-hb9d3cd8_1 
  liblzma-devel      conda-forge/linux-64::liblzma-devel-5.6.3-hb9d3cd8_1 
  libnsl             conda-forge/linux-64::libnsl-2.0.1-hd590300_0 
  libsqlite          conda-forge/linux-64::libsqlite-3.47.2-hee588c1_0 
  libuuid            conda-forge/linux-64::libuuid-2.38.1-h0b41bf4_0 
  libxcrypt          conda-forge/linux-64::libxcrypt-4.4.36-hd590300_1 
  libzlib            conda-forge/linux-64::libzlib-1.3.1-hb9d3cd8_2 
  ncurses            conda-forge/linux-64::ncurses-6.5-he02047a_1 
  openssl            conda-forge/linux-64::openssl-3.4.0-hb9d3cd8_0 
  pip                conda-forge/noarch::pip-24.3.1-pyh8b19718_0 
  python             conda-forge/linux-64::python-3.8.20-h4a871b0_2_cpython 
  readline           conda-forge/linux-64::readline-8.2-h8228510_1 
  setuptools         conda-forge/noarch::setuptools-75.3.0-pyhd8ed1ab_0 
  tk                 conda-forge/linux-64::tk-8.6.13-noxft_h4845f30_101 
  wheel              conda-forge/noarch::wheel-0.45.1-pyhd8ed1ab_0 
  xz                 conda-forge/linux-64::xz-5.6.3-hbcc6ac9_1 
  xz-gpl-tools       conda-forge/linux-64::xz-gpl-tools-5.6.3-hbcc6ac9_1 
  xz-tools           conda-forge/linux-64::xz-tools-5.6.3-hb9d3cd8_1 



Downloading and Extracting Packages: ...working... done
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
#
# To activate this environment, use
#
#     $ conda activate ClearerVoice-Studio
#
# To deactivate an active environment, use
#
#     $ conda deactivate



==> WARNING: A newer version of conda exists. <==
    current version: 23.11.0
    latest version: 24.11.0

Please update conda by running

    $ conda update -n base -c conda-forge conda


%%bash

cd ClearerVoice-Studio
conda init 
source activate ClearerVoice-Studio
pip install -r requirements.txt

%%bash
source activate ClearerVoice-Studio

cd ClearerVoice-Studio
cd clearvoice
python demo.py

%%bash
source activate ClearerVoice-Studio

cd ClearerVoice-Studio
cd clearvoice
python demo_with_more_comments.py

%%bash
source activate ClearerVoice-Studio

cd ClearerVoice-Studio
cd clearvoice

python - <<'EOF'
from clearvoice import ClearVoice

# Load the 48 kHz MossFormer2 speech enhancement model
myClearVoice = ClearVoice(task='speech_enhancement', model_names=['MossFormer2_SE_48K'])

# Process a single wave file, then write the result explicitly
output_wav = myClearVoice(input_path='samples/input.wav', online_write=False)
myClearVoice.write(output_wav, output_path='samples/output_MossFormer2_SE_48K.wav')

# Process every wave file in a directory
myClearVoice(input_path='samples/path_to_input_wavs', online_write=True, output_path='samples/path_to_output_wavs')

# Process the wave files listed in an .scp file
myClearVoice(input_path='samples/scp/audio_samples.scp', online_write=True, output_path='samples/path_to_output_wavs_scp')
EOF
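The list-file mode above needs an `.scp` file. The sketch below builds one from a directory of recordings, assuming the list is plain text with one WAV path per line. That format is an assumption and should be checked against `samples/scp/audio_samples.scp` in the repository. The helper name `write_scp` is hypothetical and not part of the `clearvoice` package.

```python
from pathlib import Path


def write_scp(wav_dir, scp_path):
    """Write every .wav path found under wav_dir to scp_path, one per line.

    Assumption: the list file is plain text with one audio path per line.
    """
    wav_paths = sorted(str(p) for p in Path(wav_dir).rglob("*.wav"))
    scp_file = Path(scp_path)
    scp_file.parent.mkdir(parents=True, exist_ok=True)
    scp_file.write_text("".join(p + "\n" for p in wav_paths), encoding="utf-8")
    return wav_paths
```

For example, `write_scp("my_recordings", "my_lists/audio.scp")` (both paths illustrative) writes the list, which can then be passed as `input_path`.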

Common questions

So this project only supports Python 3.8?

Processing seems to become very slow once an audio file is longer than 20 seconds?
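One generic workaround for slow processing of long files is to split the recording into shorter pieces and enhance each piece separately. The sketch below does the splitting with the standard-library `wave` module. It is not a feature of ClearerVoice-Studio, the 20-second default simply mirrors the question above, and the helper name `split_wav` is hypothetical. Hard cuts may leave audible artifacts at chunk boundaries, and whether chunking actually speeds things up has not been verified here.

```python
import wave


def split_wav(src_path, out_prefix, chunk_seconds=20.0):
    """Split a WAV file into consecutive chunks of at most chunk_seconds."""
    out_paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(chunk_seconds * src.getframerate())
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the source file
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channel count, sample width, and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

The resulting chunk files can be placed in one directory and processed with the directory mode shown earlier.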
