Spleeter Vocal Separation Notes

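Contents

Installation

Installing on Windows:

Error on Python 3.11:

Option 1: One-line command (quickest start)

Option 2: Python script (batch processing, more flexible)

Single-file separation (vocals/accompaniment)

Batch-processing a whole folder

Choosing the device (CPU/GPU) and multithreading

Troubleshooting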


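Below is a practical set of Spleeter source-separation (vocals/accompaniment) snippets covering installation, the one-line CLI, and Python API scripts (batch processing, a choice of 2/4/5 stems, with TensorFlow picking CPU or GPU automatically).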


Installation

Installing on Windows:

# Create a new conda environment with a pinned Python version
conda create -n spleeter python=3.9
conda activate spleeter

pip install pyqt5==5.15.9 pyqt5-qt5==5.15.2 qt5-applications==5.15.2.2.2 python-lzf

pip install tensorflow==2.9.3 numpy==1.23.5

# Then install spleeter
pip install spleeter

This installed Spleeter 2.4.0.

Error on Python 3.11:

      Running from numpy source directory.
      <string>:461: UserWarning: Unrecognized setuptools command, proceeding with generating Cython sources and expanding templates
      Traceback (most recent call last):
        File "B:\ProgramData\miniconda3\envs\py311\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
          main()
        File "B:\ProgramData\miniconda3\envs\py311\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "B:\ProgramData\miniconda3\envs\py311\Lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 149, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-mj28xqh3\overlay\Lib\site-packages\setuptools\build_meta.py", line 374, in prepare_metadata_for_build_wheel
          self.run_setup()
        File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-mj28xqh3\overlay\Lib\site-packages\setuptools\build_meta.py", line 512, in run_setup
          super().run_setup(setup_script=setup_script)
        File "C:\Users\Administrator\AppData\Local\Temp\pip-build-env-mj28xqh3\overlay\Lib\site-packages\setuptools\build_meta.py", line 317, in run_setup
          exec(code, locals())
        File "<string>", line 488, in <module>
        File "<string>", line 465, in setup_package
        File "C:\Users\Administrator\AppData\Local\Temp\pip-install-ehzsep4y\numpy_13571a2d75c64c83ab4a81b86d7d1406\numpy\distutils\__init__.py", line 26, in <module>
          from . import ccompiler
        File "C:\Users\Administrator\AppData\Local\Temp\pip-install-ehzsep4y\numpy_13571a2d75c64c83ab4a81b86d7d1406\numpy\distutils\ccompiler.py", line 111, in <module>
          replace_method(CCompiler, 'find_executables', CCompiler_find_executables)
                         ^^^^^^^^^
      NameError: name 'CCompiler' is not defined. Did you mean: 'ccompiler'?
      [end of output]
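
These notes record one working interpreter (Python 3.9) and one failing one (Python 3.11, where pip tries to build numpy from source and crashes as shown above). A minimal sketch of a guard that could run before installing follows. The function name `classify_python` and its three labels are hypothetical, and treating every version from 3.11 upward as risky is an assumption extrapolated from the single failure logged here.

```python
import sys

KNOWN_GOOD = (3, 9)       # the version that worked in these notes
KNOWN_BAD_FROM = (3, 11)  # the version that failed; newer ones are assumed to fail too


def classify_python(version=None):
    """Label a Python version as 'known-good', 'at-risk', or 'untested'."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor >= KNOWN_BAD_FROM:
        return "at-risk"
    if major_minor == KNOWN_GOOD:
        return "known-good"
    # Versions not tried in these notes are reported as such, not guessed.
    return "untested"


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {classify_python()}")
```

If the label is "at-risk", creating the conda environment with python=3.9 as above is the route these notes confirmed.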

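# A fresh virtual environment is recommended
pip install -U pip setuptools wheel
pip install spleeter  # installs the CPU build of TensorFlow by default

# Install ffmpeg (needed to write mp3/flac and similar formats; Spleeter's default audio adapter also uses it to load audio)
# macOS: brew install ffmpeg
# Ubuntu: sudo apt-get install ffmpeg
# Windows: install ffmpeg and add the directory containing ffmpeg.exe to PATH

With an NVIDIA GPU: install a matching TensorFlow build instead (for example pip install tensorflow==2.12.* or pip install tensorflow[and-cuda], depending on your environment), and check that it is compatible with the Spleeter release you installed. For CPU-only use, the default install above is enough.

Verify the installation: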

python -c "from spleeter.separator import Separator;from spleeter.audio.adapter import AudioAdapter"
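
The import check above only shows that the modules load. When debugging version conflicts it can also help to print which versions actually ended up in the environment. A small sketch using the standard-library `importlib.metadata` follows; `installed_version` is a hypothetical helper name, not part of Spleeter.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The three packages pinned or installed in the steps above.
    for name in ("spleeter", "tensorflow", "numpy"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```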


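Option 1: One-line command (quickest start)

# Note: Spleeter 2.1 and later take input files as positional arguments; the older -i flag is no longer accepted.

# 2 stems (vocals + accompaniment)
spleeter separate -p spleeter:2stems -o output_dir input.mp3

# 4 stems (vocals, drums, bass, other)
spleeter separate -p spleeter:4stems -o output_dir input.mp3

# 5 stems (vocals, piano, bass, drums, other)
spleeter separate -p spleeter:5stems -o output_dir input.wav

The output is written as wav files under output_dir/<input file name>/.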
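
To make the output layout concrete, here is a small sketch that lists the files one would expect after a CLI run. It assumes the default layout described above (one folder per input file, one wav per stem); `expected_outputs` and `STEM_NAMES` are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Stem names produced by each pretrained model.
STEM_NAMES = {
    2: ["vocals", "accompaniment"],
    4: ["vocals", "drums", "bass", "other"],
    5: ["vocals", "drums", "bass", "piano", "other"],
}


def expected_outputs(input_file, output_dir, stems=2, codec="wav"):
    """Return the stem file paths expected under output_dir/<input name>/."""
    folder = Path(output_dir) / Path(input_file).stem
    return [folder / f"{name}.{codec}" for name in STEM_NAMES[stems]]


if __name__ == "__main__":
    for path in expected_outputs("input.mp3", "output_dir", stems=4):
        print(path.as_posix())
```

Checking that these files exist after a run is a quick way to confirm the separation finished.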


Option 2: Python script (batch processing, more flexible)

Single-file separation (vocals/accompaniment)

# coding=utf-8
# file: separate_simple.py
import os

from spleeter.separator import Separator
from spleeter.audio.adapter import AudioAdapter

# Work from the script's own directory so relative paths resolve predictably
current_dir = os.path.dirname(os.path.abspath(__file__))
os.chdir(current_dir)


def separate_to_files(input_path: str, output_dir: str, stems: int = 2, sample_rate: int = 44100):
    """
    Separate one audio file and write each stem to output_dir as wav.
    stems: 2 / 4 / 5
    """
    assert stems in (2, 4, 5)
    # Pick the model: 2stems / 4stems / 5stems
    separator = Separator(f'spleeter:{stems}stems')
    audio_loader = AudioAdapter.default()
    # Load the audio as a numpy array
    waveform, _ = audio_loader.load(input_path, sample_rate=sample_rate)
    # Run the separation; the result maps stem name -> waveform
    prediction = separator.separate(waveform)
    # Write every stem as wav: vocals + accompaniment for 2 stems,
    # vocals, drums, bass, (piano,) other for 4/5 stems
    for name, audio in prediction.items():
        audio_loader.save(os.path.join(output_dir, f"{name}.wav"), audio, sample_rate=sample_rate)


if __name__ == "__main__":
    out_dir = "out_song"
    os.makedirs(out_dir, exist_ok=True)
    # Example
    separate_to_files("/nas/lbg/project/tool_down/data_0815/volume_low/1_lang3.mp3", out_dir, stems=2)

Batch-processing a whole folder

# file: separate_batch.py
from pathlib import Path
from spleeter.separator import Separator
from spleeter.audio.adapter import AudioAdapter

AUDIO_SUFFIXES = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}

def batch_separate(input_dir: str, output_root: str, stems: int = 2, sample_rate: int = 44100):
    assert stems in (2, 4, 5)
    separator = Separator(f'spleeter:{stems}stems')   # initialize once and reuse for every file
    audio_loader = AudioAdapter.default()

    input_dir = Path(input_dir)
    for p in input_dir.rglob("*"):
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES:
            rel = p.relative_to(input_dir)
            out_dir = Path(output_root) / rel.with_suffix("")  # one folder per song
            out_dir.mkdir(parents=True, exist_ok=True)

            # Load and separate
            waveform, _ = audio_loader.load(str(p), sample_rate=sample_rate)
            prediction = separator.separate(waveform)

            # Save every stem (for 2 stems this already includes the accompaniment)
            for stem_name, stem_audio in prediction.items():
                audio_loader.save(str(out_dir / f"{stem_name}.wav"), stem_audio, sample_rate=sample_rate)

if __name__ == "__main__":
    # Example: separate every audio file under ./songs into ./separated
    batch_separate("./songs", "./separated", stems=2)
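
Long batch runs often get interrupted, so it can be useful to plan the work first and skip songs that already have output. The sketch below reuses the folder-walking logic of the batch script without touching Spleeter at all. `plan_batch` is a hypothetical helper, and treating the presence of vocals.wav as "already done" is an assumption that holds only if every model you use writes a vocals stem.

```python
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def plan_batch(input_dir, output_root, skip_done=True):
    """Yield (audio_path, out_dir) pairs, mirroring the input folder layout."""
    input_dir = Path(input_dir)
    for p in sorted(input_dir.rglob("*")):
        if not p.is_file() or p.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # One output folder per song, at the same relative location.
        out_dir = Path(output_root) / p.relative_to(input_dir).with_suffix("")
        # Assumption: an existing vocals.wav means this song was already separated.
        if skip_done and (out_dir / "vocals.wav").exists():
            continue
        yield p, out_dir


if __name__ == "__main__":
    for audio_path, out_dir in plan_batch("./songs", "./separated"):
        print(f"{audio_path} -> {out_dir}")
```

Note that two inputs differing only by extension (for example a.mp3 and a.wav in the same folder) map to the same output folder, both here and in the batch script above.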

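Choosing the device (CPU/GPU) and multithreading

from spleeter.separator import Separator

# CPU (the default)
sep_cpu = Separator('spleeter:2stems')  # or Separator('spleeter:4stems')

# GPU (requires a correctly installed GPU build of TensorFlow plus CUDA/cuDNN).
# The constructor call is the same: Spleeter does not select a device itself,
# TensorFlow decides. (Passing params_descriptor=None in addition to the
# positional model name would raise a TypeError.)
sep_gpu = Separator('spleeter:2stems')

# To see whether TensorFlow detects a GPU:
# import tensorflow as tf; tf.config.list_physical_devices('GPU')
# If no GPU is detected, it falls back to the CPU automatically.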
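
Since the device choice is left to TensorFlow, the usual way to force CPU-only execution is to hide the CUDA devices through the CUDA_VISIBLE_DEVICES environment variable before TensorFlow is imported. The following is a sketch under that assumption; `force_cpu` and `describe_devices` are hypothetical helper names, and the TensorFlow import is guarded so the snippet also runs where TensorFlow is absent.

```python
import os


def force_cpu():
    """Hide all CUDA devices; must be called before tensorflow is imported."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def describe_devices():
    """Report how many GPUs TensorFlow can see, if TensorFlow is available."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed"
    gpus = tf.config.list_physical_devices("GPU")
    return f"{len(gpus)} GPU(s) visible" if gpus else "no GPU visible, running on CPU"


if __name__ == "__main__":
    # Uncomment to force CPU even when a GPU is present:
    # force_cpu()
    print(describe_devices())
```

Call force_cpu() at the very top of a script, before importing spleeter, because spleeter imports TensorFlow itself.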


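Troubleshooting

  • Exporting mp3 fails: check that ffmpeg is installed on the system and on PATH; otherwise save as wav.

  • Out of GPU memory / slow: 2stems is faster and lighter; alternatively lower the sample rate (sample_rate=32000, though the pretrained models target 44.1 kHz audio, so quality may drop) or run on CPU only.

  • TensorFlow/CUDA version errors: align the TensorFlow version with the CUDA/cuDNN versions; if you only need CPU, uninstall the GPU-related packages.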
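
For the ffmpeg item above, a quick way to confirm the binaries are reachable is to look them up on PATH with the standard library. This is a minimal sketch; `missing_tools` is a hypothetical helper, and checking ffprobe alongside ffmpeg is an assumption based on the two usually being installed together.

```python
import shutil


def missing_tools(tools=("ffmpeg", "ffprobe")):
    """Return the tools from the given list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and ffprobe are available")
```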
