Detailed Steps for Deploying the PaddleSpeech AI Speech Model to Ascend NPU

PaddleSpeech is an open-source speech processing toolkit released by PaddlePaddle. It provides complete end-to-end speech processing solutions, including automatic speech recognition (ASR), text-to-speech (TTS), speech enhancement, and speech translation.

https://github.com/PaddlePaddle/PaddleSpeech

I. Verification on a Huawei Kunpeng CPU

1. Purchase a Huawei Cloud Virtual Private Cloud (VPC) and Elastic Cloud Server (ECS)

For the detailed procedure, refer to the CSDN blog post "创建华为云弹性云服务器ECS流程".

If you already have these resources, skip this step.

ECS configuration summary:

Basic configuration
• Billing mode: Pay-per-use
• Region/AZ: CN North-Beijing4 | Randomly assigned

Instance
• Flavor: Kunpeng general computing-plus | 8 vCPUs | 16 GiB | kc1.2xlarge.2

Operating system
• Image: Huawei Cloud EulerOS 2.0 64bit for kAi2p with HDK 23.0.1 and CANN 7.0.0.1 RC

Storage and backup
• System disk: General Purpose SSD, 200 GiB

Network
• VPC: Select an existing VPC
• Source/destination check: Enabled

Security group
• default

Public network access
• EIP: Dynamic BGP | Billed by traffic | 1 Mbit/s

ECS management
• Server name: Custom
• Login credential: Password

Leave everything else at the defaults.

2. Environment Setup

The hardest part of using PaddleSpeech is configuring the environment and installing it. The README and setup.py have not been maintained for a long time, so many of the download URLs and dependency versions they reference are deprecated or broken, and questions in the Issues rarely get useful answers. I ran into all kinds of problems during installation and compilation and could only resolve them by combining the error messages with the discussions in the Issues.

The following environment configuration has been verified to work.

Clone the code from GitHub.

git clone https://github.com/PaddlePaddle/PaddleSpeech.git
cd PaddleSpeech

Create a Python 3.9 environment with conda.

conda create --name PaddleSpeech python=3.9
conda activate PaddleSpeech

The README recommends paddlepaddle<=2.5.1, but 2.5.1 is no longer available. Version 2.4.2 also works; however, since we will later deploy to the Ascend NPU, install the NPU-compatible 3.0.0b2 version instead.

python -m pip install paddlepaddle==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
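
Optionally, a quick check that the CPU build of paddlepaddle imports and runs:

python -c "import paddle; print(paddle.__version__); paddle.utils.run_check()"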

To match this paddlepaddle version, use the develop branch of PaddleSpeech and install the project from source. First install the conda dependencies.

conda install -y -c conda-forge sox libsndfile swig bzip2

If you see an error like:

Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - sox

it means the conda channels do not provide a suitable version of the package; install it with pip instead.

pip install sox

Install the remaining dependencies and build the project.

pip install pytest-runner
# Make sure you are in the root directory of the PaddleSpeech project
pip install .

This build step produces all kinds of errors: many of the dependency versions pinned in setup.py are out of date. Based on the error messages, change them to unpinned versions for now. For example:

ERROR: Ignored the following versions that require a different python version: 1.14.0 Requires-Python >=3.10; 1.14.0rc1 Requires-Python >=3.10; 1.14.0rc2 Requires-Python >=3.10; 1.14.1 Requires-Python >=3.10; 2.1.0 Requires-Python >=3.10; 2.1.0rc1 Requires-Python >=3.10; 2.1.1 Requires-Python >=3.10; 2.1.2 Requires-Python >=3.10; 2.1.3 Requires-Python >=3.10; 3.10.0rc1 Requires-Python >=3.10
ERROR: Could not find a version that satisfies the requirement opencc==1.1.6 (from paddlespeech) (from versions: 0.1, 0.2, 1.1.8, 1.1.9)
ERROR: No matching distribution found for opencc==1.1.6
In the base = [...] list of setup.py:
    opencc==1.1.6      → opencc
    paddleaudio>=1.1.0 → paddleaudio
    ...
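
If you prefer not to edit setup.py by hand, the same pins can be relaxed with sed. This is only a sketch that assumes the pinned strings appear verbatim in setup.py; adapt it to whichever packages your own errors mention:

# run from the PaddleSpeech root directory
sed -i 's/opencc==1.1.6/opencc/' setup.py
sed -i 's/paddleaudio>=1.1.0/paddleaudio/' setup.py
pip install .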

After these changes the build succeeds.

3. Download a Sample Audio File

PaddleSpeech works with 16 kHz WAV audio. A test sample is provided, or you can use your own recording (see the resampling sketch below):

wget https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
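
If you use your own recording instead, it may need to be converted to 16 kHz mono WAV first. A small sketch with sox (installed above); my_audio.wav stands in for your own file:

sox my_audio.wav -r 16000 -c 1 -b 16 my_audio_16k.wav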

4. Run the Code

Create a file named run_asr.py to perform speech recognition.

import paddle
from paddlespeech.cli.asr.infer import ASRExecutor
from paddlespeech.cli.text.infer import TextExecutor

print(f"Current device: {paddle.device.get_device()}")

# Speech recognition
asr = ASRExecutor()
asr_result = asr(audio_file="zh.wav")
print(asr_result)

# Punctuation restoration
text_punc = TextExecutor()
result = text_punc(text=asr_result)
print(result)

Running it reports a missing dependency, kaldiio.

pip install kaldiio

Run it again: although some errors are printed, the model performs inference and produces output normally. The first run takes a while because the models need to be downloaded.

which: no ccache in (/root/miniconda3/envs/PaddleSpeech/bin:/root/miniconda3/condabin:/usr/local/Ascend/ascend-toolkit/latest/bin:/usr/local/Ascend/ascend-toolkit/latest/compiler/ccec_compiler/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
/root/miniconda3/envs/PaddleSpeech/lib/python3.9/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
  warnings.warn(warning_message)
/root/miniconda3/envs/PaddleSpeech/lib/python3.9/site-packages/_distutils_hack/__init__.py:31: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
Current device: cpu
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 499M/499M [00:34<00:00, 14.3MB/s]
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders
2024-11-21 11:13:24.766 | INFO     | paddlespeech.s2t.modules.ctc:<module>:45 - paddlespeech_ctcdecoders not installed!
W1121 11:13:24.876871 388779 dygraph_functions.cc:83253] got different data type, run type promotion automatically, this may cause data type been changed.
2024-11-21 11:13:24.943 | INFO     | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
我认为跑步最重要的就是给我带来了身体健康
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 438M/438M [00:38<00:00, 11.5MB/s]
[2024-11-21 11:14:26,790] [    INFO] - Loading configuration file /root/.paddlespeech/models/ernie_linear_p7_wudao-punc-zh/1.0/ernie_linear_p7_wudao-punc-zh.tar/ckpt/model_config.json
[2024-11-21 11:14:26,791] [    INFO] - Loading weights file /root/.paddlespeech/models/ernie_linear_p7_wudao-punc-zh/1.0/ernie_linear_p7_wudao-punc-zh.tar/ckpt/model_state.pdparams
[2024-11-21 11:14:29,396] [    INFO] - Loaded weights file from disk, setting weights to model.
[2024-11-21 11:14:38,135] [    INFO] - All model checkpoint weights were used when initializing ErnieForTokenClassification.

[2024-11-21 11:14:38,135] [    INFO] - All the weights of ErnieForTokenClassification were initialized from the model checkpoint at /root/.paddlespeech/models/ernie_linear_p7_wudao-punc-zh/1.0/ernie_linear_p7_wudao-punc-zh.tar/ckpt.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ErnieForTokenClassification for predictions without further training.
[2024-11-21 11:14:38,158] [    INFO] - tokenizer config file saved in /root/.paddlenlp/models/ernie-1.0/tokenizer_config.json
[2024-11-21 11:14:38,159] [    INFO] - Special tokens file saved in /root/.paddlenlp/models/ernie-1.0/special_tokens_map.json
我认为,跑步最重要的,就是给我带来了身体健康。

Although they do not affect the results, it is still worth analyzing and resolving these warnings:

  • which: no ccache in (/root/miniconda3/envs/PaddleSpeech/bin:/root/miniconda3/condabin:/usr/local/Ascend/ascend-toolkit/latest/bin:/usr/local/Ascend/ascend-toolkit/latest/compiler/ccec_compiler/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
    /root/miniconda3/envs/PaddleSpeech/lib/python3.9/site-packages/paddle/utils/cpp_extension/extension_utils.py:686: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
      warnings.warn(warning_message)

    This indicates that ccache was not found in the current environment. ccache is a compilation cache tool that can significantly speed up recompilation. If you do not mind recompiling all source files from scratch, this warning can be ignored; if you want faster builds, install ccache as suggested:

    conda install -c conda-forge ccache
  • /root/miniconda3/envs/PaddleSpeech/lib/python3.9/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
      warnings.warn(

    This warning means that setuptools is replacing distutils and that this replacement may fail in future versions. The setuptools project recommends resolving it by upgrading setuptools.

    WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
    Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
    To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.

    This warning means pip is being invoked through an old script wrapper, which will fail in a future pip version. The recommendation is to invoke pip via python -m pip instead.

    python -m pip install --upgrade setuptools

    After upgrading setuptools as suggested, the warning still appears.

  • ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
    ERROR: No matching distribution found for paddlespeech_ctcdecoders
    This error means that no version of paddlespeech_ctcdecoders satisfying the requirement can be found, and pip cannot install a suitable one either. Feedback in the Issues suggests this package is not strictly required, but since its source is included in the repository, it is worth building it manually (you can verify the result with the quick check after this list):
    cd third_party/ctc_decoders
    bash setup.sh
    pip install --user .
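
After the manual build, a quick import check (assuming the module name matches the one shown in the log message above):

python -c "import paddlespeech_ctcdecoders; print('paddlespeech_ctcdecoders OK')"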

Re-run run_asr.py to verify the fixes.

5. Build a Local Demo Application

5.1 Check the Elastic Public IP and Open the Port

For the detailed procedure, refer to the CSDN blog post "查看华为云ECS弹性公网IP和设置开放端口".

This gives the address used to access the web application; here it is http://113.44.138.39:7863/.

5.2 Build a Web UI with Gradio

Install the gradio library.

pip install gradio

Create a file named gradio_show.py with the following content:

import gradio as gr
import paddle
from paddlespeech.cli.asr.infer import ASRExecutor
from paddlespeech.cli.text.infer import TextExecutor

print(f"Current device: {paddle.device.get_device()}")

# Initialize the ASR and punctuation executors
text_punc = TextExecutor()
asr = ASRExecutor()

def process_audio(audio_file):
    # Speech recognition
    asr_result = asr(audio_file=audio_file)
    print(f"ASR Result: {asr_result}")
    
    # Restore punctuation in the recognized text
    result = text_punc(text=asr_result)
    print(f"Punctuation Result: {result}")
    
    return asr_result, result

# Create the Gradio interface
demo = gr.Interface(
    fn=process_audio,
    inputs=gr.Audio(type="filepath"),  # input: path to an audio file
    outputs=[
        gr.Textbox(label="ASR Result"),  # ASR output text
        gr.Textbox(label="Punctuation Result")  # punctuated output text
    ],
    title="Speech Recognition and Punctuation Restoration",
    description="Upload an audio file to run speech recognition and add punctuation."
)

# Launch the Gradio app
if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7863)

Run gradio_show.py with nohup, then open http://113.44.138.39:7863/ in a browser to access the service page, where you can upload an audio file for speech recognition:

nohup python gradio_show.py &
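
To confirm that the service came up, you can tail the log and probe the port locally. A small sketch (nohup writes to nohup.out by default; curl is assumed to be available):

tail -n 50 nohup.out
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:7863/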

II. Deploying to the Huawei Ascend NPU

Ascend environment:

Chip: Ascend 910B3
CANN version: CANN 7.0.1.5
Driver version: 23.0.6
OS: Huawei Cloud EulerOS 2.0

1. Check the NPU Hardware
npu-smi info


If the Health status shows OK, the NPU and CANN are working properly.

2. Run the Code in the NPU Environment
2.1 Install the PaddlePaddle NPU Plugin

Refer to the "开始使用" (Get Started) guide on the PaddlePaddle official site.

The Ascend NPU adaptation of the PaddlePaddle framework currently only supports CANN 8.0.RC1, which would require upgrading the CANN/driver version on the host, so we use the official Docker image instead.

Pull the Ascend NPU development image officially released by PaddlePaddle.

docker pull registry.baidubce.com/device/paddle-npu:cann80RC1-ubuntu20-aarch64-gcc84-py39

Start the container, specifying which NPU cards are visible.

docker run -it --name paddlespeech -v $(pwd):/work \
    --privileged --network=host --shm-size=128G -w=/work \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -e ASCEND_RT_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" \
    registry.baidubce.com/device/paddle-npu:cann80RC1-ubuntu20-$(uname -m)-gcc84-py39 /bin/bash
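
If you exit the container later, you can reattach to the same environment instead of recreating it (using the container name paddlespeech from the command above):

docker start paddlespeech
docker exec -it paddlespeech /bin/bash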

Install the PaddlePaddle NPU plugin packages.

python -m pip install paddlepaddle==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
python -m pip install paddle-custom-npu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/npu/

Check the currently installed versions.

python -c "import paddle_custom_device; paddle_custom_device.npu.version()"

2.2 Set Up the Environment Following the ECS Procedure

Follow the same steps as the ECS environment setup above.

One difference: install the conda dependencies directly with the system package manager instead:

apt-get install -y sox libsndfile1 swig bzip2 ccache

2.3 Run the Code

Run run_asr.py:

λ devserver-314b-1 /work/PaddleSpeech {develop} python run_asr.py 
I1122 17:53:15.621927    46 init.cc:236] ENV [CUSTOM_DEVICE_ROOT]=/usr/local/lib/python3.9/dist-packages/paddle_custom_device
I1122 17:53:15.621981    46 init.cc:145] Try loading custom device libs from: [/usr/local/lib/python3.9/dist-packages/paddle_custom_device]
I1122 17:53:16.312119    46 custom_device.cc:1099] Succeed in loading custom runtime in lib: /usr/local/lib/python3.9/dist-packages/paddle_custom_device/libpaddle-custom-npu.so
I1122 17:53:16.319656    46 custom_kernel.cc:63] Succeed in loading 357 custom kernel(s) from loaded lib(s), will be used like native ones.
I1122 17:53:16.319842    46 init.cc:157] Finished in LoadCustomDevice with libs_path: [/usr/local/lib/python3.9/dist-packages/paddle_custom_device]
I1122 17:53:16.319882    46 init.cc:242] CustomDevice: npu, visible devices count: 1
/usr/local/lib/python3.9/dist-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
W1122 17:53:18.899689    46 ir_context.cc:305] custom_op.my_add_n op already registered.
W1122 17:53:18.899780    46 custom_operator.cc:969] Operator (my_add_n) has been registered.
W1122 17:53:18.899803    46 ir_context.cc:305] custom_op.update_inputs_ op already registered.
W1122 17:53:18.899811    46 custom_operator.cc:969] Operator (update_inputs) has been registered.
W1122 17:53:18.899827    46 ir_context.cc:305] custom_op.get_output_ op already registered.
W1122 17:53:18.899835    46 custom_operator.cc:969] Operator (get_output) has been registered.
W1122 17:53:18.899847    46 ir_context.cc:305] custom_op.set_stop_value_multi_ends op already registered.
W1122 17:53:18.899855    46 custom_operator.cc:969] Operator (set_stop_value_multi_ends) has been registered.
W1122 17:53:18.899875    46 ir_context.cc:305] custom_op.fused_blha_layer_op op already registered.
W1122 17:53:18.899883    46 custom_operator.cc:969] Operator (fused_blha_layer_op) has been registered.
W1122 17:53:18.899893    46 ir_context.cc:305] custom_op.remove_padding op already registered.
W1122 17:53:18.899900    46 custom_operator.cc:969] Operator (remove_padding) has been registered.
W1122 17:53:18.899911    46 ir_context.cc:305] custom_op.fused_get_rotary_embedding op already registered.
W1122 17:53:18.899920    46 custom_operator.cc:969] Operator (fused_get_rotary_embedding) has been registered.
W1122 17:53:18.899931    46 ir_context.cc:305] custom_op.fused_rope op already registered.
W1122 17:53:18.899940    46 ir_context.cc:305] custom_op.fused_rope_grad op already registered.
W1122 17:53:18.899947    46 custom_operator.cc:969] Operator (fused_rope) has been registered.
W1122 17:53:18.899957    46 ir_context.cc:305] custom_op.rms_norm_npu op already registered.
W1122 17:53:18.899967    46 ir_context.cc:305] custom_op.rms_norm_npu_grad op already registered.
W1122 17:53:18.899976    46 custom_operator.cc:969] Operator (rms_norm_npu) has been registered.
W1122 17:53:18.899988    46 ir_context.cc:305] custom_op.fused_mm_reduce_scatter op already registered.
W1122 17:53:18.899997    46 custom_operator.cc:969] Operator (fused_mm_reduce_scatter) has been registered.
W1122 17:53:18.900010    46 ir_context.cc:305] custom_op.fused_allgather_mm op already registered.
W1122 17:53:18.900018    46 custom_operator.cc:969] Operator (fused_allgather_mm) has been registered.
W1122 17:53:18.900029    46 ir_context.cc:305] custom_op.rebuild_padding_v2 op already registered.
W1122 17:53:18.900038    46 custom_operator.cc:969] Operator (rebuild_padding_v2) has been registered.
W1122 17:53:18.900050    46 ir_context.cc:305] custom_op.flash_attention_npu op already registered.
W1122 17:53:18.900063    46 ir_context.cc:305] custom_op.flash_attention_npu_grad op already registered.
W1122 17:53:18.900069    46 custom_operator.cc:969] Operator (flash_attention_npu) has been registered.
W1122 17:53:18.900079    46 ir_context.cc:305] custom_op.write_cache_kv_ op already registered.
W1122 17:53:18.900086    46 custom_operator.cc:969] Operator (write_cache_kv) has been registered.
W1122 17:53:18.900095    46 ir_context.cc:305] custom_op.rebuild_padding op already registered.
W1122 17:53:18.900102    46 custom_operator.cc:969] Operator (rebuild_padding) has been registered.
W1122 17:53:18.900112    46 ir_context.cc:305] custom_op.get_padding_offset op already registered.
W1122 17:53:18.900120    46 custom_operator.cc:969] Operator (get_padding_offset) has been registered.
W1122 17:53:18.900132    46 ir_context.cc:305] custom_op.fused_mm_allreduce op already registered.
W1122 17:53:18.900142    46 custom_operator.cc:969] Operator (fused_mm_allreduce) has been registered.
W1122 17:53:18.900151    46 ir_context.cc:305] custom_op.get_token_penalty_multi_scores_v2_ op already registered.                                                                                                                
W1122 17:53:18.900159    46 custom_operator.cc:969] Operator (get_token_penalty_multi_scores_v2) has been registered.
W1122 17:53:18.900171    46 ir_context.cc:305] custom_op.get_padding_offset_v2 op already registered.
W1122 17:53:18.900178    46 custom_operator.cc:969] Operator (get_padding_offset_v2) has been registered.
W1122 17:53:18.900192    46 ir_context.cc:305] custom_op.lm_head op already registered.
W1122 17:53:18.900199    46 custom_operator.cc:969] Operator (lm_head) has been registered.
W1122 17:53:18.900211    46 ir_context.cc:305] custom_op.quant_int8 op already registered.
W1122 17:53:18.900219    46 custom_operator.cc:969] Operator (quant_int8) has been registered.
W1122 17:53:18.900231    46 ir_context.cc:305] custom_op.qkv_transpose_split op already registered.
W1122 17:53:18.900238    46 custom_operator.cc:969] Operator (qkv_transpose_split) has been registered.
W1122 17:53:18.900249    46 ir_context.cc:305] custom_op.save_with_output op already registered.
W1122 17:53:18.900256    46 custom_operator.cc:969] Operator (save_with_output) has been registered.
W1122 17:53:18.900266    46 ir_context.cc:305] custom_op.set_value_by_flags_and_idx op already registered.
W1122 17:53:18.900274    46 custom_operator.cc:969] Operator (set_value_by_flags_and_idx) has been registered.
W1122 17:53:18.900285    46 ir_context.cc:305] custom_op.save_output_ op already registered.
W1122 17:53:18.900293    46 custom_operator.cc:969] Operator (save_output) has been registered.
W1122 17:53:18.900305    46 ir_context.cc:305] custom_op.set_value_by_flags_and_idx_v2_ op already registered.
W1122 17:53:18.900314    46 custom_operator.cc:969] Operator (set_value_by_flags_and_idx_v2) has been registered.
W1122 17:53:18.900324    46 ir_context.cc:305] custom_op.dequant_int8 op already registered.
W1122 17:53:18.900333    46 custom_operator.cc:969] Operator (dequant_int8) has been registered.
W1122 17:53:18.900343    46 ir_context.cc:305] custom_op.get_token_penalty_multi_scores op already registered.
W1122 17:53:18.900350    46 custom_operator.cc:969] Operator (get_token_penalty_multi_scores) has been registered.                                                                                                                
W1122 17:53:18.900362    46 ir_context.cc:305] custom_op.step_paddle_ op already registered.
W1122 17:53:18.900370    46 custom_operator.cc:969] Operator (step_paddle) has been registered.
W1122 17:53:18.900379    46 ir_context.cc:305] custom_op.set_stop_value_multi_ends_v2_ op already registered.
W1122 17:53:18.900388    46 custom_operator.cc:969] Operator (set_stop_value_multi_ends_v2) has been registered.
W1122 17:53:18.900398    46 ir_context.cc:305] custom_op.write_int8_cache_kv_ op already registered.
W1122 17:53:18.900404    46 custom_operator.cc:969] Operator (write_int8_cache_kv) has been registered.
W1122 17:53:18.900415    46 ir_context.cc:305] custom_op.encode_rotary_qk_ op already registered.
W1122 17:53:18.900422    46 custom_operator.cc:969] Operator (encode_rotary_qk) has been registered.
W1122 17:53:18.900431    46 ir_context.cc:305] custom_op.transpose_remove_padding op already registered.
W1122 17:53:18.900439    46 custom_operator.cc:969] Operator (transpose_remove_padding) has been registered.
Current device: npu:0
W1122 17:53:40.505782    46 dygraph_functions.cc:83253] got different data type, run type promotion automatically, this may cause data type been changed.
2024-12-06 17:53:41.525 | INFO     | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
[2024-12-06 17:53:45,123] [CRITICAL] transformation.py:149 - Catch a exception from 0th func: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=1.0))
Traceback (most recent call last):
  File "/work/PaddleSpeech/run_asr.py", line 10, in <module>
    asr_result = asr(audio_file="zh.wav")
  File "/work/PaddleSpeech/paddlespeech/cli/utils.py", line 328, in _warpper
    return executor_func(self, *args, **kwargs)
  File "/work/PaddleSpeech/paddlespeech/cli/asr/infer.py", line 510, in __call__
    self.preprocess(model, audio_file)
  File "/work/PaddleSpeech/paddlespeech/cli/asr/infer.py", line 275, in preprocess
    audio = preprocessing(audio, **preprocess_args)
  File "/work/PaddleSpeech/paddlespeech/audio/transform/transformation.py", line 147, in __call__
    xs = [func(x, **_kwargs) for x in xs]
  File "/work/PaddleSpeech/paddlespeech/audio/transform/transformation.py", line 147, in <listcomp>
    xs = [func(x, **_kwargs) for x in xs]
  File "/work/PaddleSpeech/paddlespeech/audio/transform/spectrogram.py", line 372, in __call__
    mat = kaldi.fbank(
  File "/usr/local/lib/python3.9/dist-packages/paddleaudio/compliance/kaldi.py", line 462, in fbank
    strided_input, signal_log_energy = _get_window(
  File "/usr/local/lib/python3.9/dist-packages/paddleaudio/compliance/kaldi.py", line 170, in _get_window
    offset_strided_input = paddle.nn.functional.pad(
  File "/usr/local/lib/python3.9/dist-packages/paddle/nn/functional/common.py", line 2036, in pad
    out = _C_ops.pad3d(x, pad, mode, value, data_format)
NotImplementedError: (Unimplemented) npu npu only support mode=constant right now,but received mode is replicate .                                                                                                                
  [Hint: Expected mode == "constant", but received mode:replicate != "constant":constant.] (at /paddle/backends/npu/kernels/pad3d_kernel.cc:43)

Following the hint, modify the source file /usr/local/lib/python3.9/dist-packages/paddleaudio/compliance/kaldi.py at line 170 in _get_window:

169     if preemphasis_coefficient != 0.0:
170         if paddle.device.get_device().startswith('npu'): 
171             mode = 'constant'
172         else:
173             mode = 'replicate'
174         offset_strided_input = paddle.nn.functional.pad(
175             strided_input.unsqueeze(0), (1, 0),
176             data_format='NCL',
177             mode=mode).squeeze(0)  # (m, window_size + 1)                                                    
178         strided_input = strided_input - preemphasis_coefficient * offset_strided_input[:, :
179                                                                                        -1]
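
A minimal sketch (assuming the container environment above, with the NPU visible as npu:0) to verify that constant-mode padding, which the patch falls back to, runs on the NPU:

import paddle
import paddle.nn.functional as F

paddle.device.set_device("npu:0")
x = paddle.ones([4, 1, 16])   # dummy (m, C, window_size)-shaped input
# Only mode='constant' is implemented by the NPU pad3d kernel at the moment
y = F.pad(x, [1, 0], mode="constant", data_format="NCL")
print(y.shape)                # expect [4, 1, 17]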

Run run_asr.py again; this time it runs on the NPU and produces the recognition result.
