Fixing the error ValueError (InvalidArgument) Broadcast dimension mismatch when using PaddleSpeech

Latest recommended article published on 2024-05-27 16:57:47

阿莫、


Tags: python, speech recognition, paddlepaddle

Article link: https://blog.csdn.net/qq_45897239/article/details/136573149


For environment setup, see: Using PaddleSpeech on the Baidu AiStudio platform


1. Runtime environment

paddle-bfloat               0.1.7
paddle2onnx                 1.1.0
paddleaudio                 1.1.0
paddlefsl                   1.1.0
paddlenlp                   2.5.2
paddlepaddle                2.4.2
paddlesde                   0.2.5
paddleslim                  2.6.0
paddlespeech                1.4.1
paddlespeech-ctcdecoders    0.2.1
paddlespeech-feat           0.1.0

ppdiffusers                 0.19.4
Python                      3.8.18

Note: all tests were run on the Baidu AiStudio platform.
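To compare another environment against the list above, the installed versions can be read programmatically. The following is a minimal sketch using the standard-library `importlib.metadata` (available from Python 3.8); the helper name `installed_versions` and the selection of packages in `PACKAGES` are assumptions made for illustration, not part of PaddleSpeech.

```python
from importlib import metadata

# Hypothetical selection: a few of the packages from the list above.
PACKAGES = ["paddlepaddle", "paddlespeech", "paddleaudio", "paddlenlp"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<28}{version or 'not installed'}")
```

A package that is missing is reported as "not installed" instead of raising an exception, so the script can be run in a partially set up environment.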

2. The following code fails when run from the terminal

from paddlespeech.cli.asr.infer import ASRExecutor
asr = ASRExecutor()
result = asr(audio_file="zh.wav")
print(result)

Note: the code above is the Python example for speech recognition.

Code source: https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/README_cn.md

3. Error details

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/phi/kernels/funcs/common_shape.h:84)

Traceback (most recent call last):
  File "test1.py", line 3, in <module>
    result = asr(audio_file="zh.wav")
  File "/home/aistudio/PaddleSpeech/paddlespeech/cli/utils.py", line 328, in _warpper
    return executor_func(self, *args, **kwargs)
  File "/home/aistudio/PaddleSpeech/paddlespeech/cli/asr/infer.py", line 512, in __call__
    res = self.postprocess()  # Retrieve result of asr.
  File "/home/aistudio/PaddleSpeech/paddlespeech/cli/asr/infer.py", line 335, in postprocess
    return self._outputs["result"]
KeyError: 'result'
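The condition quoted in the hint (`x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1`) can be replayed in plain Python to see why axis 3 is the one reported. This is a sketch of that rule only, not Paddle's actual implementation; the function name `broadcast_mismatch` is hypothetical, and left-padding the shorter shape with 1s is the usual broadcasting convention, assumed here.

```python
def broadcast_mismatch(x_shape, y_shape):
    """Return (axis, x_dim, y_dim) for the first incompatible axis, or None."""
    rank = max(len(x_shape), len(y_shape))
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    x = [1] * (rank - len(x_shape)) + list(x_shape)
    y = [1] * (rank - len(y_shape)) + list(y_shape)
    for axis, (x_dim, y_dim) in enumerate(zip(x, y)):
        # Same test as in the error hint: equal, or one side is <= 1.
        if not (x_dim == y_dim or x_dim <= 1 or y_dim <= 1):
            return axis, x_dim, y_dim
    return None


# Shapes taken from the error message above.
print(broadcast_mismatch([1, 1, 0, 498], [1, 123, 123]))  # (3, 498, 123)
```

Axis 2 passes because the 0 in X satisfies the `<= 1` branch, so the first failing axis is 3, where 498 meets 123. Note also that the traceback ends in `KeyError: 'result'`: the broadcast error occurred first, no result was stored, and `postprocess()` then failed when reading it.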

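4. Attempted fixes

Searching online, I found some users attributing the error to version problems such as conflicting dependencies. Changing package versions as suggested did not fix it.

Reference: https://github.com/PaddlePaddle/PaddleSpeech/issues/3246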

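5. Solution (specify the model)

The same problem appeared when using the video subtitle generation feature:

Reference (video subtitle generation): https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/automatic_video_subtitiles/README_cn.md

Later, after reading the detailed documentation for the speech recognition feature, I found that the following code runs without errors: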

import paddle
from paddlespeech.cli.asr import ASRExecutor

asr_executor = ASRExecutor()
text = asr_executor(
    model='conformer_wenetspeech',
    lang='zh',
    sample_rate=16000,
    config=None,  # Set `config` and `ckpt_path` to None to use pretrained model.
    ckpt_path=None,
    audio_file='./zh.wav',
    force_yes=False,
    device=paddle.get_device())
print('ASR Result: \n{}'.format(text))

Reference (speech recognition): https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md

Testing showed that the model must also be specified explicitly when calling asr_executor().
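The working call above passes `sample_rate=16000`, so it can be useful to confirm that the input file really has that rate before running recognition. The sketch below reads the WAV header with the standard-library `wave` module; the helper names `wav_info` and `matches_sample_rate` are hypothetical. A rate mismatch is not the cause of the error discussed in this article; this is only a quick sanity check.

```python
import wave


def wav_info(path):
    """Read basic header fields from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
        }


def matches_sample_rate(path, expected=16000):
    """Return True if the file's sample rate equals the expected rate."""
    return wav_info(path)["sample_rate"] == expected
```

For example, `matches_sample_rate("./zh.wav")` checks the file used in this article against the 16000 Hz passed to the executor.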

Create a Python file for testing, with the following code:

from paddlespeech.cli.asr.infer import ASRExecutor
asr = ASRExecutor()
result = asr(audio_file="zh.wav", model='conformer_wenetspeech')
print(result)

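Then run the file from the terminal.

Likewise, the video subtitle generation feature no longer raised the error in testing once the model was specified.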
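Since the fix comes down to always passing `model`, one option is a small wrapper that supplies it by default. This is a sketch, not part of PaddleSpeech: the function `transcribe`, the constant `DEFAULT_MODEL`, and the `executor` parameter are hypothetical names introduced here. Only the `ASRExecutor` import and the `audio_file` and `model` keyword arguments come from the code above.

```python
DEFAULT_MODEL = "conformer_wenetspeech"  # the model name used in this article


def transcribe(audio_file, model=DEFAULT_MODEL, executor=None):
    """Run ASR on audio_file, always passing an explicit model name.

    `executor` is a hypothetical hook: any callable that accepts
    `audio_file` and `model` keyword arguments. When omitted, a
    PaddleSpeech ASRExecutor is created.
    """
    if executor is None:
        # Imported lazily so this module loads even without PaddleSpeech.
        from paddlespeech.cli.asr.infer import ASRExecutor
        executor = ASRExecutor()
    return executor(audio_file=audio_file, model=model)
```

With PaddleSpeech installed, `print(transcribe("zh.wav"))` is then equivalent to the test file above, and the model can no longer be omitted by accident.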
