Connecting a ROS Robot Car to PaddleSpeech Speech Recognition

绯虹剑心

Last modified 2023-09-13 17:11:59

Category: Internet of Things. Tags: speech recognition, artificial intelligence

First published 2023-08-15 19:12:24

Article link: https://blog.csdn.net/weixin_43422012/article/details/132293027

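This article describes how to connect a ROS car to the PaddleSpeech speech recognition system, including the specific PaddleSpeech version and Python environment it requires. It also covers debugging audio_common: installation, configuring PulseAudio devices, and noise reduction. NoiseTorch is used for noise suppression, and problems encountered when using GStreamer with PulseAudio are resolved. Finally, parameters are tuned to improve recognition results.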

paddlespeech

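Installed versions

paddlepaddle 2.5.1
paddlespeech: use the latest code from the develop branch, built and installed locally with pip install -e . -i https://mirror.baidu.com/pypi/simple. The latest release has an unfixed bug and raises an error. git clone often fails during the build; a proxy such as clash helps.
python 3.8 (python 3.10 does not support _PyGen_Send, so the build fails there)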
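
Because the version combination above matters, it can help to check the interpreter and the installed paddlepaddle package before building. The following is a minimal sketch, not part of the original setup: the helper names are hypothetical, and the expected values are simply the ones reported in this post, not limits enforced by the libraries.

```python
import sys
from importlib import metadata

# Versions reported in this post; treat them as a known-good combination,
# not as hard limits enforced by the libraries themselves.
EXPECTED_PYTHON = (3, 8)
EXPECTED_PACKAGES = {"paddlepaddle": "2.5.1"}


def python_matches(version_info=sys.version_info, expected=EXPECTED_PYTHON):
    """Return True when the interpreter's major.minor equals the expected pair."""
    return tuple(version_info[:2]) == tuple(expected)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect readable warnings for anything that differs from the post."""
    warnings = []
    if not python_matches():
        found = "%d.%d" % tuple(sys.version_info[:2])
        warnings.append("Python %s found, post used 3.8" % found)
    for package, wanted in EXPECTED_PACKAGES.items():
        found = installed_version(package)
        if found != wanted:
            warnings.append("%s: found %s, post used %s" % (package, found, wanted))
    return warnings
```

Printing the list returned by check_environment() shows any mismatch; an empty list means the environment matches the versions listed here.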

Usage

paddlespeech tts --input "唤醒词，小微小微，小车前进，小车后退，小车左转，小车右转，小车停，小车休眠，小车过来，小车去I点，小车去J点，小车去K点，小车雷达跟随，关闭雷达跟随，小车色块跟随，关闭色块跟随，打开自主建图，关闭自主建图，开始导航，关闭导航" --output output.wav
paddlespeech asr --lang zh --input output.wav
paddlespeech whisper --help
paddlespeech whisper --size tiny --language Chinese --task transcribe --input zh.wav
paddlespeech whisper --size small --language Chinese --task transcribe --input output.wav

# Start the HTTP server so that its RESTful API can be called
cp ../git/PaddleSpeech/demos/speech_server/conf/application.yaml ./
# Edit application.yaml and change the port to 80
paddlespeech_server start --config_file application.yaml
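
Once the server is running, a client can post a WAV file to it over HTTP. The sketch below uses only the Python standard library. The endpoint path /paddlespeech/asr, the request fields, and the response shape are assumptions based on the PaddleSpeech speech_server demo documentation and should be checked against the installed version. The host and the function names are hypothetical.

```python
import base64
import json
import urllib.request

# Assumed address: port 80 as configured above, and the ASR endpoint path
# from the speech_server demo. Adjust both to match application.yaml.
SERVER_URL = "http://127.0.0.1:80/paddlespeech/asr"


def build_asr_payload(wav_bytes, sample_rate=16000, lang="zh_cn"):
    """Wrap raw WAV bytes in the JSON body the ASR endpoint is assumed to expect."""
    return {
        "audio": base64.b64encode(wav_bytes).decode("ascii"),
        "audio_format": "wav",
        "sample_rate": sample_rate,
        "lang": lang,
        "punc": 0,
    }


def recognize(wav_path, url=SERVER_URL, timeout=30):
    """POST a WAV file to the server and return the transcription string."""
    with open(wav_path, "rb") as f:
        payload = build_asr_payload(f.read())
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # Assumed response shape: {"result": {"transcription": "..."}, ...}
    return body["result"]["transcription"]
```

With the server started as above, recognize("output.wav") should return the recognized text for the file generated by the tts command.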

audio_common

Installation

# Check the ROS version from the directory name (e.g. /opt/ros/melodic) or with this command
rosversion -d
# melodic

# sudo apt-get install ros-<distro>-audio-common
sudo apt-get install ros-melodic-audio-common

# The installed packages, such as audio_capture, are under /opt/ros/melodic/lib/

Building and installing from source

cd ~/catkin_ws/src
git clone https://github.com/ros-drivers/audio_common.git
rosdep install audio_common
cd ~/catkin_ws
catkin_make
source ~/catkin_ws/devel/setup.bash

Debugging audio devices

# List sound cards
cat /proc/asound/cards

# List playback devices
sudo aplay -l

# List capture devices
sudo arecord -l

# gst-launch-1.0 alsasrc ! audioconvert ! audioresample ! alsasink
gst-launch-1.0 alsasrc device=hw:5,0 ! audioconvert ! audioresample ! alsasink device=hw:4,0
 
# roslaunch audio_capture capture.launch
roslaunch audio_capture capture.launch device:="hw:3,0" format:="wave"

# roslaunch audio_play play.launch
roslaunch audio_play play.launch device:="hw:2,0" format:="wave"
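
The hw:X,Y values passed to gst-launch-1.0 and roslaunch above are the card and device numbers printed by arecord -l and aplay -l, and they differ from machine to machine. As a convenience, the sketch below extracts them from that listing. The function names are hypothetical, and the regular expression assumes the English output format of alsa-utils, which is why the locale is forced to C.

```python
import os
import re
import subprocess

# Matches lines such as:
# card 3: Device [USB Audio Device], device 0: USB Audio [USB Audio]
_DEVICE_LINE = re.compile(r"^card (\d+): (.*?), device (\d+): (.*)$", re.MULTILINE)


def parse_alsa_listing(text):
    """Turn `arecord -l` or `aplay -l` output into (hw_string, description) pairs."""
    devices = []
    for card, card_name, device, device_name in _DEVICE_LINE.findall(text):
        hw = "hw:%s,%s" % (card, device)
        devices.append((hw, "%s / %s" % (card_name, device_name)))
    return devices


def list_capture_devices():
    """Run `arecord -l` with an English locale and parse its output."""
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["arecord", "-l"], capture_output=True, text=True, check=True, env=env
    )
    return parse_alsa_listing(result.stdout)
```

Each hw string returned by list_capture_devices() can be passed as the device argument of capture.launch.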

ROS recording node

Code

<launch>
    <arg name="device" default="hw:3,0"/>
    <arg name="format" default="wave"/>

    <!-- Recording node -->
    <include file="$(find audio_capture)/launch/capture.launch">
        <arg name="device" value="$(arg device)"/>
        <arg name="format" value="$(arg format)"/>
    </include>

    <!-- Speech recognition node -->
    <node name='speech_recognition' pkg&