Preface
The speech packages I currently know of that can run on ROS are baidu_speech, hark, pocketsphinx, rospeex, and iFlytek (科大讯飞).
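pocketsphinx English package
The installation itself is skipped here. Get the package from GitHub; dependencies that cannot be installed directly with apt-get have to be downloaded from Debian.
The names are as follows:
- Install directly: ros-kinetic-audio-common, libasound2, libgstreamer0.10, python-gst0.10
- Download from Debian: libsphinxbase1_0.8-6, libpocketsphinx1_0.8-5, gstreamer0.10-pocketsphinx
- Voice model packages: pocketsphinx-hmm-en-tidigits_0.8-5_all, pocketsphinx-hmm-zh-tdt_0.8-5_all, pocketsphinx-lm-zh-hans-gigatdt_0.8-5_all (this post shows the results of the en English model)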
Steps
- Create a model directory inside the pocketsphinx package to hold the extracted voice model files
    cd ~/catkin_ws/src/pocketsphinx
    mkdir model
    sudo cp -r /usr/share/pocketsphinx/model/* ~/catkin_ws/src/pocketsphinx/model
- Modify recognizer.py
    cd ~/catkin_ws/src/pocketsphinx/nodes
    vim recognizer.py
Comment out self.asr.set_property('configured', True).
Add the lm, dict, and hmm properties to enable English recognition (for another language, point them at that language's model paths):
    self.asr.set_property('lm', '/usr/share/pocketsphinx/model/lm/en/tidigits.DMP')
    self.asr.set_property('dict', '/usr/share/pocketsphinx/model/lm/en/tidigits.dic')
    self.asr.set_property('hmm', '/usr/share/pocketsphinx/model/hmm/en/tidigits')
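- Launch
    roslaunch pocketsphinx robocup.launch
One thing to watch is where in recognizer.py the new lines are added; in the wrong place, the node reports errors that lm and hmm cannot be found.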
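A wrong or missing model path would also leave lm or hmm unfindable, so it may be worth checking the three paths before launching. The following is only a minimal sketch and not part of the pocketsphinx package: the helper name missing_model_paths is made up for illustration, and the paths are simply the ones passed to set_property in recognizer.py.

```python
import os

# Paths copied from the set_property calls added to recognizer.py.
MODEL_PATHS = {
    "lm": "/usr/share/pocketsphinx/model/lm/en/tidigits.DMP",
    "dict": "/usr/share/pocketsphinx/model/lm/en/tidigits.dic",
    "hmm": "/usr/share/pocketsphinx/model/hmm/en/tidigits",
}


def missing_model_paths(paths):
    """Hypothetical helper: return the property names whose path is absent on disk."""
    return sorted(name for name, path in paths.items()
                  if not os.path.exists(path))


if __name__ == "__main__":
    missing = missing_model_paths(MODEL_PATHS)
    if missing:
        print("Missing model paths for: " + ", ".join(missing))
    else:
        print("All model paths exist.")
```

If this reports nothing missing and the error still appears, the placement of the added lines is the more likely cause.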
Running
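- Three nodes are started in total: turtlesim, recognizer, and voice_teleop. Their relationship is as follows:
[Figure: node graph]
The voice_teleop code is below (taken from the web; I saved it as voice_cmd_vel2.py in the nodes folder of pocketsphinx).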
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import rospy
from geometry_msgs.msg import Twist
from std_msgs.msg import String

# Initialize the ROS node and declare a Publisher for velocity commands
rospy.init_node('voice_teleop')
pub = rospy.Publisher('/cmd_vel', Twist, queue_size=10)


# Publish a velocity command whenever a voice command is received
def get_voice(data):
    voice_text = data.data
    rospy.loginfo("I said:: %s", voice_text)
    twist = Twist()
    if voice_text == "one":
        twist.linear.x = 0.3
    elif voice_text == "two":
        twist.linear.x = -0.3
    elif voice_text == "three":
        twist.angular.z = 0.3
    elif voice_text == "four":
        twist.angular.z = -0.3
    # An unrecognized word leaves the Twist at zero
    pub.publish(twist)


# Subscribe to the text output of the pocketsphinx recognizer
def teleop():
    rospy.loginfo("Starting voice Teleop")
    rospy.Subscriber("/recognizer/output", String, get_voice)
    # spin() blocks until the node shuts down, so no outer loop is needed
    rospy.spin()


if __name__ == '__main__':
    teleop()
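The if/elif chain in get_voice could also be written as a lookup table, which keeps the word-to-velocity mapping in one place and can be tried without a running ROS master. This is a minimal sketch reusing the same four words and speeds; the names VOICE_COMMANDS and velocity_for are invented here, and stripping and lowercasing the recognized text is an extra assumption that the original script does not make.

```python
# Word -> (linear.x, angular.z), same values as the if/elif chain.
VOICE_COMMANDS = {
    "one": (0.3, 0.0),     # forward
    "two": (-0.3, 0.0),    # backward
    "three": (0.0, 0.3),   # positive angular.z
    "four": (0.0, -0.3),   # negative angular.z
}


def velocity_for(voice_text):
    """Return (linear_x, angular_z) for a recognized word, or zeros if unknown."""
    # Assumption: tolerate surrounding whitespace and upper case.
    return VOICE_COMMANDS.get(voice_text.strip().lower(), (0.0, 0.0))
```

Inside get_voice this would be used as twist.linear.x, twist.angular.z = velocity_for(data.data) before publishing; an unknown word then maps to zero velocity.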
- Full-screen overview
[Figure: screenshot]
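Conclusion
The recognition quality is very poor, both in accuracy and in far-field performance. I only tried it on the turtlesim turtle and will not test it on the robot dog.
Video demo: see the linked video demo. In it, a tablet reads the words aloud and a microphone array (the one lit up in RGB) picks them up.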