[AI] How to Build a Low-Budget Knockoff Jarvis: Spoken Human-Robot Interaction

Follow! Like! Comment! Bookmark! Thanks!


How to Build a Low-Budget Knockoff Jarvis


I. Overview of the Interaction Structure


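  • Step 1: Feed speech into the computer through an audio input device. Here I open the computer's microphone with sr.Microphone() (sr is the import alias for the speech_recognition package), then capture what is said with sr.Recognizer().listen().

  • Step 2: Use a speech recognition library to turn the captured speech into text. sr.Recognizer().recognize_google() gets the text from Google's speech recognition service.

  • Step 3: Parse the text with spaCy's en_core_web_sm model and draw the dependency tree.

  • Step 4: Based on each word's part of speech, or on specific keywords, have the computer answer with a preset spoken reply. Details in the conversation such as a time or a quantity can vary freely, and the computer picks them out of the text. The eSpeak tool reads the reply text aloud.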
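
The four steps above can be read as one repeatable "turn". Below is a minimal sketch of that idea, not the script used in this article: run_turn and its four callable parameters are hypothetical names, and the microphone, recognizer and parser are replaced by simple stand-ins so the example runs anywhere.

```python
from typing import Callable, Optional


def run_turn(prompt: str,
             speak: Callable[[str], None],
             listen: Callable[[], Optional[str]],
             parse: Callable[[str], dict],
             reply: Callable[[dict], str]) -> str:
    """Run one question/answer turn: ask, listen, parse, reply."""
    speak(prompt)              # Jarvis asks the question
    heard = listen() or ""     # steps 1-2: speech in, text out (None means nothing was recognised)
    info = parse(heard)        # step 3: analyse the text
    answer = reply(info)       # step 4: pick a preset reply
    speak(answer)
    return answer


# Stand-ins (assumptions) so the sketch runs without a microphone or spaCy
spoken = []
answer = run_turn(
    "Sir, how many friends will come?",
    speak=spoken.append,
    listen=lambda: "maybe six people",
    parse=lambda text: {"words": text.split()},
    reply=lambda info: "I heard {} words".format(len(info["words"])),
)
print(answer)
```

In a real setup, listen would wrap the microphone and the recognizer, and speak would call the text-to-speech tool.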

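II. Demo

1. Background of the Code

The scenario I set up is this: my computer is a low-budget Jarvis (really low-budget) who manages my house. Some friends are coming over, and I ask Jarvis to make a few preparations in advance.

2. Results of Running the Dialogue

For convenience, all the packages used here are for English. Readers are free to swap in a package for another language, such as Chinese.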

Dialogue 1

Jarvis: What can I do for you sir?
Me:     today my friends will come to my house

The next exchange works with any number of people: Jarvis picks out the number word in the speech and repeats it later in the conversation.

Jarvis: Sir, how many friends will come?
Me:     maybe six people
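
To make the number-spotting idea concrete without installing spaCy, here is a minimal, dependency-free sketch. The script in this article relies on spaCy's NUM part-of-speech tag instead; the NUMBER_WORDS set and the find_number helper are hypothetical stand-ins that only cover digits and the words one to twelve.

```python
import re

# Assumption: a small hand-written list stands in for spaCy's NUM tag
NUMBER_WORDS = {
    "one", "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten", "eleven", "twelve",
}


def find_number(text, default=None):
    """Return the first digit string or number word in text, else default."""
    for token in re.findall(r"[a-z]+|\d+", text.lower()):
        if token.isdigit() or token in NUMBER_WORDS:
            return token
    return default


print(find_number("maybe six people"))
print(find_number("about 10 of them"))
```

Returning a default instead of raising keeps later replies from failing when no number is heard.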


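Dialogue 2

Here "isyncr" was actually "I think". Google's speech recognition is not especially accurate (surely it can't be that my pronunciation is off!).

You can name any one of the three drinks here. Jarvis recognizes it and repeats your choice back.

Jarvis: And what drinks do I need to prepare? Cola, tea or coffee?
Me:     isyncr coffee is the best
Jarvis: got it sir, I will prepare six cups of coffee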
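
Since recognition can garble words, keyword matching benefits from a little tolerance. The sketch below is one possible approach, not what the script in this article does: it compares whole words (so "instead" does not count as "tea") and uses difflib from the standard library to accept near misses. The choose_drink helper and the 0.8 cutoff are assumptions chosen for illustration.

```python
import difflib
import re

DRINKS = ["cola", "tea", "coffee"]


def choose_drink(text, options=DRINKS, cutoff=0.8):
    """Return the first option that some word in text closely matches, else None."""
    for token in re.findall(r"[a-z]+", text.lower()):
        # get_close_matches returns at most n options whose similarity is >= cutoff
        close = difflib.get_close_matches(token, options, n=1, cutoff=cutoff)
        if close:
            return close[0]
    return None


print(choose_drink("isyncr coffee is the best"))
print(choose_drink("water instead"))
```

When the function returns None, the caller can ask the question again instead of guessing.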


Dialogue 3

Jarvis: By the way, sir, when will they come?
Me:     if I remember correctly live well, either 5 p.m.
Jarvis: Sir, please confirm, your six friends will come at 5 pm and I will prepare six cups of coffee for them
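
Pulling the time out of a noisy sentence like the one above can also be sketched with a regular expression. This is an alternative to the spaCy NUM lookup used in this article's script, under the assumption that the recognizer writes times as digits followed by "a.m." or "p.m."; find_time and its default_period parameter are hypothetical names.

```python
import re

# Hour, optional minutes, then am/pm written with or without dots
TIME_RE = re.compile(r"(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\.?", re.IGNORECASE)


def find_time(text, default_period="pm"):
    """Return a time such as '5 pm' or '10:30 am' found in text, else None."""
    match = TIME_RE.search(text)
    if match:
        hour, minute = match.group(1), match.group(2)
        period = match.group(3).lower() + "m"
    else:
        # No am/pm marker: fall back to the first bare number, if any
        bare = re.search(r"\d{1,2}", text)
        if bare is None:
            return None
        hour, minute, period = bare.group(), None, default_period
    clock = hour if minute is None else hour + ":" + minute
    return clock + " " + period


print(find_time("if I remember correctly live well, either 5 p.m."))
```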


Dialogue 4

Jarvis: And Sir, your house does not look clean, can I clean it now?
Me:     yes  of course please
Jarvis: As you wish, please wait a moment
Jarvis: Sir, your house is clean now!
Jarvis: Have a nice day sir!
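
A yes/no check like this one is easy to get subtly wrong in Python: an expression such as `'yes' or 'yeah' in text` is always truthy, because the non-empty string 'yes' is evaluated on its own. A minimal sketch of a safer check follows; the word lists and the is_affirmative helper are assumptions for illustration.

```python
import re

# Assumption: small hand-picked word lists, extend as needed
YES_WORDS = {"yes", "yeah", "sure", "ok", "okay"}
NO_WORDS = {"no", "nope", "not"}


def is_affirmative(text):
    """Return True only if text contains a yes-word and no no-word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & NO_WORDS:
        return False
    return bool(words & YES_WORDS)


print(is_affirmative("yes  of course please"))
print(is_affirmative("no thanks"))
```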

3. Implementation

I also added a downloaded Jarvis "Welcome home" voice clip at the very start. Click here for the code and the voice pack.

import os
import winsound  # Windows-only standard library module for playing .wav files

import en_core_web_sm  # spaCy's small English model, installed as a package
import speech_recognition as sr
from nltk import Tree
from spacy.symbols import NUM, VERB

# Build an NLTK tree from a spaCy token and its children
def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_

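def dependency_tree(text):
    nlp = en_core_web_sm.load()
    nlp_doc = nlp(text)
    print('---------------')
    print('Dependency Graph')
    for sent in nlp_doc.sents:
        tree = to_nltk_tree(sent.root)
        # A one-word sentence comes back as a plain string, which has no pretty_print()
        if isinstance(tree, Tree):
            tree.pretty_print()
        else:
            print(tree)
    print('---------------')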

def clean():
    winsound.PlaySound('Jarvis/Jarvis-System/Garbage cleared.wav', winsound.SND_FILENAME)
    reply = 'Sir, your house is clean now!'
    print(reply)
    os.system('espeak "{}"'.format(reply))


winsound.PlaySound('Jarvis/Jarvis-System/Welcome Home Sir(No Song).wav', winsound.SND_FILENAME)

# Expected answer: today my friends will come to my house
r = sr.Recognizer()
with sr.Microphone() as source:
    l = "What can I do for you sir?"
    print(l)
    os.system('espeak "{}"'.format(l))
    audio = r.listen(source)

textT = ''  # default when recognition fails, so the code below never sees an undefined name
try:
    textT = r.recognize_google(audio)
except sr.UnknownValueError:
    print("Jarvis could not understand your audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

print(textT)
dependency_tree(textT)

nlp = en_core_web_sm.load()
nlp_doc = nlp(textT)

# Fallback used when no number is recognised, so later replies never hit an undefined name
friends_num = 'some'

# Ask the follow-up question once if the user said someone will come
if any(word.pos == VERB and str(word) == 'come' for word in nlp_doc):

    l = 'Sir, how many friends will come?'
    print(l)
    os.system('espeak "{}"'.format(l))

    with sr.Microphone() as source:
        audio = r.listen(source)

    textT = ''  # default when recognition fails
    try:
        textT = r.recognize_google(audio)
    except sr.UnknownValueError:
        print("Jarvis could not understand your audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))

    print(textT)
    dependency_tree(textT)

    # Reuse the model loaded above; the last number word found becomes the guest count
    for word in nlp(textT):
        if word.pos == NUM:
            friends_num = str(word)
# Expected answer: I remember six people

# Expected answer: I think tea is the best
r = sr.Recognizer()
with sr.Microphone() as source:
    l = "And what drinks do I need to prepare? Cola, tea or coffee?"
    print(l)
    os.system('espeak "{}"'.format(l))
    audio = r.listen(source)

textT = ''  # default when recognition fails, so the code below never sees an undefined name
try:
    textT = r.recognize_google(audio)
except sr.UnknownValueError:
    print("Jarvis could not understand your audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
    
print(textT)
dependency_tree(textT)

text_lower = textT.lower()
if 'tea' in text_lower:
    drink = 'tea'
elif 'cola' in text_lower:
    drink = 'cola'
elif 'coffee' in text_lower:
    drink = 'coffee'
else:
    drink = None

if drink:
    l = 'got it sir, I will prepare ' + friends_num + ' cups of ' + drink
else:
    # Nothing recognised: this script does not loop back, so it asks once
    # and falls back to an arbitrary default to keep the later summary working
    drink = 'tea'
    l = 'Sir, can you please say it again?'

print(l)
os.system('espeak "{}"'.format(l))

# Expected answer: If I remember correctly, they will come at 5 pm
r = sr.Recognizer()
with sr.Microphone() as source:
    l = "By the way, sir, when will they come?"
    print(l)
    os.system('espeak "{}"'.format(l))
    audio = r.listen(source)

textT = ''  # default when recognition fails, so the code below never sees an undefined name
try:
    textT = r.recognize_google(audio)
except sr.UnknownValueError:
    print("Jarvis could not understand your audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

print(textT)
dependency_tree(textT)    

nlp_doc = nlp(textT)

# Collect the number tokens (e.g. "5"), joined by spaces rather than a trailing "."
time_num = ' '.join(str(word) for word in nlp_doc if word.pos == NUM)

# Default to pm unless the user clearly said a.m.
m = 'am' if 'a.m.' in textT.lower() else 'pm'

l = 'Sir, please confirm, your ' + friends_num + ' friends will come at ' \
    + time_num + ' ' + m + ' and I will prepare ' + friends_num + ' cups of ' + drink + ' for them'
print(l)
os.system('espeak "{}"'.format(l))


r = sr.Recognizer()
with sr.Microphone() as source:
    l = "And Sir, your house does not look clean, can I clean it now?"
    print(l)
    os.system('espeak "{}"'.format(l))
    audio = r.listen(source)

textT = ''  # default when recognition fails, so the code below never sees an undefined name
try:
    textT = r.recognize_google(audio)
except sr.UnknownValueError:
    print("Jarvis could not understand your audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

print(textT) 
dependency_tree(textT) 

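answer = textT.lower()
# Test each word separately: `'yes' or 'yeah' in answer` would always be true
if 'yes' in answer or 'yeah' in answer:
    l = 'As you wish, please wait a moment'
    print(l)
    os.system('espeak "{}"'.format(l))
    clean()
    l = 'Have a nice day sir!'
    print(l)
    os.system('espeak "{}"'.format(l))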

Of course, readers can build on this framework and add an RNN to make Jarvis's replies smarter!
