Shout at your computer to open Google or a terminal: controlling the computer with Google speech recognition

易枭寒

Category: Python. Tags: Python

Article link: https://blog.csdn.net/xiaowanggedege/article/details/8772296

1) Edit the package sources file: @ubuntu:~$ sudo vim /etc/apt/sources.list

deb http://cn.archive.ubuntu.com/ubuntu precise main restricted universe

2) After editing the sources file, refresh the package index: sudo aptitude update

3) My own MySQL database was broken, so I reinstalled it:

@ubuntu:~$ aptitude search mysql-server

sudo aptitude reinstall mysql-server

@ubuntu:~$ aptitude install mysql-server

@ubuntu:~$ aptitude purge mysql-server

4) Install the required packages and dependencies:

sudo easy_install wave (the wave module ships with Python's standard library, so this step is usually unnecessary)

~/pyvoice$ sudo aptitude install flac (this tool converts wav to flac)

~/pyvoice$ sudo aptitude install python-alsaaudio
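The flac tool installed above is what later turns 1.wav into 1.flac. As a rough sketch (written for Python 3, unlike the Python 2 listings in this post), the conversion could be driven through subprocess with an argument list instead of a shell string. The helper names flac_command and convert_to_flac are made up for this illustration; -f (overwrite) and -o (output name) are standard flac options.

```python
import subprocess


def flac_command(basename):
    """Build the argument list that converts <basename>.wav to <basename>.flac."""
    # -f overwrites an existing output file; -o names the output file explicitly.
    return ["flac", "-f", "-o", basename + ".flac", basename + ".wav"]


def convert_to_flac(basename):
    """Run flac; raises CalledProcessError if flac exits with a non-zero status."""
    subprocess.check_call(flac_command(basename))
    return basename + ".flac"


if __name__ == "__main__":
    # Only print the command here, so the sketch runs without flac installed.
    print(" ".join(flac_command("1")))
```

Passing a list avoids shell quoting problems if the file name ever contains spaces.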

5) Approach:

Input --> processing --> output

1: Capture audio from the computer --> WAV file
python record wav

2: Recorded audio file --> text
STT: Speech to Text

STT API: Google API
TTS: Text to Speech

3: Text --> computer command
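The three stages above can be sketched as one small function that chains them. This is only an illustration of the data flow, written for Python 3; run_pipeline and the stand-in stage functions are hypothetical names, and the real scripts in this post are separate programs that hand data over through the files 1.wav and 1.txt.

```python
def run_pipeline(record, transcribe, dispatch, wav_path="1.wav"):
    """Chain the three stages: microphone -> WAV file -> text -> command."""
    record(wav_path)             # stage 1: capture audio into a WAV file
    text = transcribe(wav_path)  # stage 2: speech to text (STT)
    return dispatch(text)        # stage 3: map the recognized text to a command


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without a microphone or network access.
    def fake_record(path):
        print("recording to " + path)

    def fake_transcribe(path):
        return "新浪"

    def fake_dispatch(text):
        return "firefox www.sina.com.cn" if "新浪" in text else None

    print(run_pipeline(fake_record, fake_transcribe, fake_dispatch))
```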

6) Code:

jiangge@ubuntu:~/pycode/pyvoice$ tree
.
├── 1.flac
├── 1.txt
├── 1.wav
├── recordtest.py
├── runcmd.py
└── stt_google.py

Audio capture: recordtest.py
Speech to text: stt_google.py
Command dispatch: runcmd.py
Recognized text: 1.txt

Audio capture: recordtest.py

#!/usr/bin/env python

## recordtest.py
##
## This is an example of a simple sound capture script.
##
## The script opens an ALSA pcm for sound capture, sets
## various attributes of the capture, and reads in a loop,
## writing the data to a WAV file.
##
## To test it out do the following:
## python recordtest.py out.wav # talk to the microphone
## aplay out.wav


# Footnote: I'd normally use print instead of sys.std(out|err).write,
# but we're in the middle of the conversion between python 2 and 3
# and this code runs on both versions without conversion

import sys
import time
import getopt
import alsaaudio
import wave

def usage():
    sys.stderr.write('usage: recordtest.py [-c <card>] <file>\n')
    sys.exit(2)

if __name__ == '__main__':

    card = 'default'

    opts, args = getopt.getopt(sys.argv[1:], 'c:')
    for o, a in opts:
        if o == '-c':
            card = a

    if not args:
        usage()

    f = wave.open(args[0], 'wb')
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(8000)

    # Open the device in nonblocking capture mode. The last argument could
    # just as well have been zero for blocking mode. Then we could have
    # left out the sleep call in the bottom of the loop
    inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK, card)

    # Set attributes: Mono, 8000 Hz, 16 bit little endian samples
    inp.setchannels(1)
    inp.setrate(8000)
    inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)

    # The period size controls the internal number of frames per period.
    # The significance of this parameter is documented in the ALSA api.
    # For our purposes, it is sufficient to know that reads from the device
    # will return this many frames. Each frame being 2 bytes long.
    # This means that the reads below will return either 320 bytes of data
    # or 0 bytes of data. The latter is possible because we are in nonblocking
    # mode.
    inp.setperiodsize(160)

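    loops = 2000000
    try:
        while loops > 0:
            loops -= 1
            # Read data from device
            l, data = inp.read()

            # l is 0 when no data is ready and negative on a read error
            if l > 0:
                f.writeframes(data)
                time.sleep(.001)
    except KeyboardInterrupt:
        # Ctrl-C ends the recording
        pass
    finally:
        # Close the WAV file so all recorded data is flushed to disk
        f.close()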
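The comments in recordtest.py say that each read returns 160 frames of 2 bytes, or 320 bytes. The sketch below (Python 3, standard library only) works through that arithmetic and writes a short test tone with the same WAV parameters, so the format can be checked without ALSA hardware. The function names and the 440 Hz tone are choices made for this illustration, not part of the original script.

```python
import io
import math
import struct
import wave

RATE = 8000       # samples per second, as in recordtest.py
SAMPLE_WIDTH = 2  # bytes per sample (16-bit)
CHANNELS = 1      # mono
PERIOD = 160      # frames per read, as passed to setperiodsize


def period_bytes(frames=PERIOD, channels=CHANNELS, width=SAMPLE_WIDTH):
    """Bytes returned by one full read from the capture device."""
    return frames * channels * width


def period_ms(frames=PERIOD, rate=RATE):
    """Duration of one period in milliseconds."""
    return 1000.0 * frames / rate


def write_tone(fileobj, seconds=0.5, freq=440.0):
    """Write a sine tone using the same WAV parameters as the recorder."""
    nframes = int(RATE * seconds)
    samples = [int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / RATE))
               for i in range(nframes)]
    w = wave.open(fileobj, "wb")
    w.setnchannels(CHANNELS)
    w.setsampwidth(SAMPLE_WIDTH)
    w.setframerate(RATE)
    # "<h" packs each sample as a little-endian signed 16-bit integer (S16_LE).
    w.writeframes(struct.pack("<%dh" % nframes, *samples))
    w.close()
    return nframes


if __name__ == "__main__":
    buf = io.BytesIO()
    write_tone(buf)
    buf.seek(0)
    r = wave.open(buf, "rb")
    print(r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
    print(period_bytes(), period_ms())
```

With these settings one period is 320 bytes and lasts 20 ms, which is why a very short sleep between non-blocking reads is enough.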

Speech to text: stt_google.py

#coding=utf-8
import os
import urllib2
import urllib
import time
import json

def writetofile(list_data):
    # Print every recognition hypothesis and save them to 1.txt for runcmd.py
    f = open("1.txt","w")
    for n in list_data:
        print n['utterance']
        f.write(n['utterance'].encode("utf-8"))
    f.close()

def stt_google_wav(filename):
    #Convert to flac
    os.system(FLAC_CONV+ filename+'.wav')
    f = open(filename+'.flac','rb')
    flac_cont = f.read()
    f.close()

    #post it
    lang_code='zh-CN'
    googl_speech_url = 'https://www.google.com.ua/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=2&lang=%s&maxresults=6'%(lang_code)
    hrs = {"User-Agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7",'Content-type': 'audio/x-flac; rate=8000'}
    req = urllib2.Request(googl_speech_url, data=flac_cont, headers=hrs)
    p = urllib2.urlopen(req)
    data =  p.read()
    list_data = json.loads(data)["hypotheses"]
    writetofile(list_data)
    return 

FLAC_CONV = 'flac -f ' # We need a WAV to FLAC converter.
if __name__ == '__main__':
    stt_google_wav("1")
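stt_google.py reads the "hypotheses" list from the JSON response and takes the "utterance" field of each entry. A minimal Python 3 sketch of just that parsing step follows. The function name extract_utterances and the sample response body are made up for illustration, and the response shape is assumed only from the two keys the script above reads.

```python
import json


def extract_utterances(response_text):
    """Return the 'utterance' strings from a response shaped like the one read above."""
    payload = json.loads(response_text)
    # A missing "hypotheses" key yields an empty list rather than a KeyError.
    hypotheses = payload.get("hypotheses", [])
    return [h["utterance"] for h in hypotheses if "utterance" in h]


if __name__ == "__main__":
    # Hypothetical response body, invented for this example only.
    sample = '{"hypotheses": [{"utterance": "新浪网"}, {"utterance": "新浪"}]}'
    for text in extract_utterances(sample):
        print(text)
```

Returning a list keeps the parsing separate from file output, so the caller decides whether to write the candidates to 1.txt, one per line or joined.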

runcmd.py

#coding=utf-8
import os

f = open("1.txt","r")
cmds = f.read()
f.close()

def run_browser(cmds):
    # Keywords: "谷歌" = Google, "百度" = Baidu, "新浪" = Sina
    if cmds.find("谷歌") > -1:
        os.system("firefox www.google.com")
    if cmds.find("百度") > -1:
        os.system("firefox www.baidu.com")
    if cmds.find("新浪") > -1:
        os.system("firefox www.sina.com.cn")


def run_term(cmds):
    # "终端" = terminal
    if cmds.find("终端") > -1:
        os.system("gnome-terminal")

def run_gedit(cmds):
    # "编程" = programming
    if cmds.find("编程") > -1:
        os.system("gedit test.py")

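def run_jeap(cmds):
    # Fuzzy match: count characters pronounced "zhi" (智 志 只 支) or "pu" (普 扑 谱);
    # two or more hits open the site
    count = 0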
    if cmds.find("智") > -1:
        count += 1
    if cmds.find("志") > -1:
        count += 1
    if cmds.find("只") > -1:
        count += 1
    if cmds.find("支") > -1:
        count += 1
    if cmds.find("普") > -1:
        count += 1
    if cmds.find("扑") > -1:
        count += 1
    if cmds.find("谱") > -1:
        count += 1
    if count > 1:
        os.system("firefox www.jeapedu.com")

run_browser(cmds)
run_jeap(cmds)
run_term(cmds)
run_gedit(cmds)
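The keyword checks in runcmd.py can also be written as a lookup table. The sketch below (Python 3) returns the matching commands instead of running them, which makes the matching easy to try out on its own. COMMANDS, match_commands and fuzzy_hits are names invented here; the keywords and commands are copied from the listing above.

```python
# Keyword -> command pairs taken from runcmd.py.
COMMANDS = [
    ("谷歌", "firefox www.google.com"),
    ("百度", "firefox www.baidu.com"),
    ("新浪", "firefox www.sina.com.cn"),
    ("终端", "gnome-terminal"),
    ("编程", "gedit test.py"),
]


def match_commands(text, table=COMMANDS):
    """Return every command whose keyword occurs in the recognized text."""
    return [cmd for keyword, cmd in table if keyword in text]


def fuzzy_hits(text, chars="智志只支普扑谱"):
    """Count how many of the similar-sounding characters occur in the text."""
    return sum(1 for ch in chars if ch in text)


if __name__ == "__main__":
    recognized = "新浪网新浪吗新浪"  # the candidates from the sample run, joined together
    for cmd in match_commands(recognized):
        print(cmd)
    if fuzzy_hits("智普") > 1:
        print("firefox www.jeapedu.com")
```

Adding a new voice command then only needs one more row in the table.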

7) Run:

jiangge@ubuntu:~/pycode/pyvoice$ python recordtest.py 1.wav;python stt_google.py ;python runcmd.py

Then shout "新浪" (Sina) at the computer.

The terminal output will look like this:

flac 1.2.1, Copyright (C) 2000,2001,2002,2003,2004,2005,2006,2007 Josh Coalson
flac comes with ABSOLUTELY NO WARRANTY. This is free software, and you are
welcome to redistribute it under certain conditions. Type `flac' for details.

1.wav: wrote 23004 bytes, ratio=0.621
新浪网
新浪吗
新浪

If everything works, the browser opens the Sina home page. If it does not... well, blame the GFW, which may be blocking the request to Google.

----------------------------------------------------------------

References:

http://uliweb.clkg.org/forum/3/210

The code was written by @hejiasheng