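With some free time on my hands, I followed the documentation and wrote a small voice assistant program. The main flow is: record audio, convert the audio to text, process the text automatically and produce a reply, then convert the reply to speech and play it back. The code is rough and so is this post, so please go easy on me.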
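The four-stage flow described above can be sketched roughly as follows. This is only a minimal outline, not the actual program: the function name run_assistant_once and its four callable parameters (record, transcribe, reply, speak_text) are hypothetical placeholders for whatever recording, speech-to-text, reply, and text-to-speech code is plugged in.

```python
from typing import Callable


def run_assistant_once(
    record: Callable[[], str],          # hypothetical: records audio, returns a WAV file path
    transcribe: Callable[[str], str],   # hypothetical: turns a WAV file path into text
    reply: Callable[[str], str],        # hypothetical: turns the recognized text into an answer
    speak_text: Callable[[str], None],  # hypothetical: plays the answer as speech
) -> str:
    """Run one record -> transcribe -> reply -> speak cycle and return the answer."""
    wav_path = record()
    text = transcribe(wav_path)
    answer = reply(text)
    speak_text(answer)
    return answer


if __name__ == "__main__":
    # Stand-in stages so the flow can be tried without a microphone or a network service.
    run_assistant_once(
        record=lambda: "saying.wav",
        transcribe=lambda path: "hello",
        reply=lambda text: "you said: " + text,
        speak_text=print,
    )
```

Passing the stages in as plain functions keeps each one replaceable, which helps when one of them (for example the speech-to-text service) turns out to be unavailable.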
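1. Recording audio with Python
Other articles I read use speech_recognition to record audio and call its recognize_google() to do the speech-to-text conversion. However, Google is not reachable from mainland China, and speech_recognition raised errors during recording that I could not resolve, so I settled for the next best thing and recorded with pyaudio instead. For various XXXXXXX reasons, I am simply pasting someone else's code here.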
import pyaudio
import wave

def speak():
    CHUNK = 1024                  # frames read from the stream per call
    FORMAT = pyaudio.paInt16      # 16-bit samples
    CHANNELS = 2
    RATE = 16000                  # sample rate in Hz
    RECORD_SECONDS = 3            # recording duration in seconds
    WAVE_OUTPUT_FILENAME = "saying.wav"
    p = pyaudio.PyAudio()
    # Open the default input device for recording
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)
    print("please say....")
    frames = []
    # Read enough chunks to cover RECORD_SECONDS of audio
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frame