Python Speech Recognition Basics in Practice: SpeechRecognition

BuLingLings

Categories: Python, Speech Recognition. Tags: speech recognition

Article link: https://blog.csdn.net/BuLingLings/article/details/109510391

Note: this is a self-study record based on a Bilibili video:
https://www.bilibili.com/video/BV1Jk4y1R7a5?p=2
It also draws on this blog post: https://blog.csdn.net/Datapad/article/details/82970253

Installing SpeechRecognition

C:\Users\Administrator>pip3 install SpeechRecognition
......
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.8.1

A successful installation looks like the output above.

Installing pocketsphinx

C:\Users\Administrator>pip install pocketsphinx
......
Installing collected packages: pocketsphinx
Successfully installed pocketsphinx-0.1.15

A successful installation looks like the output above.
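
To double-check both installations from Python rather than from the pip output, a small sketch like the following can help. It uses only the standard library; the helper name `missing_packages` is made up for this illustration. Note that the import names differ from the pip names: SpeechRecognition is imported as `speech_recognition`.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names, not pip package names.
    required = ["speech_recognition", "pocketsphinx"]
    missing = missing_packages(required)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported as missing, rerun the corresponding pip command with the same Python interpreter that runs the script.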

Code example

Read a WAV speech file and recognize it either in full or in part.

import speech_recognition as sr

r = sr.Recognizer()     # create the recognizer
# Raw string, so the backslashes in the Windows path are not treated as escapes
harvard = sr.AudioFile(r'E:\speek\harvard.wav')      # load the audio file

# Open the file with a context manager and read its contents
with harvard as source:
    all_audio = r.record(source)    # record() captures the data from the file

# Check the type
print(type(all_audio))      # <class 'speech_recognition.AudioData'>

all_text = r.recognize_sphinx(all_audio)    # recognize and print the result
print(all_text)
# this they'll smell of old we're lingers it takes heat to
# bring out the odor called it restores health and zest
# case all the colt is fine with him couples all pastore
# my favorite is as full food is the hot cross mon


# Recognize only part of the file and print the result
with harvard as source:
    # Slice the audio file by specifying an offset and a duration
    audio = r.record(source, offset=4, duration=3)  # start at second 4, capture 3 seconds

text = r.recognize_sphinx(audio)    # recognize and print the result
print(text)     # it takes heat to bring out the odor
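
Choosing offset and duration is easier when the total length of the recording is known. The sketch below is not part of the original post: it reads the length of a WAV file with the standard `wave` module and clips a requested window so that it stays inside the recording. The helper names `wav_duration_seconds` and `clamp_window` are made up for this illustration.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def clamp_window(total_seconds, offset, duration):
    """Clip an (offset, duration) window so it stays inside the recording."""
    offset = min(max(offset, 0.0), total_seconds)
    duration = min(max(duration, 0.0), total_seconds - offset)
    return offset, duration
```

For the example above, `clamp_window(wav_duration_seconds(path), 4, 3)` would give the values to pass to `r.record()` as `offset` and `duration`, where `path` is the location of the WAV file.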

Notes

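1. If you already know how the speech in the audio file is structured, the offset and duration keyword arguments are very useful for slicing it. Used carelessly, however, they lead to poor transcriptions, because a cut can fall in the middle of a word or phrase.
2. The audio file must be PCM WAV, AIFF/AIFF-C, or native FLAC; any other format raises an error.
3. Audio file download: https://pan.baidu.com/s/10oClt_NWgjOsDmIPuqQGzg (extraction code: 0wv4)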
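
For note 2, a quick pre-check can catch an unsupported WAV file before it reaches the recognizer. This is a minimal sketch that relies only on Python's standard `wave` module. It covers WAV only (not AIFF or FLAC) and is not SpeechRecognition's own validation. The helper name `is_pcm_wav` is made up for this illustration.

```python
import wave


def is_pcm_wav(path):
    """Return True if Python's wave module can open the file as PCM WAV."""
    try:
        with wave.open(path, "rb"):
            return True
    except (wave.Error, EOFError):
        # wave.Error: not a RIFF/WAVE file or an unsupported encoding
        # EOFError: the file is empty or truncated
        return False
```

A file that fails this check (for example an MP3 renamed to .wav) would need to be converted to PCM WAV with an external tool first.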