Text-to-Speech Reader: A Simple Python Implementation

Smaller.孔

Tags: python, speech recognition

Article link: https://blog.csdn.net/weixin_45373427/article/details/108199280

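I. Functionality

Read a txt text file aloud automatically.

II. Implementation Steps

1) Prepare the txt file

Convert the text you want read aloud into txt format (prepare it yourself or scrape it).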

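2) Generate the audio files

Use a speech synthesis API to convert the text into multiple mp3 audio files.
The API used here is the Baidu AI Cloud speech synthesis API. Register an account and create a speech synthesis application in the application list to obtain the App ID and keys; the free call quota can be claimed from the same page (the original post included a screenshot of this page).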

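Official API reference: https://ai.baidu.com/ai-doc/SPEECH/zk4nlz99s

(1) Install the library: pip install baidu-aip
(2) Import the library and create the client (the credentials are generated once the application has been created)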

from aip import AipSpeech

# Your APPID, AK and SK
APP_ID = 'your App ID'
API_KEY = 'your Api Key'
SECRET_KEY = 'your Secret Key'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
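
Hard-coding the keys in the script makes them easy to leak when the code is shared. Below is a minimal sketch of reading them from environment variables instead. The variable names BAIDU_APP_ID, BAIDU_API_KEY and BAIDU_SECRET_KEY and the helper load_credentials are assumptions made for this illustration; the Baidu SDK does not define or read them itself.

```python
import os

# Hypothetical variable names chosen for this example; the SDK does not read them itself.
ENV_NAMES = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")


def load_credentials(env=None):
    """Return (app_id, api_key, secret_key) read from the environment.

    Raises KeyError naming any variable that is missing or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in ENV_NAMES)
```

With the variables set, the client could then be created with client = AipSpeech(*load_credentials()).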

(3) Call the API (note: the text to synthesize must be shorter than 1024 bytes; longer text can be split across multiple requests)

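import os

# txtname: path to the text file
# savefile: directory where the generated audio files are saved
# Options accepted by synthesis():
# spd	String	speech rate, 0-9, default 5 (medium)
# pit	String	pitch, 0-9, default 5 (medium)
# vol	String	volume, 0-15, default 5 (medium)
# per	String	voice: 0 female, 1 male, 3 emotional synthesis (Du Xiaoyao), 4 emotional synthesis (Du Yaya); default is the standard female voice
def txt2sound(txtname, savefile):
    # Your APPID, AK and SK
    APP_ID = 'your App ID'
    API_KEY = 'your Api Key'
    SECRET_KEY = 'your Secret Key'
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
    os.makedirs(savefile, exist_ok=True)  # make sure the output directory exists
    # Assumes the text file is UTF-8 encoded.
    with open(txtname, 'r', encoding='utf-8') as sub:
        num = 0
        while True:
            # In text mode read() counts characters, not bytes. A UTF-8 character
            # takes at most 4 bytes, so 255 characters always stay under 1024 bytes.
            text = sub.read(255)
            if not text:
                break
            num += 1
            print('Synthesizing text segment ' + str(num) + '......')
            # Call the API to generate the audio data
            result = client.synthesis(text, 'zh', 1, {'vol': 5, 'per': 4})
            # On success the API returns audio bytes; on failure it returns a dict
            if not isinstance(result, dict):
                with open('{}/{}.mp3'.format(savefile, num), 'wb') as f:
                    f.write(result)
            else:
                print('Segment ' + str(num) + ' failed:', result)
# On success, the MP3 files are written to the output directory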
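
The limit in step (3) is stated in bytes, while read(n) on a file opened in text mode counts characters, and a Chinese character takes several bytes once encoded. The sketch below splits a string by encoded size instead. The helper name split_by_bytes is hypothetical, and measuring the size in UTF-8 is an assumption; check the official documentation for the encoding the service actually counts in.

```python
def split_by_bytes(text, max_bytes=1023, encoding="utf-8"):
    """Split text into pieces whose encoded size never exceeds max_bytes.

    Characters are never cut in half: a chunk is closed as soon as adding
    the next character would push it past the limit.
    """
    chunks = []
    current = []
    current_size = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if size > max_bytes:
            raise ValueError("max_bytes is smaller than a single character")
        if current_size + size > max_bytes:
            # The current chunk is full; store it and start a new one.
            chunks.append("".join(current))
            current, current_size = [], 0
        current.append(ch)
        current_size += size
    if current:
        chunks.append("".join(current))
    return chunks
```

Each element of the returned list can be passed to client.synthesis in turn. Splitting on sentence boundaries would likely sound more natural; this version only guarantees the size limit.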

3) Concatenate the audio files

Concatenate the generated mp3 files into a single wav audio file.

(1) Install the library: pip install pydub
(2) Run the following code

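import os

from pydub import AudioSegment

def joinvoice(savefile, save_name):
    finally_sound = AudioSegment.empty()  # empty segment that the audio files are appended to
    # The files are named 1.mp3, 2.mp3, ...; sort them numerically so that 10.mp3
    # comes after 9.mp3 (os.listdir returns names in arbitrary order).
    names = [n for n in os.listdir(savefile) if n.endswith('.mp3') and n[:-4].isdigit()]
    for name in sorted(names, key=lambda n: int(n[:-4])):
        sound = AudioSegment.from_mp3(os.path.join(savefile, name))
        finally_sound += sound
    # Export the result as wav; the author found that a file exported as mp3 would not play
    finally_sound.export(save_name, format="wav")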

(3) On success, the combined wav file is created at the given save path (this step may raise an error; see the troubleshooting section at the end)
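
The order in which the segments are concatenated matters: os.listdir returns names in arbitrary order, and a plain string sort puts 10.mp3 before 2.mp3. The sketch below orders the numbered files produced by the previous step; the helper name numbered_files is hypothetical.

```python
import os


def numbered_files(names, extension=".mp3"):
    """Return the names that look like '<number><extension>', sorted by number."""
    numbered = []
    for name in names:
        stem, ext = os.path.splitext(name)
        # Keep only files such as '12.mp3'; anything else in the directory is skipped.
        if ext.lower() == extension and stem.isdecimal():
            numbered.append((int(stem), name))
    return [name for _, name in sorted(numbered)]
```

It would be called as numbered_files(os.listdir(savefile)) before joining the segments.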

4) Play the audio file

Play the final concatenated wav audio file.

(1) Install the library: pip install pygame
(2) Run the following code:

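from pygame import mixer

def play(filename):
    # frequency is the sample rate in Hz and affects playback speed.
    # Other mixer.init options: size=-16 (bits per audio sample) and channels
    # (1 = mono, 2 = stereo).
    mixer.init(frequency=16000)
    mixer.music.load(filename)
    mixer.music.play()
    # play() returns immediately, so wait on input() while the audio plays.
    # Any input, including p, ends playback.
    input('Enter p to stop playback: ')
    mixer.music.stop()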
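
Because mixer.music.play() returns as soon as playback starts, the script needs something to keep it alive until the audio ends. As an alternative to waiting on input(), the sketch below polls until playback has finished. The helper name wait_until_done is hypothetical; it takes the busy check as an argument so that it does not depend on pygame itself.

```python
import time


def wait_until_done(is_busy, poll_seconds=0.1, timeout=None):
    """Block until is_busy() returns False.

    Returns True if playback finished, or False if timeout (in seconds) expired first.
    """
    start = time.monotonic()
    while is_busy():
        if timeout is not None and time.monotonic() - start >= timeout:
            return False
        time.sleep(poll_seconds)
    return True
```

With pygame, the call after mixer.music.play() would be wait_until_done(mixer.music.get_busy).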

5) Complete code

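import os

from aip import AipSpeech
from pydub import AudioSegment
from pygame import mixer


def txt2sound(txtname, savefile):
    # Your APPID, AK and SK
    APP_ID = 'your App ID'
    API_KEY = 'your Api Key'
    SECRET_KEY = 'your Secret Key'
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
    os.makedirs(savefile, exist_ok=True)  # make sure the output directory exists
    # Assumes the text file is UTF-8 encoded.
    with open(txtname, 'r', encoding='utf-8') as sub:
        num = 0
        while True:
            # read() counts characters in text mode; a UTF-8 character takes at
            # most 4 bytes, so 255 characters always stay under 1024 bytes.
            text = sub.read(255)
            if not text:
                break
            num += 1
            print('Synthesizing text segment ' + str(num) + '......')
            result = client.synthesis(text, 'zh', 1, {'vol': 5, 'per': 4})
            # On success the API returns audio bytes; on failure it returns a dict
            if not isinstance(result, dict):
                with open('{}/{}.mp3'.format(savefile, num), 'wb') as f:
                    f.write(result)
            else:
                print('Segment ' + str(num) + ' failed:', result)


def play(filename):
    # frequency is the sample rate in Hz and affects playback speed.
    # Other mixer.init options: size=-16 (bits per audio sample) and channels
    # (1 = mono, 2 = stereo).
    mixer.init(frequency=16000)
    mixer.music.load(filename)
    mixer.music.play()
    # play() returns immediately, so wait on input() while the audio plays.
    # Any input, including p, ends playback.
    input('Enter p to stop playback: ')
    mixer.music.stop()


def joinvoice(savefile, save_name):
    finally_sound = AudioSegment.empty()
    # Sort 1.mp3, 2.mp3, ... numerically; os.listdir returns names in arbitrary order.
    names = [n for n in os.listdir(savefile) if n.endswith('.mp3') and n[:-4].isdigit()]
    for name in sorted(names, key=lambda n: int(n[:-4])):
        sound = AudioSegment.from_mp3(os.path.join(savefile, name))
        finally_sound += sound
    finally_sound.export(save_name, format="wav")


def main():
    txtname = 'data.txt'
    savefile = 'auido'
    result_name = 'resultsound.wav'
    txt2sound(txtname, savefile)
    joinvoice(savefile, result_name)
    play(result_name)

if __name__ == '__main__':
    main()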

III. Errors and Fixes

1. AudioSegment error:
Error: Python AudioSegment WinError 2 The system cannot find the file specified
Fix: https://www.pianshen.com/article/5739874938/
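
pydub relies on an external ffmpeg (or avconv) executable to decode mp3 files, and on Windows this error commonly means that executable cannot be found on PATH. The sketch below is a quick check that can be run before calling joinvoice; the helper name find_audio_backend is hypothetical, and whether a missing executable is the cause in a particular setup is an assumption to verify.

```python
import shutil


def find_audio_backend(candidates=("ffmpeg", "avconv")):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    backend = find_audio_backend()
    if backend is None:
        print("Neither ffmpeg nor avconv was found on PATH; install ffmpeg and add it to PATH.")
    else:
        print("Audio backend found at:", backend)
```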
