How to Get Started with Google Text-to-Speech Using Python

Latest recommended article published on 2024-01-10 17:52:03

weixin_26704853

Tags: python, artificial intelligence, speech recognition, natural language processing, deep learning

Original article: https://towardsdatascience.com/how-to-get-started-with-google-text-to-speech-using-python-485e43d1d544

Language Model Designing

In this article, we will cover the basics of text-to-speech conversion. I will give a brief introduction to text-to-speech and then show how to install the Google Text-to-Speech (gTTS) module. We will then look at how the gTTS module works and write a simple program to make the concepts concrete. Finally, we will explore what else can be done with this module and some real-life use cases for gTTS.

Primary Requirement: An active internet connection with at least moderate bandwidth. This is because gTTS sends the text to Google's API to perform the text-to-speech conversion.

Note: This is the first part of the language modeling tutorial. In this tutorial series, we will cover how to use Google text-to-speech, optical character recognition, deep learning models such as sequence-to-sequence models with attention, and other cool things.

Introduction to Text-to-Speech:

Text-to-speech (TTS) is the process of converting written words into spoken audio. The program, tool, or software takes input text from the user, uses natural language processing to analyze the linguistics of the language being used, and performs logical inference on the text. The processed text is then passed to the next block, where digital signal processing is applied. Through a series of algorithms and transformations, the processed text is finally converted into speech. The whole process is known as speech synthesis. As a simple block diagram: text, then natural language processing, then digital signal processing, then speech.

This seems like quite a complicated process, but thanks to Python and the gTTS module, it can be reduced to just a few lines of code. As the block diagram shows, the text is first pre-processed with natural language processing and then converted to speech using digital signal processing.

Without further ado, let’s get our hands dirty with some code.

Installation of the gTTS module:

We will be working with Python for the rest of the tutorial. If you haven't installed Python already, please do so first.

Installing the gTTS module is simple and can be done by running the following command in a command prompt or terminal:

pip install gTTS

Once the installation is done, we can write a very simple program to see how this module converts typed text into speech. Create a new Python file, give it a name of your choice, and make sure the name ends with the .py extension. Let us look at the program:

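from gtts import gTTS

# Text to be spoken; replace it with anything you like.
text = "Hello! My name is Bharath."
# Build the text-to-speech request (English by default).
tts = gTTS(text)
# Fetch the audio and write it to hi.mp3.
tts.save("hi.mp3")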

gTTS (Google Text-to-Speech) is a Python library and CLI tool that interfaces with the Google Translate text-to-speech API. We import the gTTS class from the gtts module and use it to convert text to speech.

The text variable is a string that holds the text to be spoken. You can replace the text inside the quotes with anything you like. Alternatively, you can use the input function so that the user types the desired text each time the program runs. This can be done as follows:

text = input("Enter your text: ")
tts = gTTS(text)
tts.save("user_input.mp3")
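Since the text now comes from the keyboard, it may be empty or contain only spaces, which gives gTTS nothing to speak. The sketch below is one possible way to guard against that; the helper names clean_text and speak_user_input are hypothetical and not part of gTTS, and the gTTS import is placed inside the function only so that the validation helper can be tried without the package installed.

```python
def clean_text(raw):
    """Strip surrounding whitespace and reject empty input (hypothetical helper)."""
    text = raw.strip()
    if not text:
        raise ValueError("Nothing to speak: the text is empty.")
    return text


def speak_user_input(filename="user_input.mp3"):
    """Ask for text, validate it, and save the speech. Needs network access."""
    from gtts import gTTS  # imported here so clean_text works without gTTS

    text = clean_text(input("Enter your text: "))
    gTTS(text).save(filename)
    return filename
```

Calling speak_user_input() prompts for the text and writes user_input.mp3 once non-empty text has been entered.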

The tts variable holds a gTTS object configured with the user's text. Creating the object only prepares the request; the audio itself is fetched from Google when the result is saved.

The tts.save method writes the converted speech to an audio file that can be played back. I have saved it in a file called hi with the .mp3 extension. Note that gTTS always writes MP3 data, so giving the file another extension such as .wav does not convert it; a separate audio tool is needed for other formats.
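Because gTTS writes MP3 data regardless of the file name, it can help to force the .mp3 extension before saving. The following is a minimal sketch under that assumption; mp3_path and save_speech are hypothetical helper names introduced here for illustration, and the gTTS import sits inside the function only so that the path helper can be used on its own.

```python
from pathlib import Path


def mp3_path(name):
    """Return `name` with its extension forced to .mp3 (hypothetical helper)."""
    return Path(name).with_suffix(".mp3")


def save_speech(text, name):
    """Synthesize `text` with gTTS and save it as MP3. Needs network access."""
    from gtts import gTTS  # imported here so mp3_path works without gTTS

    target = mp3_path(name)
    gTTS(text).save(str(target))
    return target
```

For example, save_speech("Hello! My name is Bharath.", "hi.wav") would write the audio to hi.mp3 rather than to a mislabeled hi.wav.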

We have now saved our audio file. There are 3 ways to play it, and I will go over each of them:

1. Directly running the saved file from the folder: The file is saved in the same location from which you ran your Python program, and you can open it there with any audio player.
2. Using the os module: We can use Python's os module to open the file directly from the program. The command below relies on Windows opening the file with its default player. The command to do so is as follows.

import os
# Windows only: the shell opens hi.mp3 with the default audio player.
os.system("hi.mp3")

3. Using the playsound module: The playsound module can be installed with a simple pip install playsound command, after which we can play the audio file using the following command.

from playsound import playsound
# Play the file saved from the user's input.
playsound("user_input.mp3")
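Passing a bare file name to the shell opens the default player on Windows but not on macOS or Linux. The sketch below shows one way the os-module approach could be made portable, assuming the standard open command on macOS and xdg-open on Linux desktops; opener_command and play_with_default_app are hypothetical names introduced here for illustration.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the command that opens `path` with the default app.

    Returns None on Windows, where os.startfile is used instead.
    """
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", path]      # macOS
    return ["xdg-open", path]      # assumed available on Linux desktops


def play_with_default_app(path):
    """Open the audio file with the system's default player."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)         # Windows-only function
    else:
        subprocess.run(command, check=True)
```

Calling play_with_default_app("hi.mp3") after saving the file should open it in whatever player the operating system associates with MP3 files.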

Conclusion:

The gTTS module also supports other languages such as French, German, and Hindi. This is extremely useful when a communication barrier prevents the user from conveying a message to others. Text-to-speech is also a great help to visually impaired people and people with other disabilities, since it can read text aloud for them. Many other project ideas are possible with the gTTS module.
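As a minimal sketch of the multi-language support mentioned above, gTTS accepts a lang argument with a language code such as "fr", "de", or "hi". The phrases, the GREETINGS dictionary, and the helper names below are assumptions chosen for illustration, and saving the files requires network access.

```python
# Example phrases keyed by gTTS language code (illustrative values only).
GREETINGS = {
    "fr": "Bonjour tout le monde",
    "de": "Guten Tag zusammen",
    "hi": "नमस्ते दुनिया",
}


def greeting_filename(lang):
    """Build the output file name for a language code (hypothetical helper)."""
    return f"greeting_{lang}.mp3"


def save_greetings(greetings=GREETINGS):
    """Save one MP3 per language. Needs network access."""
    from gtts import gTTS  # imported here so the helpers work without gTTS

    saved = []
    for lang, phrase in greetings.items():
        filename = greeting_filename(lang)
        gTTS(phrase, lang=lang).save(filename)
        saved.append(filename)
    return saved
```

Running save_greetings() would produce greeting_fr.mp3, greeting_de.mp3, and greeting_hi.mp3 in the current directory.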

There is potential for a lot of awesome projects with this module, and I encourage readers to experiment with it further. Readers can feel free to refer here for a cool project which I had previously done. In this series, we will see how to implement translation using deep learning and what role the gTTS module plays in it.