Converting Text to Speech with edge-tts

Michael阿明

Category: LLM / AI Applications. Tags: python, speech recognition, edge-tts

Article link: https://blog.csdn.net/qq_21201267/article/details/136800521

Reference: https://github.com/rany2/edge-tts (3.1k stars at the time of writing)

Key point: it is free, and TTS works without an API key.

Install: pip install edge-tts

It can be run from the command line:

$ edge-tts --text "Hello, world!" --write-media hello.mp3 --write-subtitles hello.vtt

Changing the rate, volume, and pitch:

$ edge-tts --rate=-50% --text "Hello, world!" --write-media hello_with_rate_halved.mp3 --write-subtitles hello_with_rate_halved.vtt
$ edge-tts --volume=-50% --text "Hello, world!" --write-media hello_with_volume_halved.mp3 --write-subtitles hello_with_volume_halved.vtt
$ edge-tts --pitch=-50Hz --text "Hello, world!" --write-media hello_with_pitch_halved.mp3 --write-subtitles hello_with_pitch_halved.vtt
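The same adjustments can be sketched from Python. This is a minimal sketch, not the library's documented recipe: it assumes that edge_tts.Communicate accepts rate, volume, and pitch keyword arguments mirroring the CLI flags, and that each value needs an explicit sign (for example "+20%" rather than "20%"). The helper names signed_percent, signed_hz, and save_with_prosody are hypothetical.

```python
def signed_percent(value: int) -> str:
    """Format an integer as a signed percentage, e.g. -50 -> '-50%'."""
    return f"{value:+d}%"


def signed_hz(value: int) -> str:
    """Format an integer as a signed frequency offset, e.g. -50 -> '-50Hz'."""
    return f"{value:+d}Hz"


async def save_with_prosody(
    text: str,
    voice: str,
    path: str,
    rate: int = 0,
    volume: int = 0,
    pitch: int = 0,
) -> None:
    """Synthesize text to an audio file with adjusted rate, volume, and pitch."""
    # Imported here so the formatting helpers stay usable without the package.
    import edge_tts

    # Assumption: these keyword names match the CLI flags shown above.
    communicate = edge_tts.Communicate(
        text,
        voice,
        rate=signed_percent(rate),
        volume=signed_percent(volume),
        pitch=signed_hz(pitch),
    )
    await communicate.save(path)
```

To try it, import asyncio and call asyncio.run(save_with_prosody("Hello, world!", "zh-CN-YunyangNeural", "hello_slow.mp3", rate=-50)). This needs network access, as the CLI does.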

It can also be used from code. The main APIs are:

edge_tts.Communicate(TEXT, VOICE)
Communicate.save, Communicate.stream

# _*_ coding: utf-8 _*_
# @Time : 2024/3/19 21:03
# @Author : Michael
# @File : edgeTTS.py
# @desc :
import asyncio
import random
import edge_tts


async def tts() -> None:
    communicate = edge_tts.Communicate(TEXT, VOICE)
    with open(OUTPUT_FILE, "wb") as file:
        async for chunk in communicate.stream():  # receive the response as a stream of chunks
            if chunk["type"] == "audio":
                file.write(chunk["data"])
            elif chunk["type"] == "WordBoundary":
                print(f"WordBoundary: {chunk}")

async def search_voice_tts() -> None:
    # Fetch the voice list so it can be filtered by attributes
    voices = await edge_tts.VoicesManager.create()
    # Find male Chinese voices for the mainland China locale
    voice = voices.find(Gender="Male", Language="zh", Locale="zh-CN")
    print(voice)
    # Pick one of the matching voices at random
    selected_voice = random.choice(voice)["Name"]
    print(selected_voice)
    # Use the voice printed above rather than drawing a second random one
    communicate = edge_tts.Communicate(TEXT, selected_voice)
    await communicate.save(OUTPUT_FILE)

async def tts_with_submaker() -> None:
    """Write the audio file and a matching subtitle file."""
    communicate = edge_tts.Communicate(TEXT, VOICE)
    submaker = edge_tts.SubMaker()
    with open(OUTPUT_FILE, "wb") as file:
        async for chunk in communicate.stream():
            if chunk["type"] == "audio":
                file.write(chunk["data"])
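            elif chunk["type"] == "WordBoundary":
                # create_sub/generate_subs are the SubMaker methods available when this
                # was written; later edge-tts releases may expose a different interface.
                submaker.create_sub((chunk["offset"], chunk["duration"]), chunk["text"])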

    with open(WEBVTT_FILE, "w", encoding="utf-8") as file:
        file.write(submaker.generate_subs())

if __name__ == "__main__":
    TEXT = "微软的 edge tts 好棒啊!"
    VOICE = "zh-CN-YunyangNeural"  # ShortName
    OUTPUT_FILE = "test1.mp3"
    WEBVTT_FILE = "test.vtt"
    # List the available voices, keeping only Chinese locales
    voices_options = asyncio.run(edge_tts.list_voices())
    voices_options = [voice for voice in voices_options if voice["Locale"].startswith("zh-")]
    print(voices_options)
    # Run tts
    asyncio.run(tts())
    # Run search_voice_tts, which picks a voice at random
    asyncio.run(search_voice_tts())
    # Run tts_with_submaker, which also generates subtitles
    asyncio.run(tts_with_submaker())
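
The chunk handling in tts() and tts_with_submaker() can also be pulled into a helper that works on any async stream of chunks, which makes it testable without network access. The sketch below is illustrative only: collect_chunks, ticks_to_timestamp, and fake_stream are hypothetical names, and it assumes that the "offset" and "duration" fields of a WordBoundary chunk are counted in 100-nanosecond units.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List, Tuple

# Assumption: WordBoundary offsets and durations are in 100-nanosecond units.
TICKS_PER_SECOND = 10_000_000


def ticks_to_timestamp(ticks: int) -> str:
    """Format a tick count as HH:MM:SS.mmm, the layout used by WebVTT cues."""
    total_ms = ticks * 1000 // TICKS_PER_SECOND
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


async def collect_chunks(
    stream: AsyncIterator[Dict[str, Any]],
) -> Tuple[bytes, List[Tuple[str, str, str]]]:
    """Gather audio bytes and (start, end, text) word timings from a chunk stream."""
    audio = bytearray()
    words: List[Tuple[str, str, str]] = []
    async for chunk in stream:
        if chunk["type"] == "audio":
            audio.extend(chunk["data"])
        elif chunk["type"] == "WordBoundary":
            start = chunk["offset"]
            end = start + chunk["duration"]
            words.append((ticks_to_timestamp(start), ticks_to_timestamp(end), chunk["text"]))
    return bytes(audio), words


async def fake_stream() -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical stand-in for Communicate.stream(), used for offline checks."""
    yield {"type": "audio", "data": b"\x00\x01"}
    yield {"type": "WordBoundary", "offset": 1_000_000, "duration": 5_000_000, "text": "Hello"}
    yield {"type": "audio", "data": b"\x02"}
```

In real use, pass edge_tts.Communicate(TEXT, VOICE).stream() instead of fake_stream(), then write the returned bytes to OUTPUT_FILE.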