API Improvements for the GPT-SoVITS Project


全泡方便面


Tags: gpt, python, speech recognition, fastapi, artificial intelligence

Article link: https://blog.csdn.net/polo_fang/article/details/140696031


GitHub repository for the modified code: AndrewFangZequan/GPT-SoVITS_improving_API, https://github.com/AndrewFangZequan/GPT-SoVITS_improving_API.git. The repository describes itself as follows: this is an improvement of the original GPT-SoVITS project, mainly focusing on api.py. It lets you change the GPT weights and the SoVITS weights while the API is in use, and lets you choose the emotion you expect. Both GET and POST are available.

1. Project overview

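This code supplements the GPT-SoVITS project. Its main purpose is to address the limited functionality of the original project's API.

The original project is here: GitHub - RVC-Boss/GPT-SoVITS: 1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

This api.py file can directly overwrite the api.py file of the original GPT-SoVITS project without affecting the rest of the project. The main improvements are: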

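An endpoint for switching the model (voice) while the server is running, available via both POST and GET.
An endpoint for switching emotions: reference audio clips are prepared in advance and each is assigned an emotion, so a given emotion can be selected quickly at inference time. Available via both POST and GET.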

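2. Usage

All of the original API endpoints are retained. For details on calling them, see the tutorial at the top of api.py, or this blog post: GPT-SoVITS local deployment and usage [detailed tutorial] (CSDN blog)

This section describes how to use the newly added endpoints.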

2.1 Switching the inference model

GET:

http://127.0.0.1:9880/change_weights?sovits_weight_path=<path to the SoVITS model>&gpt_weight_path=<path to the GPT model>

POST:

Request URL: http://127.0.0.1:9880/change_weights
Request JSON:
{
    "gpt_weight_path":"<path to the GPT model>",
    "sovits_weight_path":"<path to the SoVITS model>"
}

A response of ok means the switch succeeded.
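As a rough illustration, the two calls above could be built from Python with only the standard library. This is a sketch under assumptions: the helper names (`build_change_weights_get`, `build_change_weights_post`, `send`) and the example weight paths are hypothetical and not part of the project, and the address is the default one shown above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:9880"  # default address used in this article


def build_change_weights_get(gpt_path: str, sovits_path: str) -> urllib.request.Request:
    """Build the GET request; urlencode escapes spaces and non-ASCII characters."""
    query = urllib.parse.urlencode(
        {"sovits_weight_path": sovits_path, "gpt_weight_path": gpt_path}
    )
    return urllib.request.Request(f"{BASE_URL}/change_weights?{query}", method="GET")


def build_change_weights_post(gpt_path: str, sovits_path: str) -> urllib.request.Request:
    """Build the POST request with the two paths as a JSON body."""
    body = json.dumps(
        {"gpt_weight_path": gpt_path, "sovits_weight_path": sovits_path}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/change_weights",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Usage (requires a running server; the paths are hypothetical):
# print(send(build_change_weights_post("GPT_weights/example.ckpt",
#                                      "SoVITS_weights/example.pth")))
```

Building the query string with `urlencode` rather than by hand matters mostly for the GET form, since model paths often contain spaces or non-ASCII characters.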

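2.2 Switching emotions

Preparation: create a folder named after an emotion and put that emotion's reference audio in it, then rename the audio file to its reference text (the text spoken in the clip). Each folder should contain only one reference audio file; if there are several, the first one is selected automatically. Currently only Chinese reference audio is supported.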
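The preparation step can also be scripted. The sketch below is only an illustration: `prepare_emotion_folder` is a hypothetical helper, not part of the project, and it assumes the file extension of the source clip should be kept.

```python
import shutil
from pathlib import Path


def prepare_emotion_folder(root: str, emotion: str, audio_file: str, reference_text: str) -> Path:
    """Copy one reference clip into <root>/<emotion>/, named after its reference text."""
    folder = Path(root) / emotion
    folder.mkdir(parents=True, exist_ok=True)
    # The endpoint expects a single reference audio per folder, so refuse to add a second one.
    if any(folder.iterdir()):
        raise FileExistsError(f"{folder} already contains a file")
    # File name = reference text + original extension (for example ".wav").
    target = folder / f"{reference_text}{Path(audio_file).suffix}"
    shutil.copyfile(audio_file, target)
    return target


# Usage (hypothetical paths and text):
# prepare_emotion_folder("emotions", "happy", "recordings/clip01.wav", "今天天气真好")
```

The resulting folder path, for example `emotions/happy`, is what gets passed as the `emotions` parameter.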

GET:

http://127.0.0.1:9880/emotion?emotions=<emotion folder path>&text=<text to synthesize (any language)>&text_language=<language of the target text>&cut_punc=<splitting method>

POST:

Request URL: http://127.0.0.1:9880/emotion
Request JSON:
{
    "emotions":"<emotion folder path>",
    "text":"<text to synthesize (any language)>",
    "text_language":"<language of the target text>",
    "cut_punc":"<splitting method>"
}

Here cut_punc is optional, but the first three parameters must be supplied.
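A client for the POST form might look like the following sketch. It assumes the endpoint answers with an audio stream on success, as the implementation in section 3 suggests; `build_emotion_request` and `save_audio` are hypothetical names, and the language code in the usage comment is an assumed value.

```python
import json
import urllib.request
from typing import Optional


def build_emotion_request(
    emotion_dir: str,
    text: str,
    text_language: str,
    cut_punc: Optional[str] = None,
    base_url: str = "http://127.0.0.1:9880",
) -> urllib.request.Request:
    """Build the POST request; cut_punc is left out of the body when not given."""
    payload = {"emotions": emotion_dir, "text": text, "text_language": text_language}
    if cut_punc is not None:
        payload["cut_punc"] = cut_punc
    return urllib.request.Request(
        f"{base_url}/emotion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_audio(request: urllib.request.Request, out_path: str, chunk_size: int = 8192) -> int:
    """Send the request and write the streamed audio to disk; returns bytes written."""
    written = 0
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


# Usage (requires a running server; folder, text and language code are examples):
# save_audio(build_emotion_request("emotions/happy", "你好", "zh"), "happy.wav")
```

Reading the response in chunks avoids holding the whole clip in memory, which fits an endpoint that streams its output.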

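3. Code implementation

The code for switching models is as follows:

# These handlers rely on names already defined in api.py:
# app, Request, change_gpt_weights and change_sovits_weights.

# GET: both weight paths arrive as query parameters.
@app.get("/change_weights")
async def change_weights(
        sovits_weight_path: str = None,
        gpt_weight_path: str = None
):
    change_gpt_weights(gpt_weight_path)
    change_sovits_weights(sovits_weight_path)
    return "ok"

# POST: both weight paths arrive in the JSON body.
@app.post("/change_weights")
async def change_weight(request: Request):
    json_post_raw = await request.json()
    change_gpt_weights(json_post_raw.get("gpt_weight_path"))
    change_sovits_weights(json_post_raw.get("sovits_weight_path"))
    return "ok"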

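The code for switching emotions is as follows:

def handle_emotions(emotions, text, text_language, cut_punc):
    # List the emotion folder; sorting makes "the first file" deterministic.
    # An empty or missing `emotions` value must not fall through to os.listdir(None),
    # which would list the current working directory.
    file_list = sorted(os.listdir(emotions)) if emotions else []
    if file_list:
        # The file name without its extension is the reference text.
        # splitext keeps any periods that appear inside the text itself.
        prompt_text = os.path.splitext(file_list[0])[0]
        refer_wav_path = os.path.join(emotions, file_list[0])
        prompt_language = "zh"  # only Chinese reference audio is supported for now
    else:
        # No usable emotion folder: fall back to the default reference, if one is set.
        if not default_refer.is_ready():
            return JSONResponse({"code": 400, "message": "No emotion specified and the API has no preset reference"}, status_code=400)
        refer_wav_path, prompt_text, prompt_language = (
            default_refer.path,
            default_refer.text,
            default_refer.language,
        )

    # Split the text with the requested punctuation, or with the default when none is given.
    if cut_punc is None:
        text = cut_text(text, default_cut_punc)
    else:
        text = cut_text(text, cut_punc)

    return StreamingResponse(get_tts_wav(refer_wav_path, prompt_text, prompt_language, text, text_language), media_type="audio/"+media_type)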


# `emotions` is the path to the emotion folder. Each folder holds one reference
# audio file whose name is its reference text.
# The handlers have distinct names so they do not shadow each other
# or other handlers in api.py.

@app.get("/emotion")
async def emotion_endpoint_get(
        emotions: str = None,
        text: str = None,
        text_language: str = None,
        cut_punc: str = None,
):
    return handle_emotions(emotions, text, text_language, cut_punc)

@app.post("/emotion")
async def emotion_endpoint_post(request: Request):
    json_post_raw = await request.json()
    return handle_emotions(
        json_post_raw.get("emotions"),
        json_post_raw.get("text"),
        json_post_raw.get("text_language"),
        json_post_raw.get("cut_punc"),
    )

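Note that the handle_emotions() function must be added before the initialization section, while the endpoint code that follows it must be added after app = FastAPI().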
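The way an emotion folder is turned into a reference audio path and a reference text can be tried on its own, without loading any model. The helper below is a hypothetical sketch, not part of api.py. It assumes that sorted order defines which file counts as "first", and it uses `os.path.splitext` so that periods inside the reference text are preserved.

```python
import os
from typing import Optional, Tuple


def resolve_emotion_reference(emotion_dir: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (refer_wav_path, prompt_text) for an emotion folder, or None if unusable."""
    if not emotion_dir or not os.path.isdir(emotion_dir):
        return None
    # Consider regular files only, in sorted order so the choice is repeatable.
    files = sorted(
        name for name in os.listdir(emotion_dir)
        if os.path.isfile(os.path.join(emotion_dir, name))
    )
    if not files:
        return None
    first = files[0]
    # Strip only the final extension; the rest of the name is the reference text.
    prompt_text = os.path.splitext(first)[0]
    return os.path.join(emotion_dir, first), prompt_text


# Usage (hypothetical folder):
# print(resolve_emotion_reference("emotions/happy"))
```

A `None` result corresponds to the case where the server would need a default reference instead.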