Unity Lip-Sync Approaches Compared, and Integrating with a Flask Service

大只弱鱼

Last modified 2024-04-03 12:55:52

Tags: unity, game engine, python, flask, artificial intelligence

First published 2024-04-03 10:40:57

Article link: https://blog.csdn.net/weixin_38777563/article/details/137331315

Project background: lip-sync in Unity is currently driven mostly by Oculus, but its support for Chinese is poor.

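Approach	References	Required blendshape (BS) nodes	Pros	Cons
Oculus	oculuslipsync-unity official site	15 per the official site, 18 in practice	Free, widely used	Poor results for Chinese
Oculus + Vosk	Reference site	a i u e o	Free, shared by an individual developer, acceptable results for Chinese (the reference site has a demo video)	Requires setting up a separate audio conversion service
	GitHub repository
	Pretrained model
uLipSync	GitHub repository	a i u e o	Free, can be calibrated	In practice, results are not as good as Oculus + Vosk
	Demo video
CRI LipSync	Official site	a i u e o	Good results for Chinese	Expensive; a buyout license costs about 110,000
Unity3D Salsa LipSync	Bilibili video	a i u e o	N/A	Poor results for Chinese; costs $49.55
	Plugin page
Microsoft Azure	Code reference site	55 nodes (reference site)	Micro-expressions, compatible with ARKit	Paid; the documentation does not cover Unity, and results on the Unity side are not ideal
	What to do after obtaining visemes
	How to return the 55 nodes (caveats)
	Using a custom voice
	Custom voice: Lite vs. Pro differences
	Custom voice (workbench)
	Using a custom voice
ARKit	ARKit face BS	52 nodes	Face capture	N/A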

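After reviewing these options, the Unity + Flask setup (with Vosk doing the recognition on the server side) looked like the most dependable one.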

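Preparation:

Download the Vosk examples
Locate the Python example
Install Python; I used 3.12.2
Open D:\Vosk\python\example in VS Code
Install Flask: pip install Flask (the script below also imports vosk; if it is not installed yet, run pip install vosk)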

Write the Python service script, which listens for requests from Unity and parses the data:

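from flask import Flask, request
from vosk import Model, KaldiRecognizer

# Create the Flask application instance
app = Flask(__name__)

# Load the Vosk model once at startup instead of on every request
model = Model("model")


# Define routes and view functions
@app.route('/')
def hello():
    return 'Hello, World!'


@app.route('/api/unity_post_endpoint', methods=['POST'])
def receive_audio_bytes():
    print(request.files)
    # Reject requests that do not carry the expected form field
    if 'audioData' not in request.files:
        return 'Missing form field: audioData', 400
    try:
        audio_bytes = request.files['audioData'].read()
        result_json = process_data('flakOutJson', audio_bytes)
        # Save the received audio bytes to a file for later inspection
        with open('received_audio.wav', 'wb') as f:
            f.write(audio_bytes)
        return result_json, 200
    except Exception as e:
        # Always return a response, otherwise Flask raises on a None return value
        print(f"Error while processing audio: {e}")
        return str(e), 500


def process_data(text_input: str, byte_input: bytes) -> str:
    """Run Vosk on the audio bytes, write the JSON result to <text_input>.json and return it."""
    # 24000 is the audio sample rate in Hz; it is hard-coded here
    rec = KaldiRecognizer(model, 24000)
    rec.SetWords(True)
    rec.SetPartialWords(True)
    rec.AcceptWaveform(byte_input)
    output = rec.FinalResult()
    outputfilename = text_input + '.json'
    print(output)
    with open(outputfilename, mode='w', encoding='utf-8') as file_obj:
        file_obj.write(output)
    return output


# Start the Flask server when this file is run directly
if __name__ == '__main__':
    app.run(debug=True, port=6002)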
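
Because the script calls SetWords(True), the JSON string returned by FinalResult() carries a per-word list under "result", where each entry has "word", "start", "end" (in seconds) and "conf". The following is a minimal sketch of pulling those timings out, for example to schedule mouth shapes later. The function name extract_word_timings and the sample string are illustrative and not part of the original project.

```python
import json


def extract_word_timings(vosk_json: str) -> list[tuple[str, float, float]]:
    """Return (word, start_sec, end_sec) tuples from a Vosk result string."""
    data = json.loads(vosk_json)
    # "result" is absent when nothing was recognised, so default to an empty list
    return [(w["word"], w["start"], w["end"]) for w in data.get("result", [])]


# Hand-written sample in the shape Vosk produces; not real recogniser output
sample = '{"result": [{"conf": 1.0, "end": 0.42, "start": 0.12, "word": "你好"}], "text": "你好"}'
timings = extract_word_timings(sample)
```

How each word is then mapped to the a i u e o blendshapes is left open here, since the original post does not describe that step.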

Unity-side code

// Requires: using System; using System.Collections; using UnityEngine; using UnityEngine.Networking;
// WavUtility.ToAudioClip and SaveAsWav are helpers defined elsewhere in the project.
public IEnumerator SendAudioDataToServer(byte[] audioData, Action<bool, string, AudioClip> GetJsonDone)
{
    string apiUrl = "http://localhost:6002/api/unity_post_endpoint";

    // The field name "audioData" must match the key read from request.files on the Flask side
    var form = new WWWForm();
    form.AddBinaryData("audioData", audioData, "audioFile.wav", "audio/wav");

    using (var request = UnityWebRequest.Post(apiUrl, form))
    {
        yield return request.SendWebRequest();

        if (request.result != UnityWebRequest.Result.Success)
        {
            Debug.LogError($"Upload failed: {request.error}");
            GetJsonDone?.Invoke(false, request.error, null);
        }
        else
        {
            // Hand the recognition JSON and the playable clip back to the caller
            GetJsonDone?.Invoke(true, request.downloadHandler.text, WavUtility.ToAudioClip(audioData));

            string audioOutPutPath = Application.persistentDataPath + "/AzureSynthesizedAudio/outPut.wav";
            SaveAsWav(audioData, audioOutPutPath);
            Debug.Log($"Saved. Path: {audioOutPutPath}");
        }
    }
}
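
To exercise the Flask endpoint without launching Unity, the same kind of upload can be sent from a small Python script. This is a sketch using only the standard library; it assumes the server from this article is running on localhost:6002, and the helper names build_multipart and post_wav are made up for illustration. The form field name audioData and the file name audioFile.wav mirror the Unity code.

```python
import urllib.request
import uuid


def build_multipart(field: str, filename: str, payload: bytes, mime: str = "audio/wav"):
    """Encode one file field as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def post_wav(path: str, url: str = "http://localhost:6002/api/unity_post_endpoint") -> str:
    """Upload a WAV file to the Flask endpoint and return the response text."""
    with open(path, "rb") as f:
        body, content_type = build_multipart("audioData", "audioFile.wav", f.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

With the server running, calling post_wav with the path of any WAV file should return the same JSON text that Unity receives in request.downloadHandler.text.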

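The Python function process_data is my own. It takes the logic that the original demo ran from a .bat file and runs it whenever a UnityWebRequest message arrives. In rec = KaldiRecognizer(model, 24000), the value 24000 is the audio sample rate. Unity could read the rate and pass it along in the WWWForm, but here it is simply fixed at 24000.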
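
As an alternative to a fixed value, the sample rate can usually be read from the uploaded file itself. The sketch below assumes Unity sends a standard PCM WAV file with a normal header; read_wav and make_test_wav are hypothetical helper names, and make_test_wav exists only to exercise the parser.

```python
import io
import wave


def read_wav(audio_bytes: bytes) -> tuple[int, bytes]:
    """Return (sample_rate_hz, raw_pcm_frames) parsed from in-memory WAV bytes."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wf:
        return wf.getframerate(), wf.readframes(wf.getnframes())


def make_test_wav(rate: int, n_samples: int) -> bytes:
    """Build a silent 16-bit mono WAV in memory, only to exercise read_wav."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * n_samples)
    return buf.getvalue()


rate, pcm = read_wav(make_test_wav(24000, 100))
```

In the service, rate could then be passed to KaldiRecognizer(model, rate) and pcm to AcceptWaveform, which also keeps the WAV header bytes out of the recognizer input.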
Troubleshooting
1. Unity reports the error InvalidOperationException: Insecure connection not allowed
