[Qwen-Audio Deployment in Practice] Deploying and Testing a Chatbot with the Qwen-Audio-Chat Model

Introduction

Qwen-Audio-Chat is the chat-tuned member of the Qwen-Audio model family, built for conversations that combine audio and text input. This article walks through its deployment and testing workflow, with a focus on building and testing a web-based chatbot on top of the Qwen-Audio-Chat model.

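I. Environment Setup

Before starting, make sure you have a stable and sufficiently powerful runtime environment. One option is to rent a server on the AutoDL platform with at least 24 GB of GPU memory, for example an NVIDIA RTX 3090, to cover the heavy compute demands of running the model.

For the image, we suggest PyTorch 2.0.0 with Python 3.8 on Ubuntu 20.04, together with CUDA 11.8 (11.3 at minimum) to ensure compatibility with PyTorch and good performance. Once the server is rented, open it through the JupyterLab interface and use its terminal for the key steps: environment configuration, model download, and running the demo.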
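
Before installing anything, it can help to confirm that the rented instance matches the requirements above. The following is a minimal sketch and not part of the original setup: the helper names `parse_cuda_version` and `meets_requirements` are hypothetical, the thresholds simply restate the figures from this section (24 GB of GPU memory, CUDA 11.3 or later), and the 1 GB margin is an assumption to allow for cards that report slightly less than their nominal memory. The PyTorch calls are only attempted if `torch` can be imported.

```python
def parse_cuda_version(version: str) -> tuple:
    """Turn a CUDA version string such as '11.8' into (11, 8)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_requirements(vram_gb: float, cuda_version: str,
                       min_vram_gb: float = 24.0,
                       min_cuda: tuple = (11, 3),
                       tolerance_gb: float = 1.0) -> bool:
    """Hypothetical helper: compare a GPU against this section's thresholds.

    tolerance_gb is an assumed margin, because the memory a driver reports
    can be a little below the nominal size of the card.
    """
    enough_vram = vram_gb + tolerance_gb >= min_vram_gb
    new_enough_cuda = parse_cuda_version(cuda_version) >= min_cuda
    return enough_vram and new_enough_cuda


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
    else:
        if torch.cuda.is_available():
            props = torch.cuda.get_device_properties(0)
            vram_gb = props.total_memory / 1024 ** 3
            print(props.name, f"{vram_gb:.1f} GiB", "CUDA", torch.version.cuda)
            ok = meets_requirements(vram_gb, torch.version.cuda)
            print("Looks sufficient" if ok else "Below the suggested spec")
        else:
            print("No CUDA device is visible to PyTorch.")
```

Running the script in the JupyterLab terminal prints the detected GPU and whether it clears the suggested thresholds.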

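II. Installing Dependencies

In the terminal, run the following commands to switch the pip index and install the required packages.

1. Upgrade pip and switch the index

# Upgrade pip
python -m pip install --upgrade pip
# Switch to the Tsinghua PyPI mirror to speed up package installation
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple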

  
  

2. Install the base packages

# Install common scientific-computing and machine-learning libraries
pip install scipy torchvision pillow tensorboard matplotlib

  
  

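3. Install specific tools and pinned versions

# Install packages for model management and optimization
pip install modelscope==1.9.5 accelerate tiktoken einops transformers_stream_generator==0.0.4
# Install the pinned versions of Transformers and Gradio used to deploy the model
pip install transformers==4.32.0 gradio==3.39.0 nest_asyncio
# pydub is imported by web_demo_audio.py in section IV
pip install pydub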

  
  
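Because several packages above are pinned to exact versions, a quick check after installation can catch a mismatch early. The sketch below is an illustration rather than part of the original steps: `PINNED` and `find_mismatches` are hypothetical names, the version strings are copied from the pip commands in this section, and the lookup relies on the standard-library `importlib.metadata`. How distribution names with underscores are matched can vary between Python versions, so treat a "not found" result for `transformers_stream_generator` with some caution.

```python
from importlib import metadata

# Pins copied from the pip commands in this section.
PINNED = {
    "modelscope": "1.9.5",
    "transformers_stream_generator": "0.0.4",
    "transformers": "4.32.0",
    "gradio": "3.39.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    if not problems:
        print("All pinned packages match.")
    for name, (expected, installed) in problems.items():
        print(f"{name}: expected {expected}, found {installed}")
```

The `get_version` parameter exists only so the lookup can be swapped out, for example when checking the logic without the packages installed.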

4. Install ffmpeg

Open a terminal and run the following commands to install ffmpeg:

sudo apt update
sudo apt install ffmpeg

  
  
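The demo script in section IV imports pydub, which calls the ffmpeg executable to decode and export audio in formats other than plain WAV, so it is worth confirming that ffmpeg is reachable from Python. The following is a small sketch under that assumption; `ffmpeg_version_line` is a hypothetical helper name and uses only the standard library.

```python
import shutil
import subprocess


def ffmpeg_version_line(executable: str = "ffmpeg"):
    """Return the first line of `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], capture_output=True,
                            text=True, check=False)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    print(ffmpeg_version_line() or "ffmpeg was not found on PATH")
```

If the script reports that ffmpeg was not found, rerun the apt commands above before starting the web demo.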

With these steps complete, pip is upgraded, the package index points to a faster mirror, and the required Python packages and libraries are installed.

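III. Downloading the Model

1. Prepare the download script

Use the snapshot_download function from modelscope to download the model. The first argument is the model name, and the cache_dir argument is the download path.
Create a file named d.py under /root/autodl-tmp with the following content:

import torch
from modelscope import snapshot_download, AutoModel, AutoTokenizer
from modelscope import GenerationConfig

# Download the model files into /root/autodl-tmp
model_dir = snapshot_download('qwen/Qwen-Audio-Chat', cache_dir='/root/autodl-tmp', revision='master')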

  
  

2. Run the download

Run python /root/autodl-tmp/d.py to start the download. The model is about 20 GB, and the download takes roughly 10 to 20 minutes.
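
Since the download is large, one way to tell whether it finished is to compare the size on disk with the roughly 20 GB mentioned above. The sketch below is an illustration only: `dir_size_gb` is a hypothetical helper, and the directory path is assumed from the cache_dir and model name used in d.py (it is the same path the demo script later uses as its default checkpoint path).

```python
import os


def dir_size_gb(root: str) -> float:
    """Sum the sizes of all regular files under `root`, in GB (10**9 bytes)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):  # skips broken links and special files
                total += os.path.getsize(path)
    return total / 10 ** 9


if __name__ == "__main__":
    # Assumed location: cache_dir from d.py plus the model name.
    model_dir = "/root/autodl-tmp/qwen/Qwen-Audio-Chat"
    if os.path.isdir(model_dir):
        print(f"{model_dir}: {dir_size_gb(model_dir):.1f} GB")
    else:
        print(f"{model_dir} does not exist yet")
```

A total far below 20 GB suggests the download was interrupted and d.py should be run again.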

IV. Preparing the Chatbot Code

Create a file named web_demo_audio.py under /root/autodl-tmp with the following content:

# Copyright (c) Alibaba Cloud.
#
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

"""A simple web interactive chat demo based on gradio."""

from argparse import ArgumentParser
from pathlib import Path

import copy
import gradio as gr
import os
import re
import secrets
import tempfile
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import GenerationConfig
from pydub import AudioSegment

# DEFAULT_CKPT_PATH = 'Qwen/Qwen-Audio-Chat'
DEFAULT_CKPT_PATH = "/root/autodl-tmp/qwen/Qwen-Audio-Chat"


def _get_args():
    parser = ArgumentParser()
    parser.add_argument("-c", "--checkpoint-path", type=str, default=DEFAULT_CKPT_PATH,
                        help="Checkpoint name or path, default to %(default)r")
    parser.add_argument("--cpu-only", action="store_true", help="Run demo with CPU only")

    parser.add_argument("--share", action="store_true", default=False,
                        help="Create a publicly shareable link for the interface.")
    parser.add_argument("--inbrowser", action="store_true", default=False,
                        help="Automatically launch the interface in a new tab on the default browser.")
    parser.add_argument("--server-port", type=int, default=6006,
                        help="Demo server port.")
    parser.add_argument("--server-name", type=str, default="127.0.0.1",
                        help="Demo server name.")

    args = parser.parse_args()
    return args


def _load_model_tokenizer(args):
    tokenizer = AutoTokenizer.from_pretrained(
        args.checkpoint_path, trust_remote_code=True, resume_download=True,
    )

    if args.cpu_only:
        device_map = "cpu"
    else:
        device_map = "cuda"

    model = AutoModelForCausalLM.from_pretrained(
        args.checkpoint_path,
        device_map=device_map,
        trust_remote_code=True,
        resume_download=True,
    ).eval()
    model.generation_config = GenerationConfig.from_pretrained(
        args.checkpoint_path, trust_remote_code=True, resume_download=True,
    )

    return model, tokenizer


def _parse_text(text):
    # Convert model output to HTML for the chat window: fenced code blocks
    # become <pre><code>, and special characters inside them are escaped.
    lines = text.split("\n")
    lines = [line for line in lines if line != ""]
    count = 0
    for i, line in enumerate(lines):
        if "```" in line:
            count += 1
            items = line.split("`")
            if count % 2 == 1:
                lines[i] = f'<pre><code class="language-{items[-1]}">'
            else:
                lines[i] = f"<br></code></pre>"
        else:
            if i > 0:
                if count % 2 == 1:
                    line = line.replace("`", r"\`")
                    line = line.replace("<", "&lt;")
                    line = line.replace(">", "&gt;")
                    line = line.replace(" ", "&nbsp;")
                    line = line.replace("*", "&ast;")
                    line = line.replace("_", "&lowbar;")
                    line = line.replace("-", "&#45;")
                    line = line.replace(".", "&#46;")
                    line = line.replace("!", "&#33;")
                    line = line.replace("(", "&#40;")
                    line = line.replace(")", "&#41;")
                    line = line.replace("$", "&#36;")
                lines[i] = "<br>" + line
    text = "".join(lines)
    return text


def _launch_demo(args, model, tokenizer):
    uploaded_file_dir = os.environ.get("GRADIO_TEMP_DIR") or str(
        Path(tempfile.gettempdir()) / "gradio"
    )

    def predict(_chatbot, task_history):
        query = task_history[-1][0]
        print("User: " + _parse_text(query))
        history_cp = copy.deepcopy(task_history)
        full_response = ""

        history_filter = []
        audio_idx = 1
        pre = ""
        global last_audio
        # Merge each uploaded audio file into the text turn that follows it
        for i, (q, a) in enumerate(history_cp):
            if isinstance(q, (tuple, list)):
                last_audio = q[0]
                q = f'Audio {audio_idx}: <audio>{q[0]}</audio>'
                pre += q + '\n'
                audio_idx += 1
            else:
                pre += q
                history_filter.append((pre, a))
                pre = ""
        history, message = history_filter[:-1], history_filter[-1][0]
        response, history = model.chat(tokenizer, message, history=history)
        # Timestamps in the response look like <|1.23|>
        ts_pattern = r"<\|\d{1,2}\.\d+\|>"
        all_time_stamps = re.findall(ts_pattern, response)
        print(response)
        if (len(all_time_stamps) > 0) and (len(all_time_stamps) % 2 == 0) and last_audio:
            ts_float = [float(t.replace("<|", "").replace("|>", "")) for t in all_time_stamps]
            ts_float_pair = [ts_float[i:i + 2] for i in range(0, len(all_time_stamps), 2)]
            # Load the audio file
            format = os.path.splitext(last_audio)[-1].replace(".", "")
            audio_file = AudioSegment.from_file(last_audio, format=format)
            chat_response_t = response.replace("<|", "").replace("|>", "")
            chat_response = chat_response_t
            temp_dir = secrets.token_hex(20)
            temp_dir = Path(uploaded_file_dir) / temp_dir
            temp_dir.mkdir(exist_ok=True, parents=True)
            # Cut a clip for each (start, end) timestamp pair, in milliseconds
            for pair in ts_float_pair:
                audio_clip = audio_file[pair[0] * 1000: pair[1] * 1000]
                # Save the clip
                name = f"tmp{secrets.token_hex(5)}.{format}"
                filename = temp_dir / name
                audio_clip.export(filename, format=format)
                _chatbot[-1] = (_parse_text(query), chat_response)
                _chatbot.append((None, (str(filename),)))
        else:
            _chatbot[-1] = (_parse_text(query), response)

        full_response = _parse_text(response)

        task_history[-1] = (query, full_response)
        print("Qwen-Audio-Chat: " + _parse_text(full_response))
        return _chatbot

    def regenerate(_chatbot, task_history):
        if not task_history:
            return _chatbot
        item = task_history[-1]
        if item[1] is None:
            return _chatbot
        task_history[-1] = (item[0], None)
        chatbot_item = _chatbot.pop(-1)
        if chatbot_item[0] is None:
            _chatbot[-1] = (_chatbot[-1][0], None)
        else:
            _chatbot.append((chatbot_item[0], None))
        return predict(_chatbot, task_history)

    def add_text(history, task_history, text):
        history = history + [(_parse_text(text), None)]
        task_history = task_history + [(text, None)]
        return history, task_history, ""

    def add_file(history, task_history, file):
        history = history + [((file.name,), None)]
        task_history = task_history + [((file.name,), None)]
        return history, task_history

    def add_mic(history, task_history, file):
        if file is None:
            return history, task_history
        # Give the microphone recording a .wav extension
        os.rename(file, file + '.wav')
        print("add_mic file:", file)
        print("add_mic history:", history)
        print("add_mic task_history:", task_history)
        # history = history + [((file.name,), None)]
        # task_history = task_history + [((file.name,), None)]
        task_history = task_history + [((file + '.wav',), None)]
        history = history + [((file + '.wav',), None)]
    <span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">"task_history"</span><span class="token punctuation">,</span> task_history<span class="token punctuation">)</span>
    <span class="token keyword">return</span> history<span class="token punctuation">,</span> task_history

<span class="token keyword">def</span> <span class="token function">reset_user_input</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">:</span>
    <span class="token keyword">return</span> gr<span class="token punctuation">.</span>update<span class="token punctuation">(</span>value<span class="token operator">=</span><span class="token string">""</span><span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">reset_state</span><span class="token punctuation">(</span>task_history<span class="token punctuation">)</span><span class="token punctuation">:</span>
    task_history<span class="token punctuation">.</span>clear<span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token punctuation">[</span><span class="token punctuation">]</span>

with gr.Blocks() as demo:
    gr.Markdown("""\
<p align="center"><img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Audio/logo.jpg" style="height: 80px"/><p>""")  ## todo
    gr.Markdown("""<center><font size=8>Qwen-Audio-Chat Bot</center>""")
    gr.Markdown(
        """
<center><font size=3>This WebUI is based on Qwen-Audio-Chat, developed by Alibaba Cloud.
(本WebUI基于Qwen-Audio-Chat打造,实现聊天机器人功能。)</center>"""
    )
    gr.Markdown("""
<center><font size=4>Qwen-Audio <a href="https://modelscope.cn/models/qwen/Qwen-Audio/summary">🤖 </a>
| <a href="https://huggingface.co/Qwen/Qwen-Audio">🤗</a>&nbsp |
Qwen-Audio-Chat <a href="https://modelscope.cn/models/qwen/Qwen-Audio-Chat/summary">🤖 </a> |
<a href="https://huggingface.co/Qwen/Qwen-Audio-Chat">🤗</a>&nbsp |
&nbsp<a href="https://github.com/QwenLM/Qwen-Audio">Github</a></center>"""
    )

    chatbot = gr.Chatbot(label='Qwen-Audio-Chat', elem_classes="control-height", height=750)
    query = gr.Textbox(lines=2, label='Input')
    task_history = gr.State([])
    mic = gr.Audio(source="microphone", type="filepath")

    with gr.Row():
        empty_bin = gr.Button("🧹 Clear History (清除历史)")
        submit_btn = gr.Button("🚀 Submit (发送)")
        regen_btn = gr.Button("🤔️ Regenerate (重试)")
        addfile_btn = gr.UploadButton("📁 Upload (上传文件)", file_types=["audio"])

    mic.change(add_mic, [chatbot, task_history, mic], [chatbot, task_history])
    submit_btn.click(add_text, [chatbot, task_history, query], [chatbot, task_history]).then(
        predict, [chatbot, task_history], [chatbot], show_progress=True
    )
    submit_btn.click(reset_user_input, [], [query])
    empty_bin.click(reset_state, [task_history], [chatbot], show_progress=True)
    regen_btn.click(regenerate, [chatbot, task_history], [chatbot], show_progress=True)
    addfile_btn.upload(add_file, [chatbot, task_history, addfile_btn], [chatbot, task_history], show_progress=True)

    gr.Markdown("""\
<font size=2>Note: This demo is governed by the original license of Qwen-Audio.
We strongly advise users not to knowingly generate or allow others to knowingly generate harmful content,
including hate speech, violence, pornography, deception, etc.
(注:本演示受Qwen-Audio的许可协议限制。我们强烈建议,用户不应传播及不应允许他人传播以下内容,
包括但不限于仇恨言论、暴力、色情、欺诈相关的有害信息。)""")

demo.queue().launch(
    share=args.share,
    inbrowser=args.inbrowser,
    server_port=args.server_port,
    server_name=args.server_name,
    file_directories=["/tmp/"]
)

def main():
    args = _get_args()

    model, tokenizer = _load_model_tokenizer(args)

    _launch_demo(args, model, tokenizer)

if __name__ == '__main__':
    main()
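
The upload and reset handlers above share one data shape: each history entry is a `(user_message, bot_reply)` pair, and a file message is a one-element tuple holding the file path. The following is a minimal sketch of that shape that runs without Gradio or the model. The `SimpleNamespace` object and the path `/tmp/gradio/sample.flac` are hypothetical stand-ins for what the upload button would pass in.

```python
from types import SimpleNamespace


def add_file(history, task_history, file):
    # Same logic as add_file in the script: a file message is a 1-tuple with the path.
    history = history + [((file.name,), None)]
    task_history = task_history + [((file.name,), None)]
    return history, task_history


def reset_state(task_history):
    # Clears the shared list in place and returns an empty chatbot display.
    task_history.clear()
    return []


# Hypothetical stand-in for the uploaded-file object; only .name is used.
uploaded = SimpleNamespace(name="/tmp/gradio/sample.flac")

history, task_history = add_file([], [], uploaded)
print(history)  # [(('/tmp/gradio/sample.flac',), None)]

cleared = reset_state(task_history)
print(cleared, task_history)  # [] []
```

The `None` in the second slot marks a turn that has no reply yet.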


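V. Running the Chatbot

1. Change the default port

Find the default port setting in the code above and change it to 6006.
[Screenshot: the default port setting in the script]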
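
As a rough illustration of this change, the sketch below defines the port as a command-line argument with 6006 as its default. The script reads `args.server_port` and `args.server_name`, but the flag names `--server-port` and `--server-name` and the helper `build_parser` are assumptions here, so check them against `_get_args()` in your copy of the script.

```python
from argparse import ArgumentParser


def build_parser():
    # Hypothetical helper; the real script defines its arguments in _get_args().
    parser = ArgumentParser()
    # Assumed flag names. The default port is set to 6006 as described above.
    parser.add_argument("--server-port", type=int, default=6006,
                        help="Demo server port.")
    parser.add_argument("--server-name", type=str, default="127.0.0.1",
                        help="Demo server name.")
    return parser


# With no command-line flags, the default applies.
args = build_parser().parse_args([])
print(args.server_port)  # 6006
```

If the script does expose such a flag, passing it on the command line would be an alternative to editing the default.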

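2. Start the web chatbot

Run the following command to start the chatbot:

python /root/autodl-tmp/web_demo_audio.py

A successful start looks like this:
[Screenshot: terminal output after a successful start]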

3. Port forwarding

Use the AutoDL SSH tunnel tool to forward the server port to your local machine.
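
If the graphical tool is not available, a plain `ssh` local port forward can do the same job. The sketch below only builds the command. The host `your-instance.example.com`, the SSH port `12345`, and the helper `build_tunnel_command` are placeholders, so substitute the login details shown for your own instance.

```python
import shlex


def build_tunnel_command(ssh_host, ssh_port, local_port=6006,
                         remote_port=6006, user="root"):
    # -C compresses traffic, -N runs no remote command,
    # -g lets other hosts use the forwarded port,
    # -L forwards local_port to remote_port on the server.
    return [
        "ssh", "-C", "-N", "-g",
        "-L", f"{local_port}:127.0.0.1:{remote_port}",
        f"{user}@{ssh_host}",
        "-p", str(ssh_port),
    ]


# Placeholder host and port; replace with your instance's SSH details.
cmd = build_tunnel_command("your-instance.example.com", 12345)
print(shlex.join(cmd))
```

Run the printed command in a local terminal and keep it open while you use the web page.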

4. Open the web chat interface

Open http://localhost:6006/ in a browser to see the chat interface.
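
If the page does not load, it can help to check first whether anything is listening on the forwarded port. This is a small sketch using only the standard library. The helper name `is_port_open` is made up for illustration.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    # Try a TCP connection; refused connections and timeouts both count as closed.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# True once the tunnel and the demo server are both running.
print(is_port_open("localhost", 6006))
```

A result of `False` usually points at the tunnel or the server process rather than the browser.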

5. Plain text chat

[Screenshot: a plain text conversation with the chatbot]

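6. Audio file recognition

1) Click the "Upload" button and upload the sample audio file prepared earlier:
https://github.com/QwenLM/Qwen-Audio/raw/main/assets/audio/1272-128104-0000.flac

2) Chat about the audio file
Ask questions about the uploaded audio to check whether the model recognizes its content accurately.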
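
To get the sample file onto your local machine before uploading it, one option is a short download script. The sketch below uses only the standard library. The helper names `local_name` and `download` are made up for illustration, and the download itself needs network access to GitHub.

```python
import os
import urllib.request
from urllib.parse import urlparse

SAMPLE_URL = ("https://github.com/QwenLM/Qwen-Audio/raw/main/"
              "assets/audio/1272-128104-0000.flac")


def local_name(url):
    # Use the last path segment of the URL as the local file name.
    return os.path.basename(urlparse(url).path)


def download(url, target_dir="."):
    # Skip the download when the file is already present.
    path = os.path.join(target_dir, local_name(url))
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path


print(local_name(SAMPLE_URL))  # 1272-128104-0000.flac
```

Calling `download(SAMPLE_URL)` saves the file to the current directory and returns its path, which you can then pick in the upload dialog.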

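Conclusion

This walkthrough ends here, but putting Qwen-Audio-Chat to practical use is only beginning. Following the steps in this article, we deployed a chatbot based on the model and gained a clearer picture of how an intelligent dialogue system is put together.

As the technology evolves, we expect chatbots to prove their value in more scenarios. Readers are encouraged to keep experimenting and to help push dialogue technology toward more natural, more intelligent interaction.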


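😎 About the author: I am 寻道AI小兵, a veteran programmer with 10+ years in the industry and an internet systems architect, currently focused on exploring AIGC.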