Qwen-Agent Local Deployment
- Project overview: Qwen-Agent is a framework for developing LLM applications on top of Qwen's instruction-following, tool-use, planning, and memory capabilities. It also ships with several example applications, such as a browser assistant, a code interpreter, and custom assistants. Project page: https://github.com/QwenLM/Qwen-Agent (agent framework and applications built upon Qwen, featuring Function Calling, Code Interpreter, RAG, and a Chrome extension).
- Prerequisites (local machine): Anaconda, CUDA 12.2, Git, and a network proxy for reaching GitHub/PyPI (or download the required files manually and place them in the appropriate directories).
- Local deployment:
  - Create a virtual environment:
    - Open the Anaconda Prompt.
    - conda create -n qwen-agent python=3.10 -y  # create the virtual environment
    - conda activate qwen-agent  # activate the virtual environment
  - Install PyTorch: pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121  # command taken from the official site; note that torchvision 0.17.0 and torchaudio 2.2.0 pair with torch 2.2.0, not 2.0. If your CUDA version differs, get the matching command from the "Previous PyTorch Versions" page at pytorch.org.
  - Install flash-attention (this step is slow; allow at least two hours):
    - Open Git Bash in your projects directory.
    - git clone https://github.com/Dao-AILab/flash-attention.git  # clone flash-attention locally
    - Back in the Anaconda Prompt, cd flash-attention to enter the project.
    - pip uninstall -y ninja  # skip this if ninja was never installed
    - pip install ninja  # install ninja
    - pip install -U "gradio>=4.0" "modelscope-studio>=0.2.1"  # pin gradio to the officially recommended version, otherwise unexpected errors may occur
    - pip install .  # build and install flash-attention (a quick environment check follows this list)
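Before continuing, it is worth confirming that PyTorch can see the GPU and that flash-attention built correctly. A minimal sanity check, assuming the installs above succeeded:

```python
# Run inside the qwen-agent environment.
import torch
import flash_attn  # raises ImportError if the flash-attention build failed

print(torch.__version__)          # expect 2.2.0 (or whichever version you installed)
print(torch.cuda.is_available())  # expect True on a working CUDA setup
```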
- Deploy the Qwen model service
  - Install dependencies:
    - In the same Git Bash window, run git clone https://github.com/QwenLM/Qwen.git to clone the project.
    - Back in the Anaconda Prompt, cd Qwen to enter the project.
    - pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple  # install dependencies
    - pip install fastapi uvicorn openai "pydantic>=2.3.0" sse_starlette
  - Start the model service, using the -c flag to choose the model version:
    - Passing --server-name 0.0.0.0 lets other machines access your model service.
    - Passing --server-name 127.0.0.1 restricts access to the machine hosting the model service.
    - Example: python openai_api.py --server-name 0.0.0.0 --server-port 7905 -c qwen/Qwen-7B-Chat
    - The -c flag currently supports the following models, ordered from smallest to largest GPU memory footprint (a client-side smoke test follows this list):
      qwen/Qwen-7B-Chat-Int4
      qwen/Qwen-7B-Chat
      qwen/Qwen-14B-Chat-Int4
      qwen/Qwen-14B-Chat
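Because openai_api.py serves an OpenAI-compatible API, the service can be smoke-tested with a plain HTTP request once it is running. A sketch, assuming the example launch above (port 7905); the model name in the payload is illustrative and should match whatever you passed via -c:

```python
import requests

# POST to the OpenAI-compatible chat endpoint served by openai_api.py.
resp = requests.post(
    'http://127.0.0.1:7905/v1/chat/completions',
    json={
        'model': 'Qwen-7B-Chat',  # illustrative; match the -c model you launched
        'messages': [{'role': 'user', 'content': '你好'}],
    },
)
print(resp.json()['choices'][0]['message']['content'])
```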
- Deploy Qwen-Agent
  - Install dependencies:
    - git clone https://github.com/QwenLM/Qwen-Agent.git
    - cd Qwen-Agent
    - pip install -r requirements.txt
  - Start the database service, pointing --model_server at the model service you deployed in the step above:
    - If the machine from the previous step has IP 123.45.67.89, pass --model_server http://123.45.67.89:7905/v1.
    - If both steps run on the same machine, pass --model_server http://127.0.0.1:7905/v1.
    - python run_server.py --model_server http://127.0.0.1:7905/v1 --workstation_port 7864  (the model service from the previous step must still be running; see the reachability check below)
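Optionally, you can confirm from Python that both services are reachable before opening the browser. A sketch, assuming the default ports above and that openai_api.py exposes the standard OpenAI /v1/models route:

```python
import requests

# Model service (OpenAI-compatible) and the workstation front end.
for url in ('http://127.0.0.1:7905/v1/models', 'http://127.0.0.1:7864/'):
    print(url, requests.get(url, timeout=5).status_code)  # expect 200 for both
```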
- Access Qwen-Agent in the browser
  Open http://127.0.0.1:7864/ to use the Workstation's writing mode (Editor) and dialogue mode (Chat).
- Install the browser assistant
  Install the BrowserQwen Chrome plugin (i.e., Chrome extension):
  - Open Chrome and go to the extension management page.
  - Make sure Developer mode (top right) is enabled, then click "Load unpacked", select the browser_qwen directory in this project, and enable the extension.
  - Click the extensions icon in the top-right corner of Chrome and pin BrowserQwen to the toolbar.
  Note: after installing the extension, refresh the page for it to take effect.
  When you want Qwen to read the content of the current web page:
  - Open the page and click the "Add to Qwen's Reading List" button to authorize Qwen to analyze the page in the background.
  - Then click the Qwen icon in the browser's extensions bar to discuss the current page with Qwen.
Reproducing the examples
(The source code can be found on GitHub, or via the CSDN article 【通义千问—Qwen-Agent系列1】Qwen-Agent 快速开始&使用和开发过程.)
Preparation:
- Open the Anaconda Prompt.
- conda activate qwen-agent  # enter the virtual environment created above
- D:
- cd D:\liu\Project\qwen\Qwen-Agent  # enter your local Qwen-Agent folder
- Open the Alibaba Cloud website, log in, search for the DashScope model service, and follow the prompts to enable it.
- In the left-hand menu, under the management center, open API-KEY management.
- Create a new API key and save it immediately (it cannot be viewed again once the page is closed; if you lose it, you can only create a new one). A snippet for loading the key follows this list.
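Rather than hard-coding the key in each script, you can expose it via the DASHSCOPE_API_KEY environment variable, which the examples below fall back to when api_key is not set in the config. On Windows you can also run set DASHSCOPE_API_KEY=sk-xxxx in the Anaconda Prompt before launching Python; a sketch for setting it inside the current Python process:

```python
import os

# Illustrative placeholder; paste the key you just created.
os.environ['DASHSCOPE_API_KEY'] = 'sk-xxxx'
```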
Case 1: create an agent that can read PDF files and use tools, and build a custom tool along the way. In detail:
- Add a custom tool: an image generation tool.
- Configure the LLM to be used.
- Create the agent; here we use the "Assistant" agent as an example, which can use tools and read files.
- Run the assistant as a chatbot.
- Create a new file, e.g. 1.py, and paste the following code into it:
import pprint
import urllib.parse

import json5

from qwen_agent.agents import Assistant
from qwen_agent.tools.base import BaseTool, register_tool


# Step 1 (Optional): Add a custom tool named `my_image_gen`.
@register_tool('my_image_gen')
class MyImageGen(BaseTool):
    # The `description` tells the agent the functionality of this tool.
    description = 'AI painting (image generation) service, input text description, and return the image URL drawn based on text information.'
    # The `parameters` tell the agent what input parameters the tool has.
    parameters = [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Detailed description of the desired image content, in English',
        'required': True
    }]

    def call(self, params: str, **kwargs) -> str:
        # `params` are the arguments generated by the LLM agent.
        prompt = json5.loads(params)['prompt']
        # URL-encode the prompt.
        prompt = urllib.parse.quote(prompt)
        return json5.dumps(
            {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},
            ensure_ascii=False)


# Step 2: Configure the LLM you are using.
# This is where the model is configured: fill in the model name and the
# model_server hosting the model; alternatively, an api_key can be used.
llm_cfg = {
    # Use the model service provided by DashScope:
    # model: the model name
    # model_server: the service hosting the model
    # api_key: the API key to use; set it explicitly here or read it from an environment variable
    'model': 'qwen-max',
    'model_server': 'dashscope',
    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY` environment variable if 'api_key' is not set here.

    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:
    # 'model': 'Qwen1.5-7B-Chat',
    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base
    # 'api_key': 'EMPTY',

    # (Optional) LLM hyperparameters for generation:
    'generate_cfg': {
        'top_p': 0.8
    }
}

# Step 3: Create an agent. Here we use the `Assistant` agent as an example,
# which is capable of using tools and reading files.
# The agent's system prompt:
system_instruction = '''You are a helpful assistant.
After receiving the user's request, you should:
- first draw an image and obtain the image url,
- then run code `request.get(image_url)` to download the image,
- and finally select an image operation from the given document to process the image.
Please show the image using `plt.show()`.'''
# The tool list the Assistant may access: one custom tool plus the code interpreter.
tools = ['my_image_gen', 'code_interpreter']  # `code_interpreter` is a built-in tool for executing code.
# File paths the assistant may read.
files = ['./examples/resource/doc.pdf']  # Give the bot a PDF file to read.
# Initialize the Assistant.
bot = Assistant(llm=llm_cfg,
                system_message=system_instruction,
                function_list=tools,
                files=files)

# Step 4: Run the agent as a chatbot.
messages = []  # This stores the chat history.
while True:
    # For example, enter the query "draw a dog and rotate it 90 degrees".
    query = input('user query: ')
    # Append the user query to the chat history.
    messages.append({'role': 'user', 'content': query})
    response = []
    for response in bot.run(messages=messages):
        # Streaming output.
        print('bot response:')
        pprint.pprint(response, indent=2)
    # Append the bot responses to the chat history.
    messages.extend(response)
- In Step 2 of the code, uncomment the api_key line (delete the leading #) and enter your own API key.
- In the Anaconda Prompt, run python plus the file name, e.g. python 1.py.
- Enter the sample query: draw a dog and rotate it 90 degrees.
- 小狗图片
- 旋转90度
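The custom tool can also be exercised on its own, which helps when debugging it before wiring it into the agent. A minimal sketch using the MyImageGen class from the listing above; the JSON string imitates the arguments the LLM would generate:

```python
tool = MyImageGen()
print(tool.call('{"prompt": "a cute dog"}'))
# Expected output, roughly:
# {"image_url": "https://image.pollinations.ai/prompt/a%20cute%20dog"}
```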
Case 2: a multi-agent group chat
- As in Case 1, create a new file in the same folder and paste in the following code:
"""A group chat gradio demo"""
import json
import gradio as gr
import json5
from qwen_agent.agents import GroupChat, GroupChatCreator
from qwen_agent.agents.user_agent import PENDING_USER_INPUT
from qwen_agent.llm.schema import ContentItem, Message
def init_agent_service(cfgs):
llm_cfg = {'model': 'qwen-max'}
bot = GroupChat(agents=cfgs, llm=llm_cfg)
return bot
def init_agent_service_create():
llm_cfg = {'model': 'qwen-max'}
bot = GroupChatCreator(llm=llm_cfg)
return bot
# =========================================================
# Below is the gradio service: front-end and back-end logic
# =========================================================
app_global_para = {
'messages': [],
'messages_create': [],
'is_first_upload': False,
'uploaded_file': '',
'user_interrupt': True
}
# Initialized group chat configuration
CFGS = {
'background':
'一个陌生人互帮互助群聊',
'agents': [
{
'name': '小塘',
'description': '一个勤劳的打工人,每天沉迷工作,日渐消瘦。(这是一个真实用户)',
'is_human': True # mark this as a real person
},
{
'name': '甄嬛',
'description': '一位后宫妃嫔',
'instructions': '你是甄嬛,你正在想办法除掉皇后,你说话风格为文言文,每次说完话会调image_gen工具画一幅图,展示心情。',
'knowledge_files': [],
'selected_tools': ['image_gen']
},
{
'name': 'ikun',
'description': '熟悉蔡徐坤的动态',
'instructions': '你是蔡徐坤的粉丝,说话很简短,喜欢用颜文字表达心情,你最近迷恋看《甄嬛传》',
'knowledge_files': [],
'selected_tools': []
},
{
'name': '大头',
'description': '是一个体育生,不喜欢追星',
'instructions': '你是一个体育生,热爱运动,你不喜欢追星,你喜欢安利别人健身',
'knowledge_files': [],
'selected_tools': []
}
]
}
MAX_ROUND = 3
def app(cfgs):
# Todo: Reinstance every time or instance one time as global variable?
cfgs = json5.loads(cfgs)
bot = init_agent_service(cfgs=cfgs)
# Record all mentioned agents: reply in order
mentioned_agents_name = []
for i in range(MAX_ROUND):
messages = app_global_para['messages']
print(i, messages)
# Interrupt: there is new input from user
if i == 0:
app_global_para['user_interrupt'] = False
if i > 0 and app_global_para['user_interrupt']:
app_global_para['user_interrupt'] = False
print('GroupChat is interrupted by user input!')
# Due to the concurrency issue with Gradio, unable to call the second service simultaneously
for rsp in app(json.dumps(cfgs, ensure_ascii=False)):
yield rsp
break
# Record mentions into mentioned_agents_name list
content = ''
if messages:
if isinstance(messages[-1].content, list):
content = '\n'.join([x.text if x.text else '' for x in messages[-1].content]).strip()
else:
content = messages[-1].content.strip()
if '@' in content:
for x in content.split('@'):
for agent in cfgs['agents']:
if x.startswith(agent['name']):
if agent['name'] not in mentioned_agents_name:
mentioned_agents_name.append(agent['name'])
break
# Get one response from groupchat
response = []
try:
display_history = _get_display_history_from_message()
yield display_history
for response in bot.run(messages, need_batch_response=False, mentioned_agents_name=mentioned_agents_name):
if response:
if response[-1].content == PENDING_USER_INPUT:
# Stop printing the special message for mention human
break
incremental_history = []
for x in response:
function_display = ''
if x.function_call:
function_display = f'\nCall Function: {str(x.function_call)}'
incremental_history += [[None, f'{x.name}: {x.content}{function_display}']]
display_history = _get_display_history_from_message()
yield display_history + incremental_history
except Exception as ex:
raise ValueError(ex)
if not response:
# The topic ends
print('No one wants to talk anymore!')
break
if mentioned_agents_name:
assert response[-1].name == mentioned_agents_name[0]
mentioned_agents_name.pop(0)
if response and response[-1].content == PENDING_USER_INPUT:
# Terminate group chat and wait for user input
print('Waiting for user input!')
break
# Record the response to messages
app_global_para['messages'].extend(response)
def test():
app(cfgs=CFGS)
def app_create(history, now_cfgs):
now_cfgs = json5.loads(now_cfgs)
if not history:
yield history, json.dumps(now_cfgs, indent=4, ensure_ascii=False)
else:
if len(history) == 1:
new_cfgs = {'background': '', 'agents': []}
# The first time to create grouchat
exist_cfgs = now_cfgs['agents']
for cfg in exist_cfgs:
if 'is_human' in cfg and cfg['is_human']:
new_cfgs['agents'].append(cfg)
else:
new_cfgs = now_cfgs
app_global_para['messages_create'].append(Message('user', history[-1][0]))
response = []
try:
agent = init_agent_service_create()
for response in agent.run(messages=app_global_para['messages_create']):
display_content = ''
for rsp in response:
if rsp.name == 'role_config':
cfg = json5.loads(rsp.content)
old_pos = -1
for i, x in enumerate(new_cfgs['agents']):
if x['name'] == cfg['name']:
old_pos = i
break
if old_pos > -1:
new_cfgs['agents'][old_pos] = cfg
else:
new_cfgs['agents'].append(cfg)
display_content += f'\n\n{cfg["name"]}: {cfg["description"]}\n{cfg["instructions"]}'
elif rsp.name == 'background':
new_cfgs['background'] = rsp.content
display_content += f'\n群聊背景:{rsp.content}'
else:
display_content += f'\n{rsp.content}'
history[-1][1] = display_content.strip()
yield history, json.dumps(new_cfgs, indent=4, ensure_ascii=False)
except Exception as ex:
raise ValueError(ex)
app_global_para['messages_create'].extend(response)
def _get_display_history_from_message():
# Get display history from messages
display_history = []
for msg in app_global_para['messages']:
if isinstance(msg.content, list):
content = '\n'.join([x.text if x.text else '' for x in msg.content]).strip()
else:
content = msg.content.strip()
function_display = ''
if msg.function_call:
function_display = f'\nCall Function: {str(msg.function_call)}'
content = f'{msg.name}: {content}{function_display}'
display_history.append((content, None) if msg.name == 'user' else (None, content))
return display_history
def get_name_of_current_user(cfgs):
for agent in cfgs['agents']:
if 'is_human' in agent and agent['is_human']:
return agent['name']
return 'user'
def add_text(text, cfgs):
app_global_para['user_interrupt'] = True
content = [ContentItem(text=text)]
if app_global_para['uploaded_file'] and app_global_para['is_first_upload']:
app_global_para['is_first_upload'] = False # only send file when first upload
content.append(ContentItem(file=app_global_para['uploaded_file']))
app_global_para['messages'].append(
Message('user', content=content, name=get_name_of_current_user(json5.loads(cfgs))))
return _get_display_history_from_message(), None
def chat_clear():
app_global_para['messages'] = []
return None
def chat_clear_create():
app_global_para['messages_create'] = []
return None, None
def add_file(file):
app_global_para['uploaded_file'] = file.name
app_global_para['is_first_upload'] = True
return file.name
def add_text_create(history, text):
history = history + [(text, None)]
return history, gr.update(value='', interactive=False)
with gr.Blocks(theme='soft') as demo:
display_config = gr.Textbox(
label= # noqa
'Current GroupChat: (If editing, please maintain this JSON format)',
value=json.dumps(CFGS, indent=4, ensure_ascii=False),
interactive=True)
with gr.Tab('Chat', elem_id='chat-tab'):
with gr.Column():
chatbot = gr.Chatbot(
[],
elem_id='chatbot',
height=750,
show_copy_button=True,
)
with gr.Row():
with gr.Column(scale=3, min_width=0):
auto_speak_button = gr.Button('Randomly select an agent to speak first')
auto_speak_button.click(app, display_config, chatbot)
with gr.Column(scale=10):
chat_txt = gr.Textbox(
show_label=False,
placeholder='Chat with Qwen...',
container=False,
)
with gr.Column(scale=1, min_width=0):
chat_clr_bt = gr.Button('Clear')
chat_txt.submit(add_text, [chat_txt, display_config], [chatbot, chat_txt],
queue=False).then(app, display_config, chatbot)
chat_clr_bt.click(chat_clear, None, [chatbot], queue=False)
demo.load(chat_clear, None, [chatbot], queue=False)
with gr.Tab('Create', elem_id='chat-tab'):
with gr.Column(scale=9, min_width=0):
chatbot = gr.Chatbot(
[],
elem_id='chatbot0',
height=750,
show_copy_button=True,
)
with gr.Row():
with gr.Column(scale=13):
chat_txt = gr.Textbox(
show_label=False,
placeholder='Chat with Qwen...',
container=False,
)
with gr.Column(scale=1, min_width=0):
chat_clr_bt = gr.Button('Clear')
txt_msg = chat_txt.submit(add_text_create, [chatbot, chat_txt], [chatbot, chat_txt],
queue=False).then(app_create, [chatbot, display_config],
[chatbot, display_config])
txt_msg.then(lambda: gr.update(interactive=True), None, [chat_txt], queue=False)
chat_clr_bt.click(chat_clear_create, None, [chatbot, chat_txt], queue=False)
demo.load(chat_clear_create, None, [chatbot, chat_txt], queue=False)
if __name__ == '__main__':
demo.queue().launch()
- Locate the file qwen_dashscope.py via qwen_agent -> llm -> qwen_dashscope.py; open it and enter your API key at line 23 of the code.
- From the project root, run python plus the file name (e.g. python group_chat.py), then click the printed address to open the web page.
- Start chatting; 小塘 is the human user.
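The same group chat can also be driven without the Gradio front end, which is handy for scripted experiments. A minimal, untested sketch reusing the CFGS dict and classes from the listing above; leaving bot.run's other parameters at their defaults is an assumption:

```python
from qwen_agent.agents import GroupChat
from qwen_agent.llm.schema import Message

bot = GroupChat(agents=CFGS, llm={'model': 'qwen-max'})
# Speak as the human member defined in CFGS.
messages = [Message('user', '大家好,请多关照', name='小塘')]
responses = []
for responses in bot.run(messages):  # streams intermediate responses
    pass
for msg in responses:
    print(f'{msg.name}: {msg.content}')
```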
Case 3: long-form writing based on browsed web pages and PDFs (the next three examples come from browser_qwen_cn.md in the QwenLM/Qwen-Agent repository on GitHub)
- First open the file run_server.py, find line 27 of the code, and enter your own API key.
- With the deployment above, the browser extension is already in place; next, run in the terminal: python run_server.py --llm qwen-max --model_server dashscope --workstation_port 7864
- Open Chrome, browse to any page, click "Add to Qwen's Reading List" in the top-right corner, and confirm.
- Click the Qwen extension in the top-right corner and enter your question.
Case 4: extract browsed content and plot it with the code interpreter
- Return to the terminal and open the address on port 7864.
- At the bottom, select Code Interpreter; you can then either upload a file or enter the data directly as text.
Case 5: upload a file and analyze the data with the code interpreter over multiple dialogue turns
- Open the file