Qwen-Agent Local Deployment
- Project overview: Qwen-Agent is a framework for developing LLM applications on top of Qwen's instruction-following, tool-use, planning, and memory capabilities. It also ships with several example applications, such as a browser assistant, a code interpreter, and custom assistants. Project page: https://github.com/QwenLM/Qwen-Agent (agent framework and applications built upon Qwen, featuring Function Calling, Code Interpreter, RAG, and a Chrome extension).
- Prerequisites (local machine): Anaconda, CUDA 12.2, Git, and a network proxy for reaching GitHub/PyPI (or download the required files manually and place them in the appropriate directories).
- Local deployment:
  - Create a virtual environment:
    - Open the Anaconda Prompt.
    - conda create -n qwen-agent python=3.10 -y  # create the virtual environment
    - conda activate qwen-agent  # activate the virtual environment
  - Install PyTorch: pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121  # command taken from the official site; note that torchvision 0.17.0 and torchaudio 2.2.0 pair with torch 2.2.0, not 2.0. If your CUDA version differs, get the matching command from the "Previous PyTorch Versions" page at pytorch.org.
  - Install flash-attention (this step is slow; allow at least two hours):
    - Open Git Bash in your projects directory.
    - git clone https://github.com/Dao-AILab/flash-attention.git  # clone flash-attention locally
    - Back in the Anaconda Prompt, cd flash-attention to enter the project.
    - pip uninstall -y ninja  # skip this if ninja was never installed
    - pip install ninja  # install ninja
    - pip install -U "gradio>=4.0" "modelscope-studio>=0.2.1"  # pin gradio to the officially recommended version, otherwise unexpected errors may occur
    - pip install .  # build and install flash-attention (a quick environment check follows this list)
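Before continuing, it is worth confirming that PyTorch can see the GPU and that flash-attention built correctly. A minimal sanity check, assuming the installs above succeeded:

```python
# Run inside the qwen-agent environment.
import torch
import flash_attn  # raises ImportError if the flash-attention build failed

print(torch.__version__)          # expect 2.2.0 (or whichever version you installed)
print(torch.cuda.is_available())  # expect True on a working CUDA setup
```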
- Deploy the Qwen model service
  - Install dependencies:
    - In the same Git Bash window, run git clone https://github.com/QwenLM/Qwen.git to clone the project.
    - Back in the Anaconda Prompt, cd Qwen to enter the project.
    - pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple  # install dependencies
    - pip install fastapi uvicorn openai "pydantic>=2.3.0" sse_starlette
  - Start the model service, using the -c flag to choose the model version:
    - Passing --server-name 0.0.0.0 lets other machines access your model service.
    - Passing --server-name 127.0.0.1 restricts access to the machine hosting the model service.
    - Example: python openai_api.py --server-name 0.0.0.0 --server-port 7905 -c qwen/Qwen-7B-Chat
    - The -c flag currently supports the following models, ordered from smallest to largest GPU memory footprint (a client-side smoke test follows this list):
      qwen/Qwen-7B-Chat-Int4
      qwen/Qwen-7B-Chat
      qwen/Qwen-14B-Chat-Int4
      qwen/Qwen-14B-Chat
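Because openai_api.py serves an OpenAI-compatible API, the service can be smoke-tested with a plain HTTP request once it is running. A sketch, assuming the example launch above (port 7905); the model name in the payload is illustrative and should match whatever you passed via -c:

```python
import requests

# POST to the OpenAI-compatible chat endpoint served by openai_api.py.
resp = requests.post(
    'http://127.0.0.1:7905/v1/chat/completions',
    json={
        'model': 'Qwen-7B-Chat',  # illustrative; match the -c model you launched
        'messages': [{'role': 'user', 'content': '你好'}],
    },
)
print(resp.json()['choices'][0]['message']['content'])
```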
- Deploy Qwen-Agent
  - Install dependencies:
    - git clone https://github.com/QwenLM/Qwen-Agent.git
    - cd Qwen-Agent
    - pip install -r requirements.txt
  - Start the database service, pointing --model_server at the model service you deployed in the step above:
    - If the machine from the previous step has IP 123.45.67.89, pass --model_server http://123.45.67.89:7905/v1.
    - If both steps run on the same machine, pass --model_server http://127.0.0.1:7905/v1.
    - python run_server.py --model_server http://127.0.0.1:7905/v1 --workstation_port 7864  (the model service from the previous step must still be running; see the reachability check below)
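Optionally, you can confirm from Python that both services are reachable before opening the browser. A sketch, assuming the default ports above and that openai_api.py exposes the standard OpenAI /v1/models route:

```python
import requests

# Model service (OpenAI-compatible) and the workstation front end.
for url in ('http://127.0.0.1:7905/v1/models', 'http://127.0.0.1:7864/'):
    print(url, requests.get(url, timeout=5).status_code)  # expect 200 for both
```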
- Access Qwen-Agent in the browser
  Open http://127.0.0.1:7864/ to use the Workstation's writing mode (Editor) and dialogue mode (Chat).
- Install the browser assistant
  Install the BrowserQwen Chrome plugin (i.e., Chrome extension):
  - Open Chrome and go to the extension management page.
  - Make sure Developer mode (top right) is enabled, then click "Load unpacked", select the browser_qwen directory in this project, and enable the extension.
  - Click the extensions icon in the top-right corner of Chrome and pin BrowserQwen to the toolbar.
  Note: after installing the extension, refresh the page for it to take effect.
  When you want Qwen to read the content of the current web page:
  - Open the page and click the "Add to Qwen's Reading List" button to authorize Qwen to analyze the page in the background.
  - Then click the Qwen icon in the browser's extensions bar to discuss the current page with Qwen.
Reproducing the examples
(The source code can be found on GitHub, or via the CSDN article 【通义千问—Qwen-Agent系列1】Qwen-Agent 快速开始&使用和开发过程.)
Preparation:
- Open the Anaconda Prompt.
- conda activate qwen-agent  # enter the virtual environment created above
- D:
- cd D:\liu\Project\qwen\Qwen-Agent  # enter your local Qwen-Agent folder
- Open the Alibaba Cloud website, log in, search for the DashScope model service, and follow the prompts to enable it.
- In the left-hand menu, under the management center, open API-KEY management.
- Create a new API key and save it immediately (it cannot be viewed again once the page is closed; if you lose it, you can only create a new one). A snippet for loading the key follows this list.
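Rather than hard-coding the key in each script, you can expose it via the DASHSCOPE_API_KEY environment variable, which the examples below fall back to when api_key is not set in the config. On Windows you can also run set DASHSCOPE_API_KEY=sk-xxxx in the Anaconda Prompt before launching Python; a sketch for setting it inside the current Python process:

```python
import os

# Illustrative placeholder; paste the key you just created.
os.environ['DASHSCOPE_API_KEY'] = 'sk-xxxx'
```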
Case 1: create an agent that can read PDF files and use tools, and build a custom tool along the way. In detail:
- Add a custom tool: an image generation tool.
- Configure the LLM to be used.
- Create the agent; here we use the "Assistant" agent as an example, which can use tools and read files.
- Run the assistant as a chatbot.
- Create a new file, e.g. 1.py, and paste the following code into it:
import pprint
import urllib.parse

import json5

from qwen_agent.agents import Assistant
from qwen_agent.tools.base import BaseTool, register_tool


# Step 1 (Optional): Add a custom tool named `my_image_gen`.
@register_tool('my_image_gen')
class MyImageGen(BaseTool):
    # The `description` tells the agent the functionality of this tool.
    description = 'AI painting (image generation) service, input text description, and return the image URL drawn based on text information.'
    # The `parameters` tell the agent what input parameters the tool has.
    parameters = [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Detailed description of the desired image content, in English',
        'required': True
    }]

    def call(self, params: str, **kwargs) -> str:
        # `params` are the arguments generated by the LLM agent.
        prompt = json5.loads(params)['prompt']
        # URL-encode the prompt.
        prompt = urllib.parse.quote(prompt)
        return json5.dumps(
            {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},
            ensure_ascii=False)


# Step 2: Configure the LLM you are using.
# This is where the model is configured: fill in the model name and the
# model_server hosting the model; alternatively, an api_key can be used.
llm_cfg = {
    # Use the model service provided by DashScope:
    # model: the model name
    # model_server: the service hosting the model
    # api_key: the API key to use; set it explicitly here or read it from an environment variable
    'model': 'qwen-max',
    'model_server': 'dashscope',
    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY` environment variable if 'api_key' is not set here.

    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:
    # 'model': 'Qwen1.5-7B-Chat',
    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base
    # 'api_key': 'EMPTY',

    # (Optional) LLM hyperparameters for generation:
    'generate_cfg': {
        'top_p': 0.8
    }
}

# Step 3: Create an agent. Here we use the `Assistant` agent as an example,
# which is capable of using tools and reading files.
# The agent's system prompt:
system_instruction = '''You are a helpful assistant.
After receiving the user's request, you should:
- first draw an image and obtain the image url,
- then run code `request.get(image_url)` to download the image,
- and finally select an image operation from the given document to process the image.
Please show the image using `plt.show()`.'''
# The tool list the Assistant may access: one custom tool plus the code interpreter.
tools = ['my_image_gen', 'code_interpreter']  # `code_interpreter` is a built-in tool for executing code.
# File paths the assistant may read.
files = ['./examples/resource/doc.pdf']  # Give the bot a PDF file to read.
# Initialize the Assistant.
bot = Assistant(llm=llm_cfg,
                system_message=system_instruction,
                function_list=tools,
                files=files)

# Step 4: Run the agent as a chatbot.
messages = []  # This stores the chat history.
while True:
    # For example, enter the query "draw a dog and rotate it 90 degrees".
    query = input('user query: ')
    # Append the user query to the chat history.
    messages.append({'role': 'user', 'content': query})
    response = []
    for response in bot.run(messages=messages):
        # Streaming output.
        print('bot response:')
        pprint.pprint(response, indent=2)
    # Append the bot responses to the chat history.
    messages.extend(response)
- In Step 2 of the code, uncomment the api_key line (delete the leading #) and enter your own API key.
- In the Anaconda Prompt, run python plus the file name, e.g. python 1.py.
- Enter the sample query: draw a dog and rotate it 90 degrees.
- 小狗图片
- 旋转90度
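The custom tool can also be exercised on its own, which helps when debugging it before wiring it into the agent. A minimal sketch using the MyImageGen class from the listing above; the JSON string imitates the arguments the LLM would generate:

```python
tool = MyImageGen()
print(tool.call('{"prompt": "a cute dog"}'))
# Expected output, roughly:
# {"image_url": "https://image.pollinations.ai/prompt/a%20cute%20dog"}
```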
Case 2: a multi-agent group chat
- As in Case 1, create a new file in the same folder and paste in the following code:
"""A group chat gradio demo"""
import json
import gradio as gr
import json5
from qwen_agent.agents import GroupChat, GroupChatCreator
from qwen_agent.agents.user_agent import PENDING_USER_INPUT
from qwen_agent.llm.schema import ContentItem, Message
def init_agent_service(cfgs):
llm_cfg = {'model': 'qwen-max'}
bot = GroupChat(agents=cfgs, llm=llm_cfg)
return bot
def init_agent_service_create():
llm_cfg = {'model': 'qwen-max'}
bot = GroupChatCreator(llm=llm_cfg)
return bot
# =========================================================
# Below is the gradio service: front-end and back-end logic
# =========================================================
app_global_para = {
'messages': [],
'messages_create': [],
'is_first_upload': False,
'uploaded_file': '',
'user_interrupt': True
}
# Initialized group chat configuration
CFGS = {
'background':
'一个陌生人互帮互助群聊',
'agents': [
{
'name': '小塘',
'description': '一个勤劳的打工人,每天沉迷工作,日渐消瘦。(这是一个真实用户)',
'is_human': True # mark this as a real person
},
{
'name': '甄嬛',
'description': '一位后宫妃嫔',
'instructions': '你是甄嬛,你正在想办法除掉皇后,你说话风格为文言文,每次说完话会调image_gen工具画一幅图,展示心情。',
'knowledge_files': [],
'selected_tools': ['image_gen']
},
{
'name': 'ikun',
'description': '熟悉蔡徐坤的动态',
'instructions': '你是蔡徐坤的粉丝,说话很简短,喜欢用颜文字表达心情,你最近迷恋看《甄嬛传》',
'knowledge_files': [],
'selected_tools': []
},
{
'name': '大头',
'description': '是一个体育生,不喜欢追星',
'instructions': '你是一个体育生,热爱运动,你不喜欢追星,你喜欢安利别人健身',
'knowledge_files': [],
'selected_tools': []
}
]
}
MAX_ROUND = 3
def app(cfgs):
# Todo: Reinstance every time or instance one time as global variable?
cfgs = json5.loads(cfgs)
bot = init_agent_service(cfgs=cfgs)
# Record all mentioned agents: reply in order
mentioned_agents_name = []
for i in range(MAX_ROUND):
messages = app_global_para['messages']
print(i, messages)
# Interrupt: there is new input from user
if i == 0:
app_global_para['user_interrupt'] = False
if i > 0 and app_global_para['user_interrupt']:
app_global_para['user_interrupt'] = False
print('GroupChat is interrupted by user input!')
# Due to the concurrency issue with Gradio, unable to call the second service simultaneously
for rsp in app(json.dumps(cfgs, ensure_ascii=False)):
yield rsp
break
# Record mentions into mentioned_agents_name list
content = ''
if messages:
if isinstance(messages[-1].content, list):
content = '\n'.join([x.text if x.text else '' for x in messages[-1].content]).strip()
else:
content = messages[-1].content.strip()
if '@' in content:
for x in content.split('@'):
for agent in cfgs['agents']:
if x.startswith(agent['name']):
if agent['name'] not in mentioned_agents_name:
mentioned_agents_name.append(agent['name'])
break
# Get one response from groupchat
response = []
try:
display_history = _get_display_history_from_message()
yield display_history
for response in bot.run(messages, need_batch_response=False, mentioned_agents_name=mentioned_agents_name):
if response:
if response[-1].content == PENDING_USER_INPUT:
# Stop printing the special message for mention human
break
incremental_history = []
for x in response:
function_display = ''
if x.function_call:
function_display = f'\nCall Function: {str(x.function_call)}'
incremental_history += [[None, f'{x.name}: {x.content}{function_display}']]
display_history = _get_display_history_from_message()
yield display_history + incremental_history
except Exception as ex:
raise ValueError(ex)
if not response:
# The topic ends
print('No one wants to talk anymore!')
break
if mentioned_agents_name:
assert response[-1].name == mentioned_agents_name[0]
mentioned_agents_name.pop(0)
if response and response[-1].content == PENDING_USER_INPUT:
# Terminate group chat and wait for user input
print('Waiting for user input!')
break
# Record the response to messages
app_global_para['messages'].extend(response)
def test():
app(cfgs=CFGS)
def app_create(history, now_cfgs):
now_cfgs = json5.loads(now_cfgs)
if not history:
yield history, json.dumps(now_cfgs, indent=4, ensure_ascii=False)
else:
if len(history) == 1:
new_cfgs = {'background': '', 'agents': []}
# The first time to create grouchat
exist_cfgs = now_cfgs['agents']
for cfg in exist_cfgs:
if 'is_human' in cfg and cfg['is_human']:
new_cfgs['agents'].append(cfg)
else:
new_cfgs = now_cfgs
app_global_para['messages_create'].append(Message('user', history[-1][0]))
response = []
try:
agent = init_agent_service_create()
for response in agent.run(messages=app_global_para['messages_create']):
display_content = ''
for rsp in response:
if rsp.name == 'role_config':
cfg = json5.loads(rsp.content)
old_pos = -1
for i, x in enumerate(new_cfgs['agents']):
if x['name'] == cfg['name']:
old_pos = i
break
if old_pos > -1:
new_cfgs['agents'][old_pos] = cfg
else:
new_cfgs['agents'].append(cfg)
display_content += f'\n\n{cfg["name"]}: {cfg["description"]}\n{cfg["instructions"]}'
elif rsp.name == 'background':
new_cfgs['background'] = rsp.content
display_content += f'\n群聊背景:{rsp.content}'
else:
display_content += f'\n{rsp.content}'
history[-1][1] = display_content.strip()
yield history, json.dumps(new_cfgs, indent=4, ensure_ascii=False)
except Exception as ex:
raise ValueError(ex)
app_global_para['messages_create'].extend(response)
def _get_display_history_from_message():
# Get display history from messages
display_history = []
for msg in app_global_para['messages']:
if isinstance(msg.content, list):
content = '\n'.join([x.text if x.text else '' for x in msg.content]).strip()
else:
content = msg.content.strip()
function_display = ''
if msg.function_call:
function_display = f'\nCall Function: {str(msg.function_call)}'
content = f'{msg.name}: {content}{function_display}'
display_history.append((content, None) if msg.name == 'user' else (None, content))
return display_history
def get_name_of_current_user(cfgs):
for agent in cfgs['agents']:
if 'is_human' in agent and agent['is_human']:
return agent['name']
return 'user'
def add_text(text, cfgs):
app_global_para['user_interrupt'] = True
content = [ContentItem(text=text)]
if app_global_para['uploaded_file'] and app_global_para['is_first_upload']:
app_global_para['is_first_upload'] = False # only send file when first upload
content.append(ContentItem(file=app_global_para['uploaded_file']))
app_global_para['messages'].append(
Message('user', content=content, name=get_name_of_current_user(json5.loads(cfgs))))
return _get_display_history_from_message(), None
def chat_clear():
app_global_para['messages'] = []
return None
def chat_clear_create():
app_global_para['messages_create'] = []
return None, None
def add_file(file):
app_global_para['uploaded_file'] = file.name
app_global_para['is_first_upload'] = True
return file.name
def add_text_create(history, text):
history = history + [(text, None)]
return history, gr.update(value='', interactive=False)
with gr.Blocks(theme='soft') as demo:
display_config = gr.Textbox(
label= # noqa
'Current GroupChat: (If editing, please maintain this JSON format)',
value=json.dumps(CFGS, indent=4, ensure_ascii=False),
interactive=True)
with gr.Tab('Chat', elem_id='chat-tab'):
with gr.Column():
chatbot = gr.Chatbot(
[],
elem_id='chatbot',
height=750,
show_copy_button=True,
)
with gr.Row():
with gr.Column(scale=3, min_width=0):
auto_speak_button = gr.Button('Randomly select an agent to speak first')
auto_speak_button.click(app, display_config, chatbot)
with gr.Column(scale=10):
chat_txt = gr.Textbox(
show_label=False,
placeholder='Chat with Qwen...',
container=False,
)
with gr.Column(scale=1, min_width=0):
chat_clr_bt = gr.Button('Clear')
chat_txt.submit(add_text, [chat_txt, display_config], [chatbot, chat_txt],
queue=False).then(app, display_config, chatbot)
chat_clr_bt.click(chat_clear, None, [chatbot], queue=False)
demo.load(chat_clear, None, [chatbot], queue=False)
with gr.Tab('Create', elem_id='chat-tab'):
with gr.Column(scale=9, min_width=0):
chatbot = gr.Chatbot(
[],
elem_id='chatbot0',
height=750,
show_copy_button=True,
)
with gr.Row():
with gr.Column(scale=13):
chat_txt = gr.Textbox(
show_label=False,
placeholder='Chat with Qwen...',
container=False,
)
with gr.Column(scale=1, min_width=0):
chat_clr_bt = gr.Button('Clear')
txt_msg = chat_txt.submit(add_text_create, [chatbot, chat_txt], [chatbot, chat_txt],
queue=False).then(app_create, [chatbot, display_config],
[chatbot, display_config])
txt_msg.then(lambda: gr.update(interactive=True), None, [chat_txt], queue=False)
chat_clr_bt.click(chat_clear_create, None, [chatbot, chat_txt], queue=False)
demo.load(chat_clear_create, None, [chatbot, chat_txt], queue=False)
if __name__ == '__main__':
demo.queue().launch()
- Locate the file qwen_dashscope.py via qwen_agent -> llm -> qwen_dashscope.py; open it and enter your API key at line 23 of the code.
- From the project root, run python plus the file name (e.g. python group_chat.py), then click the printed address to open the web page.
- Start chatting; 小塘 is the human user.
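The same group chat can also be driven without the Gradio front end, which is handy for scripted experiments. A minimal, untested sketch reusing the CFGS dict and classes from the listing above; leaving bot.run's other parameters at their defaults is an assumption:

```python
from qwen_agent.agents import GroupChat
from qwen_agent.llm.schema import Message

bot = GroupChat(agents=CFGS, llm={'model': 'qwen-max'})
# Speak as the human member defined in CFGS.
messages = [Message('user', '大家好,请多关照', name='小塘')]
responses = []
for responses in bot.run(messages):  # streams intermediate responses
    pass
for msg in responses:
    print(f'{msg.name}: {msg.content}')
```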
Case 3: long-form writing based on browsed web pages and PDFs (the next three examples come from browser_qwen_cn.md in the QwenLM/Qwen-Agent repository on GitHub)
- First open the file run_server.py, find line 27 of the code, and enter your own API key.
- With the deployment above, the browser extension is already in place; next, run in the terminal: python run_server.py --llm qwen-max --model_server dashscope --workstation_port 7864
- Open Chrome, browse to any page, click "Add to Qwen's Reading List" in the top-right corner, and confirm.
- Click the Qwen extension in the top-right corner and enter your question.
Case 4: extract browsed content and plot it with the code interpreter
- Return to the terminal and open the address on port 7864.
- At the bottom, select Code Interpreter; you can then either upload a file or enter the data directly as text.
Case 5: upload a file and analyze the data with the code interpreter over multiple dialogue turns
- Open the file