Deploying Tongyi Qianwen Qwen-Agent Locally

  1. Create a virtual environment:
     1. Open Anaconda Prompt.
     2. conda create -n qwen-agent python=3.10 -y  # create the virtual environment
     3. conda activate qwen-agent  # activate the virtual environment
  2. Install PyTorch: pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121  # this command comes from the PyTorch site (torch 2.2.0 pairs with torchvision 0.17.0 and torchaudio 2.2.0); if your CUDA version differs, see "Previous PyTorch Versions | PyTorch" for the matching command. A quick sanity check is sketched right after this list.
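A minimal sketch for verifying that the CUDA build of PyTorch was installed correctly (the exact version string depends on the command you used):

import torch

print(torch.__version__)                  # should report a +cu121 build for the command above
print(torch.cuda.is_available())          # True means the GPU and driver are visible to PyTorch
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first GPU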

  3. Install flash-attention (this step is slow, at least two hours):
     1. Open Git Bash in the directory where you want the project.
     2. Run git clone https://github.com/Dao-AILab/flash-attention.git  # clone flash-attention locally
     3. Back in the Anaconda terminal, enter the project with cd flash-attention
     4. pip uninstall -y ninja  # not needed if you have never installed this dependency before
     5. pip install ninja  # install ninja
     6. pip install -U "gradio>=4.0" "modelscope-studio>=0.2.1"  # pin gradio to the officially recommended version, otherwise errors may occur
     7. pip install .  # build and install flash-attention; an import check is sketched right after this list
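After the build finishes, a minimal import check can confirm the package is usable from the qwen-agent environment (a sketch; it only verifies that the module loads):

import flash_attn

print(flash_attn.__version__)  # if this prints without an error, the build was installed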
  4. Deploy the Qwen model service
     1. Install dependencies:
        1. In the Git Bash window opened earlier, run git clone https://github.com/QwenLM/Qwen.git to clone the project.
        2. Back in the Anaconda terminal, enter the project with cd Qwen.
        3. pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple  # install the dependencies
        4. pip install fastapi uvicorn openai "pydantic>=2.3.0" sse_starlette
     2. Start the model service, using the -c argument to select the model version.

Setting --server-name 0.0.0.0 allows other machines to access your model service.

Setting --server-name 127.0.0.1 only allows the machine hosting the model to access the service itself.

Example: python openai_api.py --server-name 0.0.0.0 --server-port 7905 -c qwen/Qwen-7B-Chat

Currently, the -c argument supports the following models, ordered from smallest to largest GPU memory footprint (a hedged client-side test is sketched after the list):

qwen/Qwen-7B-Chat-Int4

qwen/Qwen-7B-Chat

qwen/Qwen-14B-Chat-Int4

qwen/Qwen-14B-Chat
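Once the service is running, it can be exercised from Python with an OpenAI-compatible client. A minimal sketch, assuming the service was started locally on port 7905 as in the example above and a recent openai package (>=1.0) is installed; the api_key value is a dummy, since the local openai_api.py does not validate it:

from openai import OpenAI

# Point the client at the locally deployed OpenAI-compatible endpoint.
client = OpenAI(base_url='http://127.0.0.1:7905/v1', api_key='EMPTY')

resp = client.chat.completions.create(
    model='Qwen-7B-Chat',  # the local server may ignore this field
    messages=[{'role': 'user', 'content': '你好,请简单介绍一下你自己。'}],
)
print(resp.choices[0].message.content)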

  5. Deploy Qwen-Agent
     1. Install dependencies:
        1. git clone https://github.com/QwenLM/Qwen-Agent.git
        2. cd Qwen-Agent
        3. pip install -r requirements.txt

Start the database service, using the --model_server argument to point at the model service you deployed in step 4.

If the machine running the model service has IP 123.45.67.89, set --model_server http://123.45.67.89:7905/v1.

If the model service and Qwen-Agent run on the same machine, set --model_server http://127.0.0.1:7905/v1.

python run_server.py --model_server http://127.0.0.1:7905/v1 --workstation_port 7864  (the model service from step 4 must also be running; a hedged reachability check follows below)
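Before starting run_server.py it can be worth confirming that the model service from step 4 is reachable. A minimal sketch, assuming the service exposes the standard OpenAI-style /v1/models listing:

import requests

# Adjust the host and port to match your openai_api.py deployment.
r = requests.get('http://127.0.0.1:7905/v1/models', timeout=5)
print(r.status_code)  # 200 means the model service is up
print(r.json())       # the models served by openai_api.py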

  6. Access Qwen-Agent in the browser

Open http://127.0.0.1:7864/ to use the Workstation's creation mode (Editor mode) and conversation mode (Chat mode).

  7. Install the browser assistant

Install the BrowserQwen Chrome plugin (also known as a Chrome extension):

Open Chrome and go to the extension management page.

Make sure Developer mode (top right) is switched on, then click "Load unpacked", upload the browser_qwen directory from this project, and enable it.

Click the extensions icon at the top right of Chrome and pin BrowserQwen to the toolbar.

[Note]: after installing the Chrome plugin you need to refresh the page before the plugin takes effect.

When you want Qwen to read the content of the current web page:

Open the page and click the "Add to Qwen's Reading List" button on the screen to authorize Qwen to analyze the page in the background.

Then click the Qwen icon in the extension bar at the top right of the browser to discuss the content of the current page with Qwen.

Reproducing the examples

(The source code is available on GitHub, or via the CSDN post 【通义千问—Qwen-Agent系列1】Qwen-Agent 快速开始&使用和开发过程.)

Preparation:

  1. Open Anaconda Prompt.
  2. conda activate qwen-agent       # activate the virtual environment created above
  3. D:
  4. cd D:\liu\Project\qwen\Qwen-Agent      # change into your local Qwen-Agent folder

  1. Open the Alibaba Cloud website, log in, search for the DashScope (灵积) model service, and follow the prompts to activate the required service; the DashScope console then opens.

  1. In the left-hand menu, under the management center, open API-KEY management.
  2. Create a new API key and save it right away (it cannot be viewed again once the page is closed; if you lose it you must create a new one, so keep a local copy). A sketch of storing it as an environment variable follows below.
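To avoid pasting the key into every script, it can be stored in the DASHSCOPE_API_KEY environment variable, which qwen-agent falls back to when api_key is not set in the config (see the comment in the Case 1 code below). A minimal Python sketch; the key value shown is a placeholder, and this only affects the current process (on Windows you can instead add it to the system environment variables for a persistent setting):

import os

# Placeholder -- replace with the key created in the DashScope console.
os.environ['DASHSCOPE_API_KEY'] = 'sk-xxxxxxxxxxxxxxxx'

# Any qwen-agent config created afterwards in this process can omit 'api_key';
# the key will then be read from the environment instead.
print('DASHSCOPE_API_KEY set:', 'DASHSCOPE_API_KEY' in os.environ)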

Case 1: creating an agent that can read PDF files and use tools, plus building a custom tool. The steps in detail:

  1. Add a custom tool: an image generation tool.
  2. Configure the LLM to be used.
  3. Create the agent; here we use the "Assistant" agent as an example, which can use tools and read files.
  4. Run the assistant as a chatbot.
  1. Create a new file, e.g. 1.py, and paste the following code into it:

import pprint

import urllib.parse

import json5

from qwen_agent.agents import Assistant

from qwen_agent.tools.base import BaseTool, register_tool

# Step 1 (Optional): Add a custom tool named `my_image_gen`.

@register_tool('my_image_gen')

class MyImageGen(BaseTool):

    # The `description` tells the agent the functionality of this tool.

    description = 'AI painting (image generation) service, input text description, and return the image URL drawn based on text information.'

    # The `parameters` tell the agent what input parameters the tool has.

    parameters = [{

        'name': 'prompt',

        'type': 'string',

        'description': 'Detailed description of the desired image content, in English',

        'required': True

    }]

    def call(self, params: str, **kwargs) -> str:

        # `params` are the arguments generated by the LLM agent.

        prompt = json5.loads(params)['prompt']

        # URL-encode the prompt

        prompt = urllib.parse.quote(prompt)

        #

        return json5.dumps(

            {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},

            ensure_ascii=False)

# Step 2: Configure the LLM you are using.

# This is where the model is configured: set the model name and model_server (the server hosting the model); if you do not have a local server, you can use an api_key instead.

llm_cfg = {

    # Use the model service provided by DashScope:

        # model: the model name

        # model_server: the server hosting the model

        # api_key: the API key to use; it can be set explicitly here or read from an environment variable

    'model': 'qwen-max',

    'model_server': 'dashscope',

    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',

    # It will use the `DASHSCOPE_API_KEY' environment variable if 'api_key' is not set here.

    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:

    # 'model': 'Qwen1.5-7B-Chat',

    # 'model_server': 'http://localhost:8000/v1',  # base_url, also known as api_base

    # 'api_key': 'EMPTY',

    # (Optional) LLM hyperparameters for generation:

    # Optional settings for tuning generation parameters

    'generate_cfg': {

        'top_p': 0.8

    }

}

# Step 3: Create an agent. Here we use the `Assistant` agent as an example, which is capable of using tools and reading files.

# System prompt (instructions) for the agent

system_instruction = '''You are a helpful assistant.

After receiving the user's request, you should:

- first draw an image and obtain the image url,

- then run code `request.get(image_url)` to download the image,

- and finally select an image operation from the given document to process the image.

Please show the image using `plt.show()`.'''

# Tool list: the tools the Assistant may use -- one custom tool plus the built-in code interpreter

tools = ['my_image_gen', 'code_interpreter']  # `code_interpreter` is a built-in tool for executing code.

# Path of the file the assistant can read

files = ['./examples/resource/doc.pdf']  # Give the bot a PDF file to read.

# Initialize the Assistant

bot = Assistant(llm=llm_cfg,

                system_message=system_instruction,

                function_list=tools,

                files=files)

# Step 4: Run the agent as a chatbot.

messages = []  # This stores the chat history.

while True:

    # For example, enter the query "draw a dog and rotate it 90 degrees".

    query = input('user query: ')

    # Append the user query to the chat history.

    messages.append({'role': 'user', 'content': query})

    response = []

    for response in bot.run(messages=messages):

        # Streaming output.

        print('bot response:')

        pprint.pprint(response, indent=2)

    # Append the bot responses to the chat history.

    messages.extend(response)

  1. In Step 2 of the code, uncomment the api_key line (remove the leading #) and enter your own API key, or leave it commented out if the DASHSCOPE_API_KEY environment variable is already set.

  1. In the Anaconda terminal, run the script with python followed by the file name, e.g. python 1.py.
  2. Enter the example query: draw a dog and rotate it 90 degrees

  1. The generated dog image (screenshot)

  1. The image rotated 90 degrees (screenshot)

Case 2: AI group chat

  1. As in Case 1, create a new file in the same folder and paste in the following code:

"""A group chat gradio demo"""

import json

import gradio as gr

import json5

from qwen_agent.agents import GroupChat, GroupChatCreator

from qwen_agent.agents.user_agent import PENDING_USER_INPUT

from qwen_agent.llm.schema import ContentItem, Message

def init_agent_service(cfgs):

    llm_cfg = {'model': 'qwen-max'}

    bot = GroupChat(agents=cfgs, llm=llm_cfg)

    return bot

def init_agent_service_create():

    llm_cfg = {'model': 'qwen-max'}

    bot = GroupChatCreator(llm=llm_cfg)

    return bot

# =========================================================

# Below is the gradio service: front-end and back-end logic

# =========================================================

app_global_para = {

    'messages': [],

    'messages_create': [],

    'is_first_upload': False,

    'uploaded_file': '',

    'user_interrupt': True

}

# Initialized group chat configuration

CFGS = {

    'background':

        '一个陌生人互帮互助群聊',

    'agents': [

        {

            'name': '小塘',

            'description': '一个勤劳的打工人,每天沉迷工作,日渐消瘦。(这是一个真实用户)',

            'is_human': True  # mark this as a real person

        },

        {

            'name': '甄嬛',

            'description': '一位后宫妃嫔',

            'instructions': '你是甄嬛,你正在想办法除掉皇后,你说话风格为文言文,每次说完话会调image_gen工具画一幅图,展示心情。',

            'knowledge_files': [],

            'selected_tools': ['image_gen']

        },

        {

            'name': 'ikun',

            'description': '熟悉蔡徐坤的动态',

            'instructions': '你是蔡徐坤的粉丝,说话很简短,喜欢用颜文字表达心情,你最近迷恋看《甄嬛传》',

            'knowledge_files': [],

            'selected_tools': []

        },

        {

            'name': '大头',

            'description': '是一个体育生,不喜欢追星',

            'instructions': '你是一个体育生,热爱运动,你不喜欢追星,你喜欢安利别人健身',

            'knowledge_files': [],

            'selected_tools': []

        }

    ]

}

MAX_ROUND = 3

def app(cfgs):

    # Todo: Reinstance every time or instance one time as global variable?

    cfgs = json5.loads(cfgs)

    bot = init_agent_service(cfgs=cfgs)

    # Record all mentioned agents: reply in order

    mentioned_agents_name = []

    for i in range(MAX_ROUND):

        messages = app_global_para['messages']

        print(i, messages)

        # Interrupt: there is new input from user

        if i == 0:

            app_global_para['user_interrupt'] = False

        if i > 0 and app_global_para['user_interrupt']:

            app_global_para['user_interrupt'] = False

            print('GroupChat is interrupted by user input!')

            # Due to the concurrency issue with Gradio, unable to call the second service simultaneously

            for rsp in app(json.dumps(cfgs, ensure_ascii=False)):

                yield rsp

            break

        # Record mentions into mentioned_agents_name list

        content = ''

        if messages:

            if isinstance(messages[-1].content, list):

                content = '\n'.join([x.text if x.text else '' for x in messages[-1].content]).strip()

            else:

                content = messages[-1].content.strip()

        if '@' in content:

            for x in content.split('@'):

                for agent in cfgs['agents']:

                    if x.startswith(agent['name']):

                        if agent['name'] not in mentioned_agents_name:

                            mentioned_agents_name.append(agent['name'])

                        break

        # Get one response from groupchat

        response = []

        try:

            display_history = _get_display_history_from_message()

            yield display_history

            for response in bot.run(messages, need_batch_response=False, mentioned_agents_name=mentioned_agents_name):

                if response:

                    if response[-1].content == PENDING_USER_INPUT:

                        # Stop printing the special message for mention human

                        break

                    incremental_history = []

                    for x in response:

                        function_display = ''

                        if x.function_call:

                            function_display = f'\nCall Function: {str(x.function_call)}'

                        incremental_history += [[None, f'{x.name}: {x.content}{function_display}']]

                    display_history = _get_display_history_from_message()

                    yield display_history + incremental_history

        except Exception as ex:

            raise ValueError(ex)

        if not response:

            # The topic ends

            print('No one wants to talk anymore!')

            break

        if mentioned_agents_name:

            assert response[-1].name == mentioned_agents_name[0]

            mentioned_agents_name.pop(0)

        if response and response[-1].content == PENDING_USER_INPUT:

            # Terminate group chat and wait for user input

            print('Waiting for user input!')

            break

        # Record the response to messages

        app_global_para['messages'].extend(response)

def test():

    app(cfgs=CFGS)

def app_create(history, now_cfgs):

    now_cfgs = json5.loads(now_cfgs)

    if not history:

        yield history, json.dumps(now_cfgs, indent=4, ensure_ascii=False)

    else:

        if len(history) == 1:

            new_cfgs = {'background': '', 'agents': []}

            # The first time the group chat is created

            exist_cfgs = now_cfgs['agents']

            for cfg in exist_cfgs:

                if 'is_human' in cfg and cfg['is_human']:

                    new_cfgs['agents'].append(cfg)

        else:

            new_cfgs = now_cfgs

        app_global_para['messages_create'].append(Message('user', history[-1][0]))

        response = []

        try:

            agent = init_agent_service_create()

            for response in agent.run(messages=app_global_para['messages_create']):

                display_content = ''

                for rsp in response:

                    if rsp.name == 'role_config':

                        cfg = json5.loads(rsp.content)

                        old_pos = -1

                        for i, x in enumerate(new_cfgs['agents']):

                            if x['name'] == cfg['name']:

                                old_pos = i

                                break

                        if old_pos > -1:

                            new_cfgs['agents'][old_pos] = cfg

                        else:

                            new_cfgs['agents'].append(cfg)

                        display_content += f'\n\n{cfg["name"]}: {cfg["description"]}\n{cfg["instructions"]}'

                    elif rsp.name == 'background':

                        new_cfgs['background'] = rsp.content

                        display_content += f'\n群聊背景:{rsp.content}'

                    else:

                        display_content += f'\n{rsp.content}'

                history[-1][1] = display_content.strip()

                yield history, json.dumps(new_cfgs, indent=4, ensure_ascii=False)

        except Exception as ex:

            raise ValueError(ex)

        app_global_para['messages_create'].extend(response)

def _get_display_history_from_message():

    # Get display history from messages

    display_history = []

    for msg in app_global_para['messages']:

        if isinstance(msg.content, list):

            content = '\n'.join([x.text if x.text else '' for x in msg.content]).strip()

        else:

            content = msg.content.strip()

        function_display = ''

        if msg.function_call:

            function_display = f'\nCall Function: {str(msg.function_call)}'

        content = f'{msg.name}: {content}{function_display}'

        display_history.append((content, None) if msg.name == 'user' else (None, content))

    return display_history

def get_name_of_current_user(cfgs):

    for agent in cfgs['agents']:

        if 'is_human' in agent and agent['is_human']:

            return agent['name']

    return 'user'

def add_text(text, cfgs):

    app_global_para['user_interrupt'] = True

    content = [ContentItem(text=text)]

    if app_global_para['uploaded_file'] and app_global_para['is_first_upload']:

        app_global_para['is_first_upload'] = False  # only send file when first upload

        content.append(ContentItem(file=app_global_para['uploaded_file']))

    app_global_para['messages'].append(

        Message('user', content=content, name=get_name_of_current_user(json5.loads(cfgs))))

    return _get_display_history_from_message(), None

def chat_clear():

    app_global_para['messages'] = []

    return None

def chat_clear_create():

    app_global_para['messages_create'] = []

    return None, None

def add_file(file):

    app_global_para['uploaded_file'] = file.name

    app_global_para['is_first_upload'] = True

    return file.name

def add_text_create(history, text):

    history = history + [(text, None)]

    return history, gr.update(value='', interactive=False)

with gr.Blocks(theme='soft') as demo:

    display_config = gr.Textbox(

        label=  # noqa

        'Current GroupChat: (If editing, please maintain this JSON format)',

        value=json.dumps(CFGS, indent=4, ensure_ascii=False),

        interactive=True)

    with gr.Tab('Chat', elem_id='chat-tab'):

        with gr.Column():

            chatbot = gr.Chatbot(

                [],

                elem_id='chatbot',

                height=750,

                show_copy_button=True,

            )

            with gr.Row():

                with gr.Column(scale=3, min_width=0):

                    auto_speak_button = gr.Button('Randomly select an agent to speak first')

                    auto_speak_button.click(app, display_config, chatbot)

                with gr.Column(scale=10):

                    chat_txt = gr.Textbox(

                        show_label=False,

                        placeholder='Chat with Qwen...',

                        container=False,

                    )

                with gr.Column(scale=1, min_width=0):

                    chat_clr_bt = gr.Button('Clear')

            chat_txt.submit(add_text, [chat_txt, display_config], [chatbot, chat_txt],

                            queue=False).then(app, display_config, chatbot)

            chat_clr_bt.click(chat_clear, None, [chatbot], queue=False)

        demo.load(chat_clear, None, [chatbot], queue=False)

    with gr.Tab('Create', elem_id='chat-tab'):

        with gr.Column(scale=9, min_width=0):

            chatbot = gr.Chatbot(

                [],

                elem_id='chatbot0',

                height=750,

                show_copy_button=True,

            )

            with gr.Row():

                with gr.Column(scale=13):

                    chat_txt = gr.Textbox(

                        show_label=False,

                        placeholder='Chat with Qwen...',

                        container=False,

                    )

                with gr.Column(scale=1, min_width=0):

                    chat_clr_bt = gr.Button('Clear')

            txt_msg = chat_txt.submit(add_text_create, [chatbot, chat_txt], [chatbot, chat_txt],

                                      queue=False).then(app_create, [chatbot, display_config],

                                                        [chatbot, display_config])

            txt_msg.then(lambda: gr.update(interactive=True), None, [chat_txt], queue=False)

            chat_clr_bt.click(chat_clear_create, None, [chatbot, chat_txt], queue=False)

        demo.load(chat_clear_create, None, [chatbot, chat_txt], queue=False)

if __name__ == '__main__':

    demo.queue().launch()

  1. Locate the file qwen_agent/llm/qwen_dashscope.py (Qwen-Agent -> qwen_agent -> llm -> qwen_dashscope.py); open it and enter your own API key on the line where the key is set (line 23 in this version).

  1. From the project root, run python followed by the file name (e.g. python group_chat.py), then click the printed address to open the web page.

  1. Start chatting; 小塘 is the human user.

Case 3: long-form writing based on browsed web pages and PDFs (the next three examples come from Qwen-Agent/browser_qwen_cn.md at main · QwenLM/Qwen-Agent on github.com).

  1. First open the run_server.py file, find the line where the API key is set (line 27 in this version) and enter your own API key.

  1. With the deployment above, the browser extension is already in place. Next, run the following command in the terminal: python run_server.py --llm qwen-max --model_server dashscope --workstation_port 7864

  1. Open Chrome, open any web page, click "Add to Qwen's Reading List" at the top right, and confirm.

  1. Click the Qwen extension at the top right and enter your question.

Case 4: extracting browsed content and plotting it with the Code Interpreter

  1. Back in the terminal, click the address on port 7864 to open the workstation.

  1. At the bottom, select Code Interpreter; you can then either upload a file or enter the data directly as text.

Case 5: uploading a file and analyzing data with the Code Interpreter over multiple turns

  1. Open the file
