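This article uses a run of examples/assistant_add_custom_tool.py from the qwen-agent project as its entry point. By examining the prompts sent to the backend LLM service during that run, it aims to explain how the agent is implemented.
examples/assistant_add_custom_tool.py is meant to combine a code tool and an image generator to go from text to a picture.
Code walkthrough
Most of an agent's capability comes from the tools it can use.
The tool is created as follows: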
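import json
import urllib.parse

import json5
from qwen_agent.tools.base import BaseTool, register_tool


# Define a new tool. Following the tool registration convention, provide the
# tool's description, its input parameters, and the function that implements it.
@register_tool('my_image_gen')
class MyImageGen(BaseTool):
    description = 'AI painting (image generation) service, input text description, and return the image URL drawn based on text information.'
    parameters = [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Detailed description of the desired image content, in English',
        'required': True,
    }]

    def call(self, params: str, **kwargs) -> str:
        # The arguments arrive as a JSON string generated by the LLM.
        prompt = json5.loads(params)['prompt']
        # Percent-encode the prompt so it can be embedded in the URL path.
        prompt = urllib.parse.quote(prompt)
        return json.dumps(
            {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},
            ensure_ascii=False,
        )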
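To see what the tool returns without installing qwen-agent, the body of `call` can be tried as a plain function. This is only a sketch: `image_gen_call` is a hypothetical name, and the standard `json` module stands in for `json5` (an assumption that holds as long as the arguments are strict JSON).

```python
import json
import urllib.parse


def image_gen_call(params: str) -> str:
    """Mimic MyImageGen.call: JSON arguments in, JSON result out."""
    # The original uses json5.loads; plain json is used here only to keep
    # the sketch free of third-party dependencies.
    prompt = json.loads(params)['prompt']
    # Percent-encode the prompt so it is safe inside a URL path.
    prompt = urllib.parse.quote(prompt)
    return json.dumps(
        {'image_url': f'https://image.pollinations.ai/prompt/{prompt}'},
        ensure_ascii=False,
    )


result = image_gen_call('{"prompt": "a friendly dog running on a grassy field"}')
print(result)
```

The tool never downloads or generates anything itself; it only builds a URL, and the image service does the work when that URL is fetched.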
Creating the bot: configure the model endpoint, the system instruction, and the pool of tools it may use.
import os

from qwen_agent.agents import Assistant


# LLM service, tool pool, and system instruction.
# ROOT_RESOURCE is defined elsewhere in the example file (not shown here).
def init_agent_service():
    # llm_cfg = {'model': 'qwen-max'}
    llm_cfg = {
        'model': 'Qwen1.5-14B',
        'model_server': 'http://10.112.2.35:8688/v1'
    }
    system = ("According to the user's request, you first draw a picture and then automatically "
              'run code to download the picture and select an image operation from the given document '
              'to process the image')
    tools = [
        'my_image_gen',
        'code_interpreter',
    ]  # code_interpreter is a built-in tool in Qwen-Agent
    bot = Assistant(
        llm=llm_cfg,
        name='AI painting',
        description='AI painting service',
        system_message=system,
        function_list=tools,
        files=[os.path.join(ROOT_RESOURCE, 'doc.pdf')],
    )
    return bot
Prompt for the first LLM call
This call extracts the information and the instruction contained in the input query.
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Please extract the key information fragments that can help retrieval and the task description in the question, and give them in JSON format:
{"information": ["information fragment 1", "information fragment 2"], "instruction": ["instruction fragment 1", "instruction fragment 2"]}.
If it is a question, the default task description is: Answer the question
Question: What is MMDET.UTILS?
Result: {"information": ["What is MMDET.UTILS"], "instruction": ["Answer the question"]}
Observation: ...
Question: Summarize
Result: {"information": [], "instruction": ["Summarize"]}
Observation: ...
Question: Describe in great detail 2.1 DATA, 2.2 TOKENIZATION, 2.3 ARCHITECTURE. Also, can you incorporate the methods from this paper?
Result: {"information": ["2.1 DATA, 2.2 TOKENIZATION, 2.3 ARCHITECTURE"], "instruction": ["Describe in great detail", "Also, can you incorporate the methods from this paper?"]}
Observation: ...
Question: Help me count the performance of membership levels.
Result: {"information": ["the performance of membership levels"], "instruction": ["Help me count"]}
Observation: ...
Question: draw a dog
Result:
<|im_end|>
<|im_start|>assistant
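The prompt above is a few-shot prompt wrapped in ChatML turns. How such a string might be assembled can be sketched as follows; `build_few_shot` and `to_chatml` are hypothetical helpers, the instruction text is abbreviated, and the exact whitespace qwen-agent produces may differ.

```python
from typing import Dict, List, Tuple

# Abbreviated version of the instruction shown in the prompt above.
INSTRUCTION = (
    'Please extract the key information fragments that can help retrieval '
    'and the task description in the question, and give them in JSON format.'
)


def build_few_shot(question: str, examples: List[Tuple[str, str]]) -> str:
    """Join the instruction, the worked examples, and the open question."""
    lines = [INSTRUCTION]
    for example_question, example_result in examples:
        lines += [f'Question: {example_question}',
                  f'Result: {example_result}',
                  'Observation: ...']
    # The last "Result:" is left empty for the model to complete.
    lines += [f'Question: {question}', 'Result:']
    return '\n'.join(lines)


def to_chatml(messages: List[Dict[str, str]]) -> str:
    """Wrap each message in <|im_start|>/<|im_end|> and open the assistant turn."""
    turns = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    turns.append('<|im_start|>assistant')
    return '\n'.join(turns)


examples = [('Summarize', '{"information": [], "instruction": ["Summarize"]}')]
prompt = to_chatml([
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': build_few_shot('draw a dog', examples)},
])
print(prompt)
```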
Prompt for the second LLM call
This call extracts keywords from the input query and presents them in both Chinese and English.
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Please extract keywords from the question, both in Chinese and English, and supplement them appropriately with relevant keywords that are not in the question.
Try to divide keywords into verb, noun, or adjective types and avoid long phrases (The aim is to better match and retrieve semantically related but differently phrased relevant information).
Keywords are provided in JSON format, such as {"keywords_zh": ["关键词1", "关键词2"], "keywords_en": ["keyword 1", "keyword 2"]}
Question: Who are the authors of this article?
Keywords: {"keywords_zh": ["作者"], "keywords_en": ["author"]}
Observation: ...
Question: Explain Figure 1
Keywords: {"keywords_zh": ["图一", "图 1"], "keywords_en": ["Figure 1"]}
Observation: ...
Question: core formula
Keywords: {"keywords_zh": ["核心公式", "公式"], "keywords_en": ["core formula", "formula", "equation"]}
Observation: ...
Question: draw a dog
Keywords:
<|im_end|>
<|im_start|>assistant
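The reply to this prompt is expected to be a JSON object with `keywords_zh` and `keywords_en` lists. A minimal sketch of how a caller could consume it defensively is shown below; `parse_keywords` is a hypothetical helper, not qwen-agent code, and the sample reply is copied from one of the few-shot examples in the prompt rather than from a real model response.

```python
import json
from typing import List


def parse_keywords(reply: str) -> List[str]:
    """Merge Chinese and English keywords, tolerating malformed replies."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return []  # The model did not return valid JSON.
    if not isinstance(data, dict):
        return []
    keywords: List[str] = []
    for key in ('keywords_zh', 'keywords_en'):
        values = data.get(key)
        if not isinstance(values, list):
            continue
        for word in values:
            # Keep only strings and drop duplicates while preserving order.
            if isinstance(word, str) and word not in keywords:
                keywords.append(word)
    return keywords


sample = '{"keywords_zh": ["核心公式", "公式"], "keywords_en": ["core formula", "formula", "equation"]}'
merged = parse_keywords(sample)
print(merged)
```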
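The first and second LLM calls both rewrite the input query in order to provide more explicit instruction information.
From the third LLM call onward, every call is about tool use. A maximum number of tool calls is set; as soon as any response indicates that no further tool is needed, the agent leaves the tool-use loop and the whole process ends.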
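The loop just described can be sketched in a few lines. Everything here is illustrative: `run_agent`, `scripted_llm`, and the cap of 5 are assumptions made for this sketch, and a scripted function stands in for the real LLM service.

```python
from typing import Callable, Dict, List, Optional, Tuple

ToolCall = Optional[Tuple[str, str]]  # (tool name, tool arguments) or None
MAX_TOOL_CALLS = 5  # hypothetical cap; the real limit is a qwen-agent setting


def run_agent(llm: Callable[[List[str]], Tuple[str, ToolCall]],
              tools: Dict[str, Callable[[str], str]],
              query: str,
              max_calls: int = MAX_TOOL_CALLS) -> List[str]:
    """Call the LLM repeatedly, running a tool after each reply that asks for one."""
    history = [query]
    for _ in range(max_calls):
        text, tool_call = llm(history)
        history.append(text)
        if tool_call is None:
            break  # The model no longer wants a tool: leave the loop.
        name, args = tool_call
        history.append(tools[name](args))  # Feed the tool result back in.
    return history


def scripted_llm(history: List[str]) -> Tuple[str, ToolCall]:
    """Stand-in for the model: request one tool call, then finish."""
    if len(history) == 1:
        return 'calling my_image_gen', ('my_image_gen', 'a dog')
    return 'here is your picture', None


tools = {'my_image_gen': lambda args: f'image-url-for:{args}'}
history = run_agent(scripted_llm, tools, 'draw a dog')
print(history)
```

The cap matters because a model that keeps requesting tools would otherwise never terminate.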
Prompt for the third LLM call
<|im_start|>system
According to the user's request, you first draw a picture and then automatically run code to download the picture and select an image operation from the given document to process the image
# Knowledge Base
## The content from [文件](doc.pdf):
```
# Python Image Processing Tutorial: Downloading Images and Performing Flip Operations
In this tutorial, we will learn how to download images using Python and perform basic image
operations such as flipping and rotating using the Pillow library.
## Prerequisites
Before we begin, make sure you have the following libraries installed in your Python environment:
- `requests`: for downloading images
- `Pillow`: for image processing
If you haven't installed these libraries yet, you can install them using pip:
```bash
pip install requests Pillow
```
## Step 1: Downloading an Image
First, we need to download an image. We will use the `requests` library to accomplish this task.
```
import requests
def download_image(url, filename):
response = requests.get(url)
if response.status_code == 200:
with open(filename, 'wb') as file:
file.write(response.content)
else:
print(f"Error:Failed to download image from {url}")
# Example usage
image_url = "https://example.com/image.jpg" # Replace with the URL of the image you want to download
filename = "downloaded_image.jpg"
download_image(image_url, filename)
```
## Step 2: Opening and Displaying the Image
Next, we will use the `Pillow` library to open and display the image we just downloaded.
```
from PIL import Image
def open_and_show_image(filename):
image = Image.open(filename)
image.show()
# Example usage
open_and_show_image(filename)
```
## Step 3: Flipping and Rotating the Image
Now we can perform flip and rotate operations on the image. The `Pillow` library provides several
methods for image manipulation.
```
def flip_image(filename, mode='horizontal'):
image = Image.open(filename)
if mode == 'horizontal':
flipped_image = image.transpose(Image.FLIP_LEFT_RIGHT)
elif mode == 'vertical':
flipped_image = image.transpose(Image.FLIP_TOP_BOTTOM)
else:
print("Error: Mode should be 'horizontal' or 'vertical'")
return
flipped_image.show()
return flipped_image
def rotate_image(filename, degrees):
image = Image.open(filename)
rotated_image = image.rotate(degrees)
rotated_image.show()
return rotated_image
# Example usage
flipped_image = flip_image(filename, mode='horizontal') # Horizontally flip
flipped_image.save("flipped_horizontal.jpg") # Save the horizontally flipped image
flipped_image = flip_image(filename, mode='vertical') # Vertically flip
flipped_image.save("flipped_vertical.jpg") # Save the vertically flipped image
rotated_image = rotate_image(filename, 90) # Rotate by 90 degrees
rotated_image.save("rotated_90.jpg") # Save the rotated image
```
## Step 4: Saving the Modified Image
In the examples above, we have seen how to save flipped and rotated images. You can use the
`save` method to save any modified image.
```
# Save the image
def save_image(image, filename):
image.save(filename)
# Example usage
save_image(flipped_image, "flipped_image.jpg")
save_image(rotated_image, "rotated_image.jpg")
```
By now, you have learned how to download images using Python and perform basic image
operations using the Pillow library. You can extend these basics to implement more complex image
processing functions as needed.
```
# Tools
## You have access to the following tools:
### my_image_gen
my_image_gen: AI painting (image generation) service, input text description, and return the image URL drawn based on text information. Parameters: [{"name": "prompt", "type": "string", "description": "Detailed description of the desired image content, in English", "required": true}] Format the arguments as a JSON object.
### code_interpreter
code_interpreter: Python代码沙盒,可用于执行Python代码。 Parameters: [{"name": "code", "type": "string", "description": "待执行的代码", "required": true}] 此工具的输入应为Markdown代码块。
## When you need to call a tool, please insert the following command in your reply, which can be called zero or multiple times according to your needs:
✿FUNCTION✿: The tool to use, should be one of [my_image_gen,code_interpreter]
✿ARGS✿: The input of the tool
✿RESULT✿: Tool results
✿RETURN✿: Reply based on tool results. Images need to be rendered as <|im_end|>
<|im_start|>user
draw a dog<|im_end|>
<|im_start|>assistant
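A considerable part of the prompt lays out, in a chain-of-thought (CoT) style, the whole process of producing the picture: step one, step two, step three. It tells the model how to reason toward a solution and, given each tool's function and parameters, how to produce the arguments needed to invoke that tool.
The latter part of the prompt describes every tool in the tool pool: its function, its parameters, and its output.
The last few lines tell the LLM which output format to use.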
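The tool section of the prompt is generated from the `description` and `parameters` attributes of each registered tool. A minimal sketch of that rendering, assuming a simple template (`TOOL_DESC` and `render_tool` are hypothetical names, not qwen-agent's own):

```python
import json
from typing import Any, Dict, List

# Hypothetical template that reproduces the tool line seen in the prompt.
TOOL_DESC = '{name}: {description} Parameters: {parameters} {args_format}'


def render_tool(name: str, description: str, parameters: List[Dict[str, Any]],
                args_format: str = 'Format the arguments as a JSON object.') -> str:
    """Render one tool into the single-line form used in the system prompt."""
    return TOOL_DESC.format(
        name=name,
        description=description,
        # ensure_ascii=False keeps non-ASCII descriptions readable.
        parameters=json.dumps(parameters, ensure_ascii=False),
        args_format=args_format,
    )


line = render_tool(
    'my_image_gen',
    'AI painting (image generation) service, input text description, '
    'and return the image URL drawn based on text information.',
    [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Detailed description of the desired image content, in English',
        'required': True,
    }],
)
print(line)
```

Because the model only ever sees this rendered text, the quality of `description` and of each parameter description directly affects whether it picks the right tool and fills in usable arguments.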
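Prompt for the fourth LLM call
The fourth prompt differs from the third mainly in that the result of executing the third call's output is appended at the end. All subsequent LLM calls follow the same pattern.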
<|im_start|>system
According to the user's request, you first draw a picture and then automatically run code to download the picture and select an image operation from the given document to process the image
# Knowledge Base
## The content from [文件](doc.pdf):
```
# Python Image Processing Tutorial: Downloading Images and Performing Flip Operations
In this tutorial, we will learn how to download images using Python and perform basic image
operations such as flipping and rotating using the Pillow library.
## Prerequisites
Before we begin, make sure you have the following libraries installed in your Python environment:
- `requests`: for downloading images
- `Pillow`: for image processing
If you haven't installed these libraries yet, you can install them using pip:
```bash
pip install requests Pillow
```
## Step 1: Downloading an Image
First, we need to download an image. We will use the `requests` library to accomplish this task.
```
import requests
def download_image(url, filename):
response = requests.get(url)
if response.status_code == 200:
with open(filename, 'wb') as file:
file.write(response.content)
else:
print(f"Error:Failed to download image from {url}")
# Example usage
image_url = "https://example.com/image.jpg" # Replace with the URL of the image you want to download
filename = "downloaded_image.jpg"
download_image(image_url, filename)
```
## Step 2: Opening and Displaying the Image
Next, we will use the `Pillow` library to open and display the image we just downloaded.
```
from PIL import Image
def open_and_show_image(filename):
image = Image.open(filename)
image.show()
# Example usage
open_and_show_image(filename)
```
## Step 3: Flipping and Rotating the Image
Now we can perform flip and rotate operations on the image. The `Pillow` library provides several
methods for image manipulation.
```
def flip_image(filename, mode='horizontal'):
image = Image.open(filename)
if mode == 'horizontal':
flipped_image = image.transpose(Image.FLIP_LEFT_RIGHT)
elif mode == 'vertical':
flipped_image = image.transpose(Image.FLIP_TOP_BOTTOM)
else:
print("Error: Mode should be 'horizontal' or 'vertical'")
return
flipped_image.show()
return flipped_image
def rotate_image(filename, degrees):
image = Image.open(filename)
rotated_image = image.rotate(degrees)
rotated_image.show()
return rotated_image
# Example usage
flipped_image = flip_image(filename, mode='horizontal') # Horizontally flip
flipped_image.save("flipped_horizontal.jpg") # Save the horizontally flipped image
flipped_image = flip_image(filename, mode='vertical') # Vertically flip
flipped_image.save("flipped_vertical.jpg") # Save the vertically flipped image
rotated_image = rotate_image(filename, 90) # Rotate by 90 degrees
rotated_image.save("rotated_90.jpg") # Save the rotated image
```
## Step 4: Saving the Modified Image
In the examples above, we have seen how to save flipped and rotated images. You can use the `save` method to save any modified image.
```
# Save the image
def save_image(image, filename):
image.save(filename)
# Example usage
save_image(flipped_image, "flipped_image.jpg")
save_image(rotated_image, "rotated_image.jpg")
```
By now, you have learned how to download images using Python and perform basic image
operations using the Pillow library. You can extend these basics to implement more complex image
processing functions as needed.
```
# Tools
## You have access to the following tools:
### my_image_gen
my_image_gen: AI painting (image generation) service, input text description, and return the image URL drawn based on text information. Parameters: [{"name": "prompt", "type": "string", "description": "Detailed description of the desired image content, in English", "required": true}] Format the arguments as a JSON object.
### code_interpreter
code_interpreter: Python代码沙盒,可用于执行Python代码。 Parameters: [{"name": "code", "type": "string", "description": "待执行的代码", "required": true}] 此工具的输入应为Markdown代码块。
## When you need to call a tool, please insert the following command in your reply, which can be called zero or multiple times according to your needs:
✿FUNCTION✿: The tool to use, should be one of [my_image_gen,code_interpreter]
✿ARGS✿: The input of the tool
✿RESULT✿: Tool results
✿RETURN✿: Reply based on tool results. Images need to be rendered as <|im_end|>
<|im_start|>user
draw a dog
✿FUNCTION✿: code_interpreter
✿ARGS✿:
```py
from PIL import Image, ImageDraw
# Create a new image with a white background
width, height = 400, 400
image = Image.new('RGB', (width, height), (255, 255, 255))
draw = ImageDraw.Draw(image)
# Draw a simple dog using a rectangle and a couple of ovals
# The dog
✿RESULT✿: error:
```
Cell In[2], line 2
```py
^
SyntaxError: invalid syntax
```
✿RETURN✿<|im_end|>
<|im_start|>assistant
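In the prompt above, the earlier tool call and its result are flattened into plain text after the user query, ending with an open ✿RETURN✿ for the model to continue. A rough sketch of that flattening, where `flatten_rounds` is a hypothetical helper and the exact spacing used by qwen-agent may differ:

```python
from typing import List, Tuple

# One round: (tool name, tool arguments, tool result, model reply after the result).
Round = Tuple[str, str, str, str]


def flatten_rounds(user_text: str, rounds: List[Round]) -> str:
    """Append each tool round to the user text using the ✿...✿ markers."""
    parts = [user_text]
    for name, args, result, reply in rounds:
        parts.append(f'✿FUNCTION✿: {name}')
        parts.append(f'✿ARGS✿: {args}')
        parts.append(f'✿RESULT✿: {result}')
        # An empty reply leaves ✿RETURN✿ open so the model writes the next part.
        parts.append(f'✿RETURN✿: {reply}' if reply else '✿RETURN✿')
    return '\n'.join(parts)


flattened = flatten_rounds('draw a dog', [(
    'my_image_gen',
    '{"prompt": "a friendly dog running on a grassy field"}',
    '{"image_url": "https://image.pollinations.ai/prompt/a%20friendly%20dog%20running%20on%20a%20grassy%20field"}',
    '',
)])
print(flattened)
```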
Prompt for the fifth LLM call
<|im_start|>system
According to the user's request, you first draw a picture and then automatically run code to download the picture and select an image operation from the given document to process the image
# Knowledge Base
## The content from [文件](doc.pdf):
```
# Python Image Processing Tutorial: Downloading Images and Performing Flip Operations
In this tutorial, we will learn how to download images using Python and perform basic image
operations such as flipping and rotating using the Pillow library.
## Prerequisites
Before we begin, make sure you have the following libraries installed in your Python environment:
- `requests`: for downloading images
- `Pillow`: for image processing
If you haven't installed these libraries yet, you can install them using pip:
```bash
pip install requests Pillow
```
## Step 1: Downloading an Image
First, we need to download an image. We will use the `requests` library to accomplish this task.
```
import requests
def download_image(url, filename):
response = requests.get(url)
if response.status_code == 200:
with open(filename, 'wb') as file:
file.write(response.content)
else:
print(f"Error:Failed to download image from {url}")
# Example usage
image_url = "https://example.com/image.jpg" # Replace with the URL of the image you want to download
filename = "downloaded_image.jpg"
download_image(image_url, filename)
```
## Step 2: Opening and Displaying the Image
Next, we will use the `Pillow` library to open and display the image we just downloaded.
```
from PIL import Image
def open_and_show_image(filename):
image = Image.open(filename)
image.show()
# Example usage
open_and_show_image(filename)
```
## Step 3: Flipping and Rotating the Image
Now we can perform flip and rotate operations on the image. The `Pillow` library provides several
methods for image manipulation.
```
def flip_image(filename, mode='horizontal'):
image = Image.open(filename)
if mode == 'horizontal':
flipped_image = image.transpose(Image.FLIP_LEFT_RIGHT)
elif mode == 'vertical':
flipped_image = image.transpose(Image.FLIP_TOP_BOTTOM)
else:
print("Error: Mode should be 'horizontal' or 'vertical'")
return
flipped_image.show()
return flipped_image
def rotate_image(filename, degrees):
image = Image.open(filename)
rotated_image = image.rotate(degrees)
rotated_image.show()
return rotated_image
# Example usage
flipped_image = flip_image(filename, mode='horizontal') # Horizontally flip
flipped_image.save("flipped_horizontal.jpg") # Save the horizontally flipped image
flipped_image = flip_image(filename, mode='vertical') # Vertically flip
flipped_image.save("flipped_vertical.jpg") # Save the vertically flipped image
rotated_image = rotate_image(filename, 90) # Rotate by 90 degrees
rotated_image.save("rotated_90.jpg") # Save the rotated image
```
## Step 4: Saving the Modified Image
In the examples above, we have seen how to save flipped and rotated images. You can use the `save` method to save any modified image.
```
# Save the image
def save_image(image, filename):
image.save(filename)
# Example usage
save_image(flipped_image, "flipped_image.jpg")
save_image(rotated_image, "rotated_image.jpg")
```
By now, you have learned how to download images using Python and perform basic image
operations using the Pillow library. You can extend these basics to implement more complex image
processing functions as needed.
```
# Tools
## You have access to the following tools:
### my_image_gen
my_image_gen: AI painting (image generation) service, input text description, and return the image URL drawn based on text information. Parameters: [{"name": "prompt", "type": "string", "description": "Detailed description of the desired image content, in English", "required": true}] Format the arguments as a JSON object.
### code_interpreter
code_interpreter: Python代码沙盒,可用于执行Python代码。 Parameters: [{"name": "code", "type": "string", "description": "待执行的代码", "required": true}] 此工具的输入应为Markdown代码块。
## When you need to call a tool, please insert the following command in your reply, which can be called zero or multiple times according to your needs:
✿FUNCTION✿: The tool to use, should be one of [my_image_gen,code_interpreter]
✿ARGS✿: The input of the tool
✿RESULT✿: Tool results
✿RETURN✿: Reply based on tool results. Images need to be rendered as <|im_end|>
<|im_start|>user
draw a dog
✿FUNCTION✿: code_interpreter
✿ARGS✿:
```py
from PIL import Image, ImageDraw
# Create a new image with a white background
width, height = 400, 400
image = Image.new('RGB', (width, height), (255, 255, 255))
draw = ImageDraw.Draw(image)
# Draw a simple dog using a rectangle and a couple of ovals
# The dog
✿RESULT✿: error:
```
Cell In[2], line 2
```py
^
SyntaxError: invalid syntax
```
✿RETURN✿: Apologies for the error. Let's try a different approach to draw a dog using the `PIL` library. Since it's quite complex to draw a detailed dog using Python code directly, I'll use the `my_image_gen` tool instead to generate an image of a dog for you.
✿FUNCTION✿: my_image_gen
✿ARGS✿: {"prompt": "a friendly dog running on a grassy field"}
✿RESULT✿: {"image_url": "https://image.pollinations.ai/prompt/a%20friendly%20dog%20running%20on%20a%20grassy%20field"}
✿RETURN✿<|im_end|>
<|im_start|>assistant
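At this point the whole process could actually end, because the picture we wanted has been generated and we have the image_url.
The result of this LLM call is roughly as follows: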
✿FUNCTION✿:
✿ARGS✿:
✿RESULT✿:
✿RETURN✿:
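The qwen-agent code that decides whether to keep going is shown below.
The decision depends mainly on whether there is a function_name, that is, whether ✿FUNCTION✿: in the model output has a value; this determines whether another tool is called.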
use_tool, tool_name, tool_args, _ = self._detect_tool(out)
# Debug output
print(use_tool, tool_name, tool_args)
print(type(use_tool), type(tool_name), type(tool_args))
if use_tool:
    tool_result = self._call_tool(tool_name, tool_args, messages=messages, **kwargs)
    fn_msg = Message(
        role=FUNCTION,
        name=tool_name,
        content=tool_result,
    )
    messages.append(fn_msg)
    response.append(fn_msg)
    yield response
    used_any_tool = True

def _detect_tool(self, message: Message) -> Tuple[bool, str, str, str]:
    """A built-in tool call detection for func_call format message.

    Args:
        message: one message generated by LLM.

    Returns:
        Need to call tool or not, tool name, tool args, text replies.
    """
    func_name = None
    func_args = None
    if message.function_call:
        func_call = message.function_call
        func_name = func_call.name
        func_args = func_call.arguments
    text = message.content
    if not text:
        text = ''
    return (func_name is not None), func_name, func_args, text
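`_detect_tool` works on a message whose `function_call` field has already been filled in. How the raw ✿FUNCTION✿ and ✿ARGS✿ text might be turned into a name and arguments can be sketched with a regular expression. This is an assumption-laden illustration: `detect_tool_in_text` and `FN_RE` are hypothetical, and qwen-agent's own parsing is not shown in this article.

```python
import re
from typing import Optional, Tuple

# Matches "✿FUNCTION✿: <name>" followed by "✿ARGS✿: <args>", where the
# arguments run until "✿RESULT✿" or the end of the text.
FN_RE = re.compile(
    r'✿FUNCTION✿:[ \t]*(?P<name>[^\n]*)\n✿ARGS✿:[ \t]*(?P<args>.*?)(?=\n✿RESULT✿|\Z)',
    re.DOTALL,
)


def detect_tool_in_text(text: str) -> Tuple[bool, Optional[str], Optional[str], str]:
    """Return (use_tool, tool name, tool args, plain text before the call)."""
    match = FN_RE.search(text)
    if match is None:
        return False, None, None, text
    name = match.group('name').strip()
    if not name:
        # An empty ✿FUNCTION✿: means the model is not asking for a tool.
        return False, None, None, text
    args = match.group('args').strip()
    return True, name, args, text[:match.start()].strip()


with_call = 'Let me draw it.\n✿FUNCTION✿: my_image_gen\n✿ARGS✿: {"prompt": "a dog"}'
empty_call = '✿FUNCTION✿: \n✿ARGS✿: \n✿RESULT✿: \n✿RETURN✿: '
print(detect_tool_in_text(with_call))
print(detect_tool_in_text(empty_call))
```

With the roughly empty reply from the fifth call, no tool name is found, so the loop stops and the process ends.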