I recently spent some time learning Agent development. I have only scratched the surface, but I am eager to try building an Agent of my own.
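An AI Agent is a software program that interacts with its environment, collects data, and uses that data to carry out tasks toward a specific goal. More broadly, it is a computing paradigm based on artificial intelligence and automation, used to design and implement software or hardware components that can perform tasks autonomously, perceive their environment, and interact with other entities.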
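This article uses the API of the large language model deepseek, together with two functions, to automatically look up information about a given fish in fishbase and then answer our questions based on what was found.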
Retrieving the relevant information from Fishbase
We first collect the relevant species information from fishbase, mainly with a web scraper:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import time
import os
import json

# Result pages to scrape (1 to 332)
pages = list(range(1, 333))

# Make sure the output directory exists before any CSV file is written
os.makedirs('data', exist_ok=True)


def get_pages(page):
    """Scrape one result page and save its species names and IDs to data/<page>.csv."""
    url = f"https://fishbase.org/ComNames/ScriptList.php?resultPage={page}&script=Chinese"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36'}
    species = []
    herfs = []
    try:
        response = requests.get(url, headers=headers, timeout=30)
        soup = BeautifulSoup(response.text, 'html.parser')
        tbody = soup.find_all('tbody')[0]
        rows = tbody.find_all('tr')
        for row in rows:
            columns = row.find_all('td')
            # The second cell holds the species name and a link that contains its ID
            second_td_content = columns[1].get_text(strip=True)
            a_tag = columns[1].find('a')
            link = a_tag.get('href').split('=')[1]
            species.append(second_td_content)
            herfs.append(link)
    except Exception as e:
        # Report the failing page; the rows collected so far are still saved below
        print(page, e)
    data = pd.DataFrame({'species': species, 'herfs': herfs})
    data.to_csv('data/' + str(page) + ".csv", index=False)


for page in pages:
    get_pages(page)
    time.sleep(1)  # pause between requests
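# Merge the per-page CSV files into one table
D = []
filenames = os.listdir('data')
for filename in filenames:
    if not filename.endswith('.csv'):
        continue  # skip species.json if the script is run a second time
    data = pd.read_csv('data/' + filename)
    D.append(data)
D = pd.concat(D)
D.drop_duplicates(inplace=True)
D.reset_index(drop=True, inplace=True)

# Build a {species name: fishbase ID} dictionary and save it as JSON
D = {s: h for s, h in zip(D['species'].values.tolist(), D['herfs'].values.tolist())}
with open("data/species.json", "w") as f:
    json.dump(D, f)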
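With this script, we store the species information scraped from fishbase in the file species.json.
Each key-value pair in our json file has the form:
Latin name of the species: its ID in fishbase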
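The scraper above takes the ID from each link with `split('=')[1]`, which works as long as the ID is the first query parameter. A slightly more defensive variant is sketched below with the standard library's `urllib.parse`. The helper name `extract_id` and the example hrefs are hypothetical, since the exact link format on the page is not shown in this article.

```python
from urllib.parse import urlparse, parse_qs


def extract_id(href, param="id"):
    """Return the value of `param` in the link's query string, or None if absent."""
    # Lowercase the parameter names so that "ID=" and "id=" are treated alike
    query = {k.lower(): v for k, v in parse_qs(urlparse(href).query).items()}
    values = query.get(param.lower())
    return values[0] if values else None


# Hypothetical hrefs, for illustration only
print(extract_id("../summary/SpeciesSummary.php?id=16239"))
print(extract_id("../summary/SpeciesSummary.php?lang=English&ID=16239"))
print(extract_id("../summary/SpeciesSummary.php"))
```

Returning `None` for a link without an ID makes it easy to skip such rows instead of failing on an `IndexError`.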
Tools
Next we need to write some tools for the Agent to call:
import requests
from bs4 import BeautifulSoup
import json

# Load the data produced in the previous step
with open("data/species.json", "r") as f:
    fishbase = json.load(f)


def clean(text):
    """Strip newlines, carriage returns, non-breaking spaces and tabs."""
    return text.replace("\n", "").replace("\r", "").replace("\xa0", "").replace("\t", "")


def get_id(species):
    """
    Look up a species' id in fishbase from its Latin name
    :param species: Latin name of the species, a string
    :return: the species' id in fishbase
    """
    return fishbase[species]


def get_information(species_id):
    """
    Look up the information fishbase holds about a fish, given its fishbase id
    :param species_id: the fish's id in fishbase
    :return: information about the fish, including its classification, biology and life-history traits, and model-based estimates
    """
    information = {}
    url = f"https://fishbase.org/summary/SpeciesSummary.php?id={species_id}&lang=English"
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/133.0.0.0 Safari/537.36'
    }
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'lxml')
    target_div = soup.find('div', id='ss-main')
    # Each top-level "smallSpace" div is one section of the species summary page
    cata = target_div.find_all('div', class_='smallSpace', recursive=False)
    information["Classification / Names"] = clean(cata[0].get_text())
    information["Environment: milieu / climate zone / depth range / distribution range"] = clean(cata[1].get_text())
    information["Distribution"] = clean(cata[2].get_text())
    information["Size / Weight / Age"] = clean(cata[3].get_text()).replace(" ", "")
    information["Short description"] = clean(cata[4].get_text())
    information["Biology"] = clean(cata[5].get_text())
    information["Life cycle and mating behavior"] = clean(cata[6].get_text())
    information["Human uses"] = clean(cata[8].get_text())
    # If this section starts with "FAO", there is no human-uses text for the species
    if information["Human uses"].startswith("FAO"):
        information["Human uses"] = ""
    # The last div is sometimes empty; in that case the estimates sit in the one before it
    if cata[-1].get_text() == "\n":
        information["Estimates based on models"] = clean(cata[-2].get_text())
    else:
        information["Estimates based on models"] = clean(cata[-1].get_text())
    return information
We provide two functions:
- get_id: gets the fishbase id from the Latin name
- get_information: gets the information about the fish from its id
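As written, `get_id` raises a `KeyError` when the name is not in species.json. One possible way to let the Agent fall back to "I don't know" is to return a plain message instead. The sketch below is a variant under stated assumptions, not the code used in this article: `build_index` and `get_id_safe` are hypothetical names, and a small in-memory dictionary stands in for the real file.

```python
def build_index(mapping):
    """Index the name -> id mapping by a normalized (lowercase, single-spaced) name."""
    return {" ".join(name.lower().split()): sid for name, sid in mapping.items()}


def get_id_safe(species, index):
    """Look up a species id; return a readable message instead of raising."""
    key = " ".join(species.lower().split())
    if key in index:
        return index[key]
    return f"No fishbase id found for '{species}'"


# Stand-in for the contents of data/species.json (one entry, taken from the run shown later)
sample = {"Aaptosyax grypus": 16239}
index = build_index(sample)

print(get_id_safe("aaptosyax  grypus", index))
print(get_id_safe("Unknown fish", index))
```

To hand such a function to the Agent, the index would probably have to be bound beforehand (for example as a module-level variable, like `fishbase` above), because the function's parameters are what the model is asked to fill in.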
Building the Agent
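# Use the deepseek API
from openai import OpenAI
from swarm import Swarm, Agent

# Replace the placeholder with your own API key
client = OpenAI(api_key="Your API", base_url="https://api.deepseek.com")
swarm_client = Swarm(client)

# Build the agent
agent_fish = Agent(
    name="fishbase",
    model="deepseek-chat",
    # In English: "You are a fish biologist; we can ask you questions about fish"
    instructions="你是一个鱼类生物学家,我们可以问你关于鱼类的问题",
    functions=[get_id, get_information],
)

# In English: "You may only answer the user's question with information queried from fishbase.
# The retrieved information is a dictionary. If the information cannot be found in fishbase,
# you may say you do not know; do not make up information; answer the user in Chinese.
# If nothing is found, or the query fails, say you do not know."
system_message = "你只能通过fishbase中查询得到的信息来回答用户的问题,查询到的信息为字典类型,如果fishbase中的信息无法查询到,你可以说不知道,不要自己编造信息,要用中文回答用户问题。如果查询不到,或者查询过程中出错了,就说不知道"
# In English: "Please tell me about the distribution of the fish Aaptosyax grypus"
user_message = "请你告诉我一些关于Aaptosyax grypus的鱼类的分布"
response = swarm_client.run(agent_fish, [{"role": "system", "content": system_message}, {"role": "user", "content": user_message}])

# Inspect the result
response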
Let's take a look at response:
Response(messages=[{'content': '', 'refusal': None, 'role': 'assistant', 'audio': None, 'function_call': None, 'tool_calls': [{'id': 'call_0_0e6fe3d8-c39e-4671-a463-0caf33ba6e31', 'function': {'arguments': '{"species":"Aaptosyax grypus"}', 'name': 'get_id'}, 'type': 'function', 'index': 0}], 'sender': 'fishbase'}, {'role': 'tool', 'tool_call_id': 'call_0_0e6fe3d8-c39e-4671-a463-0caf33ba6e31', 'tool_name': 'get_id', 'content': '16239'}, {'content': '', 'refusal': None, 'role': 'assistant', 'audio': None, 'function_call': None, 'tool_calls': [{'id': 'call_0_ff3bfc87-ff02-41c4-b356-237286d66087', 'function': {'arguments': '{"species_id":"16239"}', 'name': 'get_information'}, 'type': 'function', 'index': 0}], 'sender': 'fishbase'}, {'role': 'tool', 'tool_call_id': 'call_0_ff3bfc87-ff02-41c4-b356-237286d66087', 'tool_name': 'get_information', 'content': "{'Classification / Names': 'Teleostei (teleosts) > Cypriniformes (Carps) > Cyprinidae (Minnows or carps) > Cyprininae Etymology: Aaptosyax: Greek, aaptos =giant, terrible + Greek, ykis, yaina, yanis, syaina, syax and syakon = a kind of sole; but it is also related to yaina =hyena and wild pig;grypus: Named for its strongly curved jaws (Ref. 34719). ', 'Environment: milieu / climate zone / depth range / distribution range': 'Freshwater; pelagic; potamodromous (Ref. 51243). Tropical (Ref. 71989)', 'Distribution': 'Asia: Mekong River. ', 'Size / Weight / Age': 'Maturity: Lm? range ? - ? cm Max length : 130 cm SL male/unsexed; (Ref. 9497); max. published weight: 30.0 kg (Ref. 9497)', 'Short description': 'Well-developed adipose eye-lid, covering most of eye except pupil in large adults, less extensive in juveniles; presence of a large symphyseal knob in lower jaw fitting in a median notch in upper jaw (Ref. 43281).', 'Biology': 'Inhabits mainstreams of middle reaches in deep rocky rapids. Juveniles occur in tributaries (Ref. 58784). A large fast-swimming predator, feeding on fish of the middle and the upper water levels. 
Although most common along the Thai-Lao border at the mouth of the Mun River, its numbers have drastically decreased in recent years. This is perhaps due to dam construction or excessive gill netting, to which active pursuit predators, like this species, are particularly vulnerable (Ref. 12693). Undertakes upstream migration at the same time as Probarbus sp. in December-February (Ref. 37770) which may be related to spawning activity (Ref. 9497). Attains over 30 kg (Ref. 9497).', 'Life cycle and mating behavior': '', 'Human uses': 'Fisheries: subsistence fisheries', 'Estimates based on models': 'Phylogenetic diversity index (Ref. 82804):PD50 = 1.0000 [Uniqueness, from 0.5 = low to 2.0 = high].Bayesian length-weight: a=0.01122 (0.00521 - 0.02417), b=3.02 (2.85 - 3.19), in cm total length, based on LWR estimates for this (Sub)family-body shape (Ref. 93245).Trophic level (Ref. 69278):4.5 ±0.80 se; based on food items.Resilience (Ref. 120179):Very Low, minimum population doubling time more than 14 years (Preliminary K or Fecundity.).Fishing Vulnerability (Ref. 59153):Very high vulnerability (90 of 100).Price category (Ref. 80766):Unknown.'}"}, {'content': ' Aaptosyax grypus 是一种分布于亚洲湄公河的淡水鱼类。它主要栖息在湄公河中游的深水急流中,幼鱼则出现在支流中。这种鱼是一种大型的快速游动捕食者,以中层和上层水域的鱼类为食。尽管在泰国-老挝边境的湄公河入口处较为常见,但由于水坝建设和过度捕捞,其数量近年来急剧减少。这种鱼在12月至2月期间会进行上游迁徙,可能与产卵活动有关。Aaptosyax grypus 的最大体长可达130厘米,最大体重可达30公斤。', 'refusal': None, 'role': 'assistant', 'audio': None, 'function_call': None, 'tool_calls': None, 'sender': 'fishbase'}], agent=Agent(name='fishbase', model='deepseek-chat', instructions='你是一个鱼类生物学家,我们可以问你关于鱼类的问题', functions=[<function get_id at 0x0000013F1C0FF6A0>, <function get_information at 0x0000013F1B218E00>], tool_choice=None, parallel_tool_calls=True), context_variables={})
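The answer (in Chinese, as the system message requires) is:
Aaptosyax grypus 是一种分布于亚洲湄公河的淡水鱼类。它主要栖息在湄公河中游的深水急流中,幼鱼则出现在支流中。这种鱼是一种大型的快速游动捕食者,以中层和上层水域的鱼类为食。尽管在泰国-老挝边境的湄公河入口处较为常见,但由于水坝建设和过度捕捞,其数量近年来急剧减少。这种鱼在12月至2月期间会进行上游迁徙,可能与产卵活动有关。Aaptosyax grypus 的最大体长可达130厘米,最大体重可达30公斤。
In English: Aaptosyax grypus is a freshwater fish found in the Mekong River in Asia. It mainly inhabits deep rapids in the middle reaches of the Mekong, while juveniles occur in tributaries. It is a large, fast-swimming predator that feeds on fish of the middle and upper water levels. Although relatively common at the mouth of the Mekong on the Thai-Lao border, its numbers have dropped sharply in recent years because of dam construction and overfishing. It migrates upstream from December to February, which may be related to spawning. Aaptosyax grypus reaches a maximum length of 130 cm and a maximum weight of 30 kg.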
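In the dump above, the final answer is the `content` of the last assistant message, the one without any tool calls. A small helper along the following lines could pull it out. `final_answer` is a hypothetical name, and the sketch runs on a plain list of message dictionaries shaped like the `messages` shown above rather than on a real Swarm response.

```python
def final_answer(messages):
    """Return the content of the last assistant message that has no tool calls."""
    for message in reversed(messages):
        if message.get("role") == "assistant" and not message.get("tool_calls"):
            return (message.get("content") or "").strip()
    return None


# Abbreviated stand-in for response.messages
messages = [
    {"role": "assistant", "content": "", "tool_calls": [{"function": {"name": "get_id"}}]},
    {"role": "tool", "tool_name": "get_id", "content": "16239"},
    {"role": "assistant", "content": " Asia: Mekong River.", "tool_calls": None},
]

print(final_answer(messages))
```

With a real run, this would be called as `final_answer(response.messages)`.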
As we can see, the model called the functions we provided and worked out the arguments on its own:
- 'function': {'arguments': '{"species":"Aaptosyax grypus"}', 'name': 'get_id'}
- 'function': {'arguments': '{"species_id":"16239"}', 'name': 'get_information'}
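Rather than reading these calls out of the raw dump by eye, they can be collected programmatically. The sketch below assumes messages shaped like the ones shown above (each tool call carrying a `function` entry with `name` and JSON-encoded `arguments`). `list_tool_calls` is a hypothetical helper, and the message list is an abbreviated stand-in for `response.messages`.

```python
import json


def list_tool_calls(messages):
    """Collect (function name, parsed arguments) for every tool call, in order."""
    calls = []
    for message in messages:
        for call in message.get("tool_calls") or []:
            function = call["function"]
            calls.append((function["name"], json.loads(function["arguments"])))
    return calls


# Abbreviated stand-in for response.messages
messages = [
    {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"arguments": '{"species":"Aaptosyax grypus"}', "name": "get_id"}}]},
    {"role": "tool", "tool_name": "get_id", "content": "16239"},
    {"role": "assistant", "content": "", "tool_calls": [
        {"function": {"arguments": '{"species_id":"16239"}', "name": "get_information"}}]},
    {"role": "tool", "tool_name": "get_information", "content": "(abbreviated)"},
    {"role": "assistant", "content": "(final answer)", "tool_calls": None},
]

for name, arguments in list_tool_calls(messages):
    print(name, arguments)
```

The order of the collected calls mirrors the chain in this run: first the name is mapped to an id, then the id is used to fetch the information.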