Building Custom Tools for LLM Agents

king_21e

Last modified: 2024-07-25 17:45:56


Tags: python, linux, server

First published: 2024-07-25 15:49:12

Article link: https://blog.csdn.net/king_21e/article/details/140669531


Contents

I. Building Custom Tools for LLM Agents
II. Building Tools
- 1. Simple calculator tool
- 2. Tools with multiple parameters

I. Building Custom Tools for LLM Agents

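Agents give an LLM access to tools, which greatly expands what it can do. With tools, an LLM can search the web, do math, run code, and more.

The LangChain library ships with a large set of prebuilt tools. In many real projects, however, the existing tools cover only part of what we need. That means we must either modify an existing tool or build a new one from scratch.

This chapter looks at how to build custom tools for agents in LangChain. We start with a few simple tools to learn the typical tool-building pattern, then move on to more complex tools that use other machine learning models to add capabilities such as describing images.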

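II. Building Tools

1. Simple calculator tool

This tool is a simple calculator that computes the circumference of a circle from its radius. The code is as follows (example):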

from langchain.tools import BaseTool
from math import pi
from typing import Union

class CircumferenceTool(BaseTool):
    # name and description are what the agent sees when choosing a tool
    name: str = "Circumference calculator"
    description: str = "use this tool when you need to calculate a circumference using the radius of a circle"

    def _run(self, radius: Union[int, float]):
        # cast to float so that numeric text input is also accepted
        return float(radius) * 2.0 * pi

    def _arun(self, radius: Union[int, float]):
        raise NotImplementedError("This tool does not support async")

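Here we define a custom CircumferenceTool class by subclassing LangChain's BaseTool. We can think of BaseTool as the required template for a LangChain tool.

LangChain requires a tool to have two attributes: name and description.
description is a natural-language description of the tool, and the LLM uses it to decide whether the tool is needed. A tool description should be explicit about what the tool does, when to use it, and when not to use it.
Our description does not say when the tool should not be used, because the LLM seems able to recognize when it needs this tool. In general, though, adding "when not to use" guidance to a description helps keep a tool from being overused.

The class has two methods, _run and _arun. _run is called by default when the tool is used, and _arun is called when the tool is used asynchronously. This chapter does not cover async tools, so _arun simply raises NotImplementedError.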
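As a quick check that does not depend on LangChain, the calculation inside `_run` can be exercised as a plain function. This is only a minimal sketch: the function name `circumference` is hypothetical and not part of the original code. It also assumes that an agent may hand the tool its input as text rather than as a number, which is why the `float()` cast matters.

```python
from math import pi
from typing import Union


def circumference(radius: Union[int, float, str]) -> float:
    """Return the circumference of a circle (hypothetical helper mirroring _run)."""
    # float() accepts ints, floats, and numeric strings such as "7.81"
    return float(radius) * 2.0 * pi


if __name__ == "__main__":
    print(round(circumference("7.81"), 2))  # prints 49.07
```

For a radius of 7.81 the result is about 49.07, which is a useful reference value when judging the agent's answers later on.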

Next we initialize the LLM and the conversational memory for the conversational agent. For the LLM we use OpenAI's gpt-3.5-turbo model.

from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory

# initialize LLM (we use ChatOpenAI because we'll later define a `chat` agent)
llm = ChatOpenAI(
    openai_api_key="OPENAI_API_KEY",
    temperature=0,
    model_name='gpt-3.5-turbo'
)

# initialize conversational memory
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)

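We initialize the LLM with temperature set to 0. When working with tools, a low temperature is useful because it reduces the "randomness" or "creativity" of the generated text. In the conversational_memory object we set k to 5 so that it "remembers" the five most recent human-AI interactions.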
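The effect of `k=5` can be pictured as a fixed-size buffer. The sketch below is a stand-in built on `collections.deque`, not the LangChain implementation; the class name `WindowMemory` and its methods are hypothetical and exist only for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Keep only the k most recent (human, ai) exchanges (illustrative only)."""

    def __init__(self, k: int = 5) -> None:
        # a deque with maxlen drops the oldest exchange automatically
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self._turns.append((human, ai))

    def history(self) -> List[Tuple[str, str]]:
        return list(self._turns)


memory = WindowMemory(k=5)
for i in range(7):
    memory.save(f"question {i}", f"answer {i}")

# only exchanges 2 through 6 remain; 0 and 1 have been dropped
print([human for human, _ in memory.history()])
```

A larger window gives the agent more context at the cost of a longer prompt on every call.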

Now we can initialize the agent itself. It needs the llm and conversational_memory that we have already initialized, plus a list of tools to use. We have only one tool, but we still put it in a list.

from langchain.agents import initialize_agent

tools = [CircumferenceTool()]

# initialize agent with tools
agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)

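The chat-conversational-react-description agent type tells us several things about this agent:
chat means the LLM in use is a chat model. gpt-4 and gpt-3.5-turbo are both chat models because they consume a conversation history and produce conversational responses. A model such as text-davinci-003 is not a chat model because it was not designed to be used this way.
conversational means we will include conversational_memory.
react refers to the ReAct framework, which enables multi-step reasoning and tool use by letting the model "talk to itself".
description tells us that the LLM/agent decides which tool to use based on the tool descriptions, which we wrote in the tool definition earlier.

Now we ask the agent to calculate the circumference of a circle.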

agent("can you calculate the circumference of a circle that has a radius of 7.81mm")

In the output of the AgentExecutor chain we can see that the agent jumps straight to the Final Answer action:

{ "action": "Final Answer", "action_input": "The circumference of a circle with a radius of 7.81mm is approximately 49.03mm." }

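Final Answer is the action the agent uses once it decides that it has finished its reasoning and action steps and has all the information it needs to answer the user's query. In other words, the agent decided not to use the circumference calculator tool.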
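To make the role of the `Final Answer` action concrete, the sketch below parses an action blob like the one above and either returns the answer or calls a tool. It is a simplified illustration, not LangChain's actual executor; `handle_action` and the `tools` dictionary are hypothetical names introduced here.

```python
import json
from math import pi
from typing import Callable, Dict


def handle_action(blob: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Dispatch one agent step: finish on "Final Answer", otherwise call a tool."""
    step = json.loads(blob)
    action, action_input = step["action"], step["action_input"]
    if action == "Final Answer":
        # the agent believes it has everything it needs; no tool is called
        return action_input
    if action not in tools:
        return f"Unknown tool: {action}"
    return tools[action](action_input)


# hypothetical tool registry keyed by tool name
tools = {"Circumference calculator": lambda r: str(float(r) * 2.0 * pi)}

final = '{"action": "Final Answer", "action_input": "approximately 49.03mm"}'
call = '{"action": "Circumference calculator", "action_input": "7.81"}'

print(handle_action(final, tools))  # the model's own guess, tool skipped
print(handle_action(call, tools))   # the value computed by the tool
```

The first call returns whatever the model wrote, with no check, while the second routes the input through the tool.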
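LLMs are generally poor at math, but that does not stop them from attempting it. Here the answer is slightly off, since 2π × 7.81 ≈ 49.07 rather than 49.03. The problem is that the LLM is overconfident in its own mathematical ability. To fix this, we must tell the model that it cannot do math. First, let's look at the prompt currently in use: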

# existing prompt
print(agent.agent.llm_chain.prompt.messages[0].prompt.template)

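We will add a sentence telling the model that it is "terrible at math" and should never attempt math on its own. After adding this to the original prompt text, we create a new prompt with agent.agent.create_prompt, which builds the correct prompt structure for our agent, including the tool descriptions. We then update agent.agent.llm_chain.prompt. With this change, the agent uses the Circumference calculator tool and therefore gets the correct answer.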

sys_msg = """Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Unfortunately, Assistant is terrible at maths. When provided with math questions, no matter how simple, Assistant always refers to its trusty tools and absolutely does NOT try to answer math questions by itself.

Overall, Assistant is a powerful system that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.
"""

new_prompt = agent.agent.create_prompt(
    system_message=sys_msg,
    tools=tools
)

agent.agent.llm_chain.prompt = new_prompt

agent("can you calculate the circumference of a circle that has a radius of 7.81mm")

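2. Tools with multiple parameters

To demonstrate a tool that takes multiple parameters, we will build a hypotenuse calculator. The tool computes the hypotenuse of a right triangle from a combination of side lengths and/or an angle. It needs multiple inputs because different values (sides and an angle) can be used to compute the hypotenuse. Not all of the values are required: any two or more of the parameters are enough.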
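For reference, these are the standard right-triangle relations such a tool can draw on. The symbols are introduced here for illustration: $a$ is the adjacent side, $b$ the opposite side, $c$ the hypotenuse, and $\theta$ the angle between $a$ and $c$.

```latex
c = \sqrt{a^{2} + b^{2}}, \qquad
c = \frac{a}{\cos\theta}, \qquad
c = \frac{b}{\sin\theta}
```

Any two of $a$, $b$, and $\theta$ determine $c$, which is why the tool accepts optional parameters.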


from typing import Optional, Union
from math import sqrt, cos, sin, radians

desc = (
    "use this tool when you need to calculate the length of a hypotenuse "
    "given one or two sides of a triangle and/or an angle (in degrees). "
    "To use the tool, you must provide at least two of the following parameters "
    "['adjacent_side', 'opposite_side', 'angle']."
)

class PythagorasTool(BaseTool):
    name: str = "Hypotenuse calculator"
    description: str = desc

    def _run(
        self,
        adjacent_side: Optional[Union[int, float]] = None,
        opposite_side: Optional[Union[int, float]] = None,
        angle: Optional[Union[int, float]] = None
    ):
        # check for the values we have been given
        if adjacent_side and opposite_side:
            return sqrt(float(adjacent_side)**2 + float(opposite_side)**2)
        elif adjacent_side and angle:
            # the description promises degrees, but math.cos expects radians
            return float(adjacent_side) / cos(radians(float(angle)))
        elif opposite_side and angle:
            # same conversion for math.sin
            return float(opposite_side) / sin(radians(float(angle)))
        else:
            return "Could not calculate the hypotenuse of the triangle. Need two or more of `adjacent_side`, `opposite_side`, or `angle`."

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")

tools = [PythagorasTool()]
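The relations behind the three branches can be checked against a 3-4-5 right triangle. This is a standalone sketch that does not depend on LangChain, and the function name `hypotenuse` is hypothetical. It assumes angles are supplied in degrees, as the tool description states, so they are converted with `math.radians` before being passed to `cos` and `sin`, which work in radians.

```python
from math import atan2, cos, degrees, radians, sin, sqrt
from typing import Optional


def hypotenuse(
    adjacent_side: Optional[float] = None,
    opposite_side: Optional[float] = None,
    angle: Optional[float] = None,  # degrees, between the adjacent side and the hypotenuse
) -> float:
    """Compute a right triangle's hypotenuse from any two of the three inputs."""
    if adjacent_side is not None and opposite_side is not None:
        return sqrt(adjacent_side ** 2 + opposite_side ** 2)
    if adjacent_side is not None and angle is not None:
        return adjacent_side / cos(radians(angle))
    if opposite_side is not None and angle is not None:
        return opposite_side / sin(radians(angle))
    raise ValueError("need at least two of adjacent_side, opposite_side, angle")


# angle of a 3-4-5 triangle with adjacent side 4 and opposite side 3
theta = degrees(atan2(3, 4))

print(hypotenuse(adjacent_side=4, opposite_side=3))
print(hypotenuse(adjacent_side=4, angle=theta))
print(hypotenuse(opposite_side=3, angle=theta))
```

All three calls should agree on a hypotenuse of about 5, up to floating-point rounding. Passing the angle in degrees straight to `cos` or `sin` would break that agreement.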