LangChain - Tool Calling

富婆E

Last modified 2024-07-08 10:15:06

First published 2024-05-25 11:15:00

Article link: https://blog.csdn.net/lovechris00/article/details/139119969

Translated and adapted from: Tool Calling with LangChain
https://blog.langchain.dev/tool-calling-with-langchain/

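TLDR: We are introducing a new tool_calls attribute on AIMessage. More and more LLM providers are exposing APIs for reliable tool calling. The goal of the new attribute is to provide a standard interface for interacting with tool invocations.

It is fully backwards compatible and is supported on all models that have native tool-calling support.
To access these features, you need to upgrade your langchain_core and partner package versions.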

YouTube walkthrough: https://youtu.be/zCwuAlpQKTM?ref=blog.langchain.dev

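Python:

List of chat models, showing the status of tool-calling capability: https://python.langchain.com/docs/integrations/chat
Tool calling, which explains the new tool-calling interface: https://python.langchain.com/docs/modules/model_io/chat/function_calling
Tool calling agent, which shows how to create an agent that uses the standardized tool-calling interface: https://python.langchain.com/docs/modules/agents/agent_types/tool_calling
LangGraph notebook, which shows how to create a LangGraph agent that uses the standardized tool-calling interface: https://github.com/langchain-ai/langchain/blob/master/cookbook/tool_call_messages.ipynb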

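JS:

List of chat models, showing the status of tool-calling capability
Tool calling, which explains the new tool-calling interface
Tool calling agent, which shows how to create an agent that uses the standardized tool-calling interface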

Introduction

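Large language models (LLMs) can interact with external data sources through tool calling.
Tool calling is a powerful technique that lets developers build sophisticated applications in which an LLM accesses, interacts with, and manipulates external resources such as databases, files, and APIs.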

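Providers have been introducing native tool-calling capability into their models.
In practice, this means that when the LLM completes a prompt, it can return a list of tool invocations in addition to plain text.
OpenAI was the first to release this, roughly a year before the original post, under the name "function calling", which quickly evolved into "tool calling" in November.
Since then, other model providers have followed: Gemini (December), Mistral (February), Fireworks (March), Together (March), Groq (April), Cohere (April), and Anthropic (April).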

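All of these providers expose slightly different interfaces (in particular, OpenAI, Anthropic, and Gemini, the three highest-performing models, are incompatible with one another).
We heard from the community that a standardized tool-calling interface was needed to make it easy to switch between these providers, and we are excited to release one today.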

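The standard interface consists of:

ChatModel.bind_tools(): a method for attaching tool definitions to model calls.
AIMessage.tool_calls: an attribute on the AIMessage returned from the model, for easily accessing the tool calls the model decided to make.
create_tool_calling_agent(): an agent constructor that works with any model that implements bind_tools and returns tool_calls.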

Let's take a look at each of these components.

Components

1. `ChatModel.bind_tools(...)`

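To let a model use tools, we need to tell it which tools are available.
We do this by passing the model a list of tool definitions, each of which includes a schema for the tool's arguments.
The exact format of a tool definition depends on the model provider: OpenAI expects a dictionary with "name", "description", and "parameters" keys, while Anthropic expects "name", "description", and "input_schema".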
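
To make the difference concrete, here is a minimal sketch that reshapes a definition written with the OpenAI-style keys into the Anthropic-style keys named above. The helper `to_anthropic_format` is hypothetical and written only for illustration. It handles just the three keys mentioned in the text and is not how LangChain performs the conversion.

```python
from typing import Any, Dict


def to_anthropic_format(openai_style_tool: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: re-key an OpenAI-style tool dict for Anthropic.

    Only the three keys named in the text are handled.
    """
    return {
        "name": openai_style_tool["name"],
        "description": openai_style_tool["description"],
        # Same JSON schema for the arguments, stored under a different key.
        "input_schema": openai_style_tool["parameters"],
    }


add_tool = {
    "name": "add",
    "description": "Add 'x' and 'y'.",
    "parameters": {
        "type": "object",
        "properties": {
            "x": {"type": "number", "description": "First number to add"},
            "y": {"type": "number", "description": "Second number to add"},
        },
        "required": ["x", "y"],
    },
}

anthropic_tool = to_anthropic_format(add_tool)
print(sorted(anthropic_tool))
```

Writing and maintaining this kind of glue for every provider is the burden a standard interface is meant to remove.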

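ChatModel.bind_tools provides a standard interface, implemented by all tool-calling models, that lets you specify which tools are available to the model.
You can pass in not only a raw tool definition (a dict) but also objects from which a tool definition can be derived: Pydantic classes, LangChain tools, and arbitrary functions.
This makes it easy to create generic tool definitions that work with any tool-calling model: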

from langchain_anthropic import ChatAnthropic
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import tool

# ✅ Pydantic class
class multiply(BaseModel):
    """Return product of 'x' and 'y'."""
    x: float = Field(..., description="First factor")
    y: float = Field(..., description="Second factor")
    
# ✅ LangChain tool
@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y
    
# ✅ Function
def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y-x
    
# ✅ OpenAI-format dict
# Could also pass in a JSON schema with "title" and "description" 
add = {
  "name": "add",
  "description": "Add 'x' and 'y'.",
  "parameters": {
    "type": "object",
    "properties": {
      "x": {"type": "number", "description": "First number to add"},
      "y": {"type": "number", "description": "Second number to add"}
    },
    "required": ["x", "y"]
  }
}

llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0)

# Whenever we invoke `llm_with_tools`, all four of these tool definitions
# are passed to the model.
llm_with_tools = llm.bind_tools([multiply, exponentiate, add, subtract])

If we wanted to use a different tool-calling model, our code would look very similar:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
llm_with_tools = llm.bind_tools([multiply, exponentiate, add, subtract])
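
To illustrate what "deriving a tool definition from a function" can mean, the sketch below builds an OpenAI-format dict from a function's signature and docstring using only the standard library. The helper `function_to_tool_definition` and its type mapping are assumptions made for this example. LangChain's real conversion is more thorough, so treat this as a conceptual outline only.

```python
import inspect
from typing import Any, Callable, Dict

# Minimal mapping from Python annotations to JSON-schema type names.
_JSON_TYPES = {float: "number", int: "integer", str: "string", bool: "boolean"}


def function_to_tool_definition(func: Callable[..., Any]) -> Dict[str, Any]:
    """Hypothetical helper: derive a tool dict from a signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown annotations fall back to "string" in this simplified sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y - x


definition = function_to_tool_definition(subtract)
print(definition)
```

The function name becomes the tool name, the docstring becomes the description, and the type hints become the argument schema, which is why descriptive docstrings and annotations matter when passing plain functions as tools.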

So what would calls to llm_with_tools look like? That is where AIMessage.tool_calls comes in.

2. `AIMessage.tool_calls`

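Previously, when using a tool-calling model, any tool invocations returned by the model were found in either AIMessage.additional_kwargs or AIMessage.content, depending on the model provider's API, and followed a provider-specific format.
That is, you needed custom logic to extract the tool invocations from the outputs of different models.
Now, AIMessage.tool_calls provides a standardized interface for getting the model's tool invocations.
So after calling a model with bound tools, you get an output of the following form: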

llm_with_tools.invoke([
    ("system", "You're a helpful assistant"),
    ("human", "what's 5 raised to the 2.743"),
])

# 👀 Notice the tool_calls attribute 👀

# -> AIMessage(
#       content=...,
#       additional_kwargs={...},
#       tool_calls=[{'name': 'exponentiate', 'args': {'y': 2.743, 'x': 5.0}, 'id': '54c166b2-f81a-481a-9289-eea68fc84e4f'}],
#       response_metadata={...},
#       id='...'
#   )

Here the AIMessage has a tool_calls: List[ToolCall] attribute, which is populated whenever there are tool calls and follows the standard interface for tool calls:

from typing import Any, Dict, Optional, TypedDict

class ToolCall(TypedDict):
    name: str
    args: Dict[str, Any]
    id: Optional[str]

That is, whether you are calling Anthropic, OpenAI, Gemini, or another provider, any tool invocation shows up in AIMessage.tool_calls as a ToolCall.
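
The following sketch shows the kind of provider-specific extraction logic that a standardized attribute replaces. The two payload shapes are hand-written samples based on my understanding of the general layout of OpenAI-style and Anthropic-style responses, not output captured from a real API call, and the function names `from_openai_style` and `from_anthropic_style` are hypothetical.

```python
import json
from typing import Any, Dict, List, Optional, TypedDict


class ToolCall(TypedDict):
    name: str
    args: Dict[str, Any]
    id: Optional[str]


def from_openai_style(raw_calls: List[Dict[str, Any]]) -> List[ToolCall]:
    """Assumed OpenAI-style layout: arguments arrive as a JSON-encoded string."""
    return [
        ToolCall(
            name=call["function"]["name"],
            args=json.loads(call["function"]["arguments"]),
            id=call.get("id"),
        )
        for call in raw_calls
    ]


def from_anthropic_style(content_blocks: List[Dict[str, Any]]) -> List[ToolCall]:
    """Assumed Anthropic-style layout: `tool_use` blocks mixed in with text blocks."""
    return [
        ToolCall(name=block["name"], args=block["input"], id=block.get("id"))
        for block in content_blocks
        if block.get("type") == "tool_use"
    ]


# Hand-written sample payloads (illustrative only).
openai_raw = [{
    "id": "call_1",
    "type": "function",
    "function": {"name": "exponentiate", "arguments": '{"x": 5.0, "y": 2.743}'},
}]
anthropic_raw = [
    {"type": "text", "text": "Let me compute that."},
    {"type": "tool_use", "id": "toolu_1", "name": "exponentiate",
     "input": {"x": 5.0, "y": 2.743}},
]

print(from_openai_style(openai_raw))
print(from_anthropic_style(anthropic_raw))
```

Both functions end up producing the same name, args, and id fields, which is the shape that downstream code can rely on regardless of provider.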

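We have also added a few other attributes to handle streamed tool call chunks and invalid tool calls.
Read more about these in the tool calling docs.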
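
As a rough illustration of why streaming and invalid calls need separate handling, the sketch below merges streamed fragments of tool-call arguments and sets aside any call whose arguments do not parse as JSON. The chunk fields (`index`, `name`, `args`, `id`), the `error` key, and the `merge_chunks` function are assumptions for this example. The attributes LangChain actually exposes are described in its tool calling docs.

```python
import json
from typing import Any, Dict, List, Tuple


def merge_chunks(
    chunks: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Merge streamed fragments by `index`, then split into valid and invalid calls."""
    merged: Dict[int, Dict[str, Any]] = {}
    for chunk in chunks:
        slot = merged.setdefault(chunk["index"], {"name": None, "args": "", "id": None})
        # Name and id usually arrive once; argument text arrives piece by piece.
        slot["name"] = slot["name"] or chunk.get("name")
        slot["id"] = slot["id"] or chunk.get("id")
        slot["args"] += chunk.get("args") or ""

    valid: List[Dict[str, Any]] = []
    invalid: List[Dict[str, Any]] = []
    for _, call in sorted(merged.items()):
        try:
            parsed = json.loads(call["args"])
        except json.JSONDecodeError as err:
            # Keep the raw argument string so the failure can be inspected.
            invalid.append({**call, "error": str(err)})
            continue
        valid.append({**call, "args": parsed})
    return valid, invalid


stream = [
    {"index": 0, "name": "exponentiate", "id": "call_1", "args": '{"x": 5.0, '},
    {"index": 0, "name": None, "id": None, "args": '"y": 2.743}'},
    # The second call is cut off mid-stream, so its JSON is incomplete.
    {"index": 1, "name": "add", "id": "call_2", "args": '{"x": 3, "y": '},
]

valid, invalid = merge_chunks(stream)
print(valid)
print(invalid)
```

Keeping malformed calls in a separate list, instead of raising immediately, lets an application decide whether to retry, repair, or report them.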

3. `create_tool_calling_agent()`

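One of the most powerful and obvious uses of LLM tool calling is building agents.
LangChain already has a create_openai_tools_agent() constructor that makes it easy to build an agent with tool-calling models that follow the OpenAI tool-calling API, but it does not work for models such as Anthropic and Gemini.
Thanks to the new bind_tools() and tool_calls interfaces, we have added create_tool_calling_agent(), which works with any tool-calling model.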

from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor

@tool
def multiply(x: float, y: float) -> float:
    """Multiply 'x' times 'y'."""
    return x * y

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y

@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant"), 
    ("human", "{input}"), 
    ("placeholder", "{agent_scratchpad}"),
])

tools = [multiply, exponentiate, add]

llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

We could use VertexAI instead:

from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(
    model="gemini-pro",
    temperature=0,
    convert_system_message_to_human=True,
)
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

Or OpenAI:

llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

For full documentation on the new agent, see https://python.langchain.com/docs/modules/agents/agent_types/tool_calling/
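
For intuition about what a tool-calling agent does with tool_calls, here is a standard-library sketch of the loop: ask the model, run any requested tools, feed the results back, and stop when the model answers without requesting a tool. The `scripted_model` function is a hypothetical stand-in for a chat model with bound tools, and `run_agent` is not the AgentExecutor implementation. The message dictionaries are simplified assumptions.

```python
from typing import Any, Callable, Dict, List


def multiply(x: float, y: float) -> float:
    """Multiply 'x' times 'y'."""
    return x * y


def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y


TOOLS: Dict[str, Callable[..., Any]] = {"multiply": multiply, "add": add}


def scripted_model(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical stand-in for a model: one tool request, then a final answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"role": "ai", "content": "",
                "tool_calls": [{"name": "add", "args": {"x": 3, "y": 5}, "id": "call_1"}]}
    return {"role": "ai", "content": f"The result is {tool_results[-1]['content']}.",
            "tool_calls": []}


def run_agent(model: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
              user_input: str, max_steps: int = 5) -> str:
    """Loop until the model replies without requesting any tool."""
    messages: List[Dict[str, Any]] = [{"role": "human", "content": user_input}]
    for _ in range(max_steps):
        reply = model(messages)
        messages.append(reply)
        if not reply["tool_calls"]:
            return reply["content"]
        for call in reply["tool_calls"]:
            # Look up the tool by name and pass the parsed arguments through.
            output = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": str(output),
                             "tool_call_id": call["id"]})
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent(scripted_model, "what's 3 plus 5")
print(answer)
```

Because the loop only reads the name, args, and id of each call, it does not care which provider produced them, which is the property the standardized interface is designed to give real agents.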

LangGraph

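If you have not tried LangGraph yet, you definitely should.
It is an extension of LangChain that makes it easy to build arbitrary agentic and multi-agent flows.
As you might expect, the new tool_calls interface also simplifies building LangGraph agents and flows.
See the LangGraph notebook listed above for a detailed walkthrough of how to use tool_calls in a LangGraph agent.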

`with_structured_output`

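We recently released the ChatModel.with_structured_output() interface for getting structured output from a model, which is closely related.
While the exact implementation varies by model provider, with_structured_output is built on top of tool calling for most models that support it.
Under the hood, with_structured_output uses bind_tools to pass the given structured output schema to the model.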
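
The relationship can be sketched in a few lines: describe the target schema as a single tool, let the model "call" it, and build the structured object from the call's arguments. Everything here (`schema_as_tool`, `structured_output`, `fake_model`, and the `Person` schema) is hypothetical and uses only the standard library. It shows the idea, not LangChain's implementation.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, List

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class Person:
    """Information about a person."""
    name: str
    age: int


def schema_as_tool(schema: type) -> Dict[str, Any]:
    """Describe a dataclass as one tool definition (simplified schema)."""
    return {
        "name": schema.__name__,
        "description": schema.__doc__ or "",
        "parameters": {
            "type": "object",
            "properties": {f.name: {"type": _JSON_TYPES.get(f.type, "string")}
                           for f in fields(schema)},
            "required": [f.name for f in fields(schema)],
        },
    }


def structured_output(model: Callable[..., Dict[str, Any]], schema: type) -> Callable[[str], Any]:
    """Wrap a model so each call returns an instance of `schema`."""
    tool = schema_as_tool(schema)

    def invoke(prompt: str) -> Any:
        reply = model(prompt, tools=[tool])
        # The single tool call's arguments are the structured output.
        call = reply["tool_calls"][0]
        return schema(**call["args"])

    return invoke


def fake_model(prompt: str, tools: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical stand-in that always 'calls' the only tool it was given."""
    return {"content": "", "tool_calls": [
        {"name": tools[0]["name"], "args": {"name": "Ada", "age": 36}, "id": "call_1"}]}


extract = structured_output(fake_model, Person)
person = extract("Ada is 36 years old.")
print(person)
```

The caller never sees the tool call itself, only the resulting object, which is the main practical difference from using bind_tools and reading tool_calls directly.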

So when should you use with_structured_output, and when should you bind tools and read the tool calls directly?

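with_structured_output always returns structured output in the schema you specified.
This is useful when you want to force the LLM to output information that matches a specific schema, for example in information extraction tasks.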

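bind_tools is more general: the model can select a specific tool, no tool, or multiple tools. This is useful when you want to give the LLM more flexibility in how it responds, for example in agentic applications where it needs to choose which tools to call but also respond to the user.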