Building an agent on LangChain 0.2.0 that calls an external tool to crop and align faces, with memory and multi-turn conversation via RunnableWithMessageHistory: detailed steps and explanation

lang.y

Published 2024-07-14 16:33:56

Tags: python, langchain

Article link: https://blog.csdn.net/yue_la/article/details/140417079

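Overview

The whole program is built on LangChain 0.2.0 and walks through several commonly used modules. Many community articles on LangChain target 0.1.0 and are somewhat out of date; the LangChain syntax in this article follows the official LangChain documentation. The project has five parts: tool, llm, prompt, agent, and memory. The tool part wraps FFHQFaceAlignment, with small changes to its input and output handling. You can change the function's parameters to suit your needs: create_tool_calling_agent supports multiple parameters and can extract several arguments from a user request. Multiple tools are also possible; only the tool module needs to change.

Result:

The agent works out which function and arguments my task requires, then calls the function with those arguments automatically. It has memory and supports multi-turn conversation. This article covers only the code the project uses; how each part of the LangChain framework works will be covered in separate articles. Corrections are welcome.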

Tools

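There are three ways to define a tool:

Using the @tool decorator -- the simplest way to define a custom tool.
Using the StructuredTool.from_function class method -- similar to the @tool decorator, but it allows more configuration and lets you specify both sync and async implementations.
Subclassing BaseTool -- the most flexible approach; it gives the most control at the cost of more work and more code.

Tool parameters:

Parameter	Type	Description
name	str	Must be unique within the set of tools provided to the LLM or agent.
description	str	Describes what the tool does. The LLM or agent uses it as context.
return_direct	boolean	Only relevant for agents. If True, the agent stops after calling the tool and returns the result directly to the user.
args_schema	Pydantic BaseModel	Optional but recommended. Can be used to provide more information (for example, few-shot examples) or to validate the expected arguments.

This article defines the tool by subclassing BaseTool.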

from langchain.pydantic_v1 import BaseModel, Field
class Input(BaseModel):
    input_dir: str = Field(description = 'image path')

The Input class, a BaseModel, declares and validates the arguments the tool expects.
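To make it clearer why the name, description, and argument schema matter, here is a rough sketch of the kind of tool definition a tool-calling chat model receives, written in the OpenAI chat-completions "tools" format. LangChain does this conversion itself; the helper `to_openai_tool_spec` below is a hypothetical hand-written stand-in, not a LangChain function.

```python
import json


def to_openai_tool_spec(name: str, description: str, properties: dict, required: list) -> dict:
    """Hypothetical helper: build a tool definition in the OpenAI "tools" format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            # The parameters block is a JSON Schema describing the arguments.
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


# Mirrors the Input schema above: one string field with a description.
spec = to_openai_tool_spec(
    name="image_crop_and_align",
    description="Crop and align images. Input: image path. Output: path of the aligned images.",
    properties={"input_dir": {"type": "string", "description": "image path"}},
    required=["input_dir"],
)
print(json.dumps(spec, indent=2))
```

The model only ever sees this description of the tool, which is why a vague description or a missing field description makes it harder for the agent to pick the tool and fill in its arguments.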

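import os
import os.path as osp  # os and osp are used inside _run
from typing import Type

from langchain.tools import BaseTool

class Image_Align(BaseTool):
    # name and description are shown to the LLM; the description guides tool selection
    name: str = 'image_crop_and_align'
    description: str = "use this tool when you need to crop and align images, the input is image path, the output is cropped and aligned image path"
    args_schema: Type[BaseModel] = Input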

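This defines the tool's name, description, and args_schema. Note that a multi-word name must not contain spaces. The description matters a great deal, because it affects whether and how the agent calls the tool.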
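The no-spaces rule can be checked before the tool ever reaches the model. The sketch below assumes the OpenAI function-naming rule (letters, digits, underscores, and dashes, at most 64 characters); other providers may apply different rules, and `is_valid_tool_name` is a hypothetical helper, not part of LangChain.

```python
import re

# Assumed rule (OpenAI function names): a-z, A-Z, 0-9, "_" and "-", 1 to 64 characters.
TOOL_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_tool_name(name: str) -> bool:
    """Return True if the whole name matches the assumed naming rule."""
    return TOOL_NAME_PATTERN.fullmatch(name) is not None


for candidate in ("image_crop_and_align", "image crop and align"):
    print(candidate, "->", is_valid_tool_name(candidate))
```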

    def _run(self, input_dir):
        transform_size = 256

        # Get input/output directories
        input_dir = osp.abspath(osp.expanduser(input_dir))
        output_dir = osp.join(osp.split(input_dir)[0], "{}_aligned".format(osp.split(input_dir)[1]))

        print(f"#. Create output directory: {output_dir}")
        os.makedirs(output_dir, exist_ok=True)

        # The rest of the alignment code is omitted here; see FFHQFaceAlignment.

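For the rest of the function, see the separate article on FFHQFaceAlignment, which covers it in detail. The function's input handling needs some changes, which the code above already shows.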
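The output directory is derived from the input path, so it helps to see what that derivation produces. This is a small standalone sketch of the same os.path logic; the function name `aligned_output_dir` and the example path are made up for illustration.

```python
import os.path as osp


def aligned_output_dir(input_dir: str) -> str:
    """Return the sibling directory '<name>_aligned' next to input_dir."""
    # abspath also strips a trailing separator, so 'faces/' and 'faces' behave the same.
    input_dir = osp.abspath(osp.expanduser(input_dir))
    parent, name = osp.split(input_dir)
    return osp.join(parent, "{}_aligned".format(name))


print(aligned_output_dir("/data/faces"))
print(aligned_output_dir("/data/faces/"))
```

Both calls resolve to the same directory, which sits beside the input directory and not inside it, so the tool never writes into the folder it is reading from.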

tools = [Image_Align()]

That completes the tool part.

LLM

from langchain_openai import ChatOpenAI
import os
os.environ['OPENAI_API_KEY']="your_api_key"
os.environ['OPENAI_API_BASE'] = "base_url"

llm = ChatOpenAI(model='gpt-3.5-turbo-0125')

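This uses a chat model. Fill in your own api_key; if you use a relay (proxy) API, you also need its base_url, which can be obtained on an e-commerce platform.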

Prompt

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ('system', 'You are a helpful assistant. If the user asks you to crop or align images, you must use the tool.'),
    MessagesPlaceholder(variable_name='chat_history'),
    ('human', '{user_input}'),
    MessagesPlaceholder(variable_name='agent_scratchpad')
])

There are two MessagesPlaceholder entries: the first holds the chat history, the second holds the agent's intermediate steps. Both are required.
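The position of each placeholder decides the order in which the model sees the messages. The following is a plain-Python stand-in for that ordering, not LangChain's own formatting code; `build_messages` and the sample messages are made up for illustration.

```python
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


def build_messages(
    chat_history: List[Message],
    user_input: str,
    agent_scratchpad: List[Message],
) -> List[Message]:
    """Assemble messages in the same order as the prompt template above."""
    system: Message = ("system", "you are a helpful assistant")
    # system -> earlier turns -> current question -> tool calls made so far
    return [system, *chat_history, ("human", user_input), *agent_scratchpad]


messages = build_messages(
    chat_history=[("human", "hello"), ("ai", "Hi, how can I help?")],
    user_input="align the faces in /data/faces",
    agent_scratchpad=[("ai", "call image_crop_and_align"), ("tool", "/data/faces_aligned")],
)
for role, content in messages:
    print(f"{role}: {content}")
```

Earlier turns sit before the current question, while the scratchpad comes after it, since tool calls and their results belong to the turn currently being answered.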

Agent

from langchain.agents import create_tool_calling_agent

agent = create_tool_calling_agent(llm, tools, prompt)

from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

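The agent only works out which function and arguments the user's request needs; it does not execute the function. Execution is handled by AgentExecutor. Setting verbose=True prints the agent's intermediate steps, which is the green part of the screenshot at the top.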
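The split between deciding and executing can be sketched in a few lines of plain Python. This is a simplified stand-in for the idea, not AgentExecutor's real implementation: `fake_planner` replaces the LLM-backed agent, and `fake_tools` replaces the real alignment tool.

```python
from typing import Callable, Dict, List, Tuple, Union

ToolCall = Tuple[str, dict]  # (tool name, keyword arguments)
Step = Tuple[ToolCall, str]  # (tool call, observation returned by the tool)


def fake_planner(user_input: str, scratchpad: List[Step]) -> Union[ToolCall, str]:
    """Stand-in for the agent: return a tool call, or a final answer string."""
    if not scratchpad:
        return ("image_crop_and_align", {"input_dir": user_input})
    _, observation = scratchpad[-1]
    return f"Done. Aligned images are in {observation}"


def run_executor(
    planner: Callable[[str, List[Step]], Union[ToolCall, str]],
    tools: Dict[str, Callable[..., str]],
    user_input: str,
    max_iterations: int = 5,
) -> str:
    """Stand-in for the executor: run requested tools until a final answer appears."""
    scratchpad: List[Step] = []
    for _ in range(max_iterations):
        decision = planner(user_input, scratchpad)
        if isinstance(decision, str):  # final answer, stop looping
            return decision
        name, args = decision
        observation = tools[name](**args)  # only the executor actually calls the tool
        scratchpad.append((decision, observation))
    return "Stopped: iteration limit reached."


fake_tools = {"image_crop_and_align": lambda input_dir: input_dir + "_aligned"}
answer = run_executor(fake_planner, fake_tools, "/data/faces")
print(answer)
```

The list called scratchpad here plays the role of the agent_scratchpad placeholder in the prompt: each tool call and its result is fed back to the planner on the next round.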

Memory

from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from typing import List
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.messages import BaseMessage

store = {}

class InMemoryHistory(BaseChatMessageHistory, BaseModel):
    """In memory implementation of chat message history."""

    messages: List[BaseMessage] = Field(default_factory=list)

    def add_messages(self, messages: List[BaseMessage]) -> None:
        """Add a list of messages to the store"""
        self.messages.extend(messages)

    def clear(self) -> None:
        self.messages = []

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryHistory()
    return store[session_id]

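This is the key to the memory feature. It first defines an InMemoryHistory class with a messages list and two methods: one appends messages to the list, the other clears it. It then defines the function get_session_history, which returns the InMemoryHistory for a given session_id, creating it in store the first time that session_id is seen.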
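The get-or-create behavior is easy to verify without LangChain. In this sketch, `SimpleHistory`, `sketch_store`, and `get_history` are illustrative stand-ins for InMemoryHistory, store, and get_session_history, and plain strings stand in for message objects.

```python
from typing import Dict, List


class SimpleHistory:
    """Stand-in for InMemoryHistory, holding plain strings."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def add_messages(self, messages: List[str]) -> None:
        self.messages.extend(messages)

    def clear(self) -> None:
        self.messages = []


sketch_store: Dict[str, SimpleHistory] = {}


def get_history(session_id: str) -> SimpleHistory:
    # Create the history on first use, then keep returning the same object.
    if session_id not in sketch_store:
        sketch_store[session_id] = SimpleHistory()
    return sketch_store[session_id]


get_history("1").add_messages(["hello", "hi there"])
get_history("2").add_messages(["a different conversation"])
print(get_history("1").messages)
print(get_history("2").messages)
```

Because the same object comes back for the same session_id, messages accumulate across turns, and different session ids never see each other's messages. Everything lives in a Python dict, so the history is lost when the process exits.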

agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    get_session_history,
    input_messages_key="user_input",
    history_messages_key="chat_history",
)

session_id = '1'

def run():
    while True:
        user_input = input('user:')
        if user_input.lower() in ['exit','quit', 'over']:
            break

        response = agent_with_chat_history.invoke(
            {'user_input': user_input},
            config={"configurable": {"session_id": session_id}},
        )
        print(f"ai  : {response['output']}")

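RunnableWithMessageHistory:

runnable:

In the RunnableWithMessageHistory class, the runnable parameter is the underlying runnable object. It must satisfy the following input and output requirements:

Input:

A sequence of messages: a sequence of BaseMessage objects.
A dictionary:
- with a single key holding all messages, or
- with one key holding the current input as a string or message(s) and another key holding the history messages. If the input key points to a string, it is treated as a HumanMessage in the history.

Output:

A string: it can be treated as an AIMessage.
A BaseMessage or a sequence of messages.
A dictionary: with a key holding a BaseMessage or a sequence of messages.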

get_session_history:

get_session_history is a function that takes a session_id and returns the BaseChatMessageHistory instance for that session.
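Put together, the wrapper's job is roughly: load the session's history, pass it to the runnable under the history key, then record the new human input and the reply. The sketch below imitates that flow in plain Python under those assumptions; it is not the library's implementation, and `with_history`, `get_history`, and `echo_runnable` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)
histories: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return histories.setdefault(session_id, [])


def with_history(
    runnable: Callable[[dict], dict],
    get_history_fn: Callable[[str], List[Message]],
    input_key: str = "user_input",
    history_key: str = "chat_history",
) -> Callable[..., dict]:
    """Wrap runnable so each call sees earlier turns and records the new one."""

    def invoke(inputs: dict, session_id: str) -> dict:
        history = get_history_fn(session_id)
        # Inject a copy of the history under the key the prompt expects.
        result = runnable({**inputs, history_key: list(history)})
        # A string input is stored as a human message, the output as an AI message.
        history.append(("human", inputs[input_key]))
        history.append(("ai", result["output"]))
        return result

    return invoke


def echo_runnable(inputs: dict) -> dict:
    """Stand-in for agent_executor: report how much history it received."""
    return {"output": f"{len(inputs['chat_history'])} earlier messages"}


chat = with_history(echo_runnable, get_history)
first = chat({"user_input": "hi"}, session_id="1")
second = chat({"user_input": "again"}, session_id="1")
other = chat({"user_input": "hi"}, session_id="2")
print(first["output"], "|", second["output"], "|", other["output"])
```

The input_key and history_key defaults match the input_messages_key and history_messages_key values passed to RunnableWithMessageHistory above, which in turn match the variable names in the prompt.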

The details are left for a separate article.

Getting the code

The code is on GitHub: yuelang222/crop-and-align-face-image-agent: An agent can crop and align face images, which is implemented based on LangChain 0.2.0 (github.com)
