LangChain-v0.2文档翻译：2.6、教程-构建一个会话式RAG应用程序

Hugo_Hoo

已于 2024-05-31 16:34:50 修改

阅读量110

点赞数

分类专栏： LangChain-v0.2文档翻译文章标签： langchain python 数据库

于 2024-05-30 17:38:15 首次发布

原文链接：https://python.langchain.com/v0.2/docs/tutorials/qa_chat_history/

版权

LangChain-v0.2文档翻译专栏收录该内容

30 篇文章 5 订阅

订阅专栏

介绍
教程
2.1. 构建一个简单的 LLM 应用程序
2.2. 构建一个聊天机器人
2.3. 构建向量存储库和检索器
2.4. 构建一个代理
2.5. 构建检索增强生成 (RAG) 应用程序
2.6. 构建一个会话式RAG应用程序（点击查看原文）

在许多问答应用中，我们希望允许用户进行来回的对话，这意味着应用程序需要对过去的问题和答案有一些“记忆”，并且需要一些逻辑来将这些记忆融入到当前的思考中。

本文将重点介绍如何添加整合历史消息的逻辑。关于聊天历史管理的更多细节在这里有介绍。

我们将介绍两种方法：

链式调用(Chains)，我们总是执行一个检索步骤；
代理(Agents)，我们给大型语言模型(LLM)决定是否以及如何执行检索步骤（或多个步骤）的自由。

对于外部知识源，我们将使用由Lilian Weng在RAG教程中写的关于LLM驱动的自治代理的博客文章。

设置

依赖项

我们将在本教程中使用OpenAI嵌入和Chroma向量存储，但这里展示的所有内容都适用于任何Embeddings， VectorStore 或 Retriever。

我们将使用以下包：

%pip install --upgrade --quiet langchain langchain-community langchainhub langchain-chroma bs4

我们需要设置环境变量OPENAI_API_KEY，这可以直接完成，也可以从.env文件中加载：

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

# import dotenv

# dotenv.load_dotenv()

LangSmith

使用LangChain构建的许多应用程序将包含多个步骤，多次调用LLM。随着这些应用程序变得越来越复杂，能够检查链或代理内部的确切情况变得至关重要。做到这一点的最佳方式是使用LangSmith。

注意LangSmith不是必需的，但它很有帮助。如果你想使用LangSmith，在上述链接注册后，请确保设置你的环境变量以开始记录跟踪：

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

链式调用

让我们首先回顾一下我们在RAG教程中构建的基于LLM驱动的自治代理的博客文章的问答应用。

OpenAI

pip install -qU langchain-openai

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")

import bs4
from langchain import hub
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. 加载、分块并索引博客内容，创建一个检索器。
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# 2. 将检索器整合到一个问答链中。
system_prompt =
    ("You are an assistant for question-answering tasks. "
     "Use the following pieces of retrieved context to answer "
     "the question. If you don't know the answer, say that you "
     "don't know. Use three sentences maximum and keep the "
     "answer concise."
     "\n\n"

     "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

response = rag_chain.invoke({"input": "What is Task Decomposition?"})
response["answer"]

'Task decomposition involves breaking down complex tasks into smaller and simpler steps to make them more manageable. This process can be achieved through techniques like Chain of Thought (CoT) or Tree of Thoughts, which help agents plan and execute tasks effectively by dividing them into sequential subgoals. Task decomposition can be facilitated by using prompting techniques, task-specific instructions, or human inputs to guide the agent through the steps required to accomplish a task.'

注意我们使用了内置的链式构造函数create_stuff_documents_chain和create_retrieval_chain，因此我们解决方案的基本要素是：

检索器（retriever）；
提示（prompt）；
大型语言模型(LLM)。

这将简化整合聊天历史的流程。

添加聊天历史

我们构建的链直接使用输入查询来检索相关上下文。但在对话环境中，用户查询可能需要对话上下文才能被理解。例如，考虑以下交流：

人类：“什么是任务分解？”

AI：“任务分解涉及将复杂任务分解为更小、更简单的步骤，使其更容易管理。”

人类：“它的常见的方法是什么？”

为了回答第二个问题，我们的系统需要理解“它”指的是“任务分解”。

我们需要更新我们现有应用的两件事：

提示：更新我们的提示以支持历史消息作为输入。
上下文化问题：添加一个子链，它采用最新用户问题和聊天历史，并在它引用历史信息中的任何信息时重新表述问题。这可以被简单地认为是构建一个新的“历史感知”检索器。而之前我们有：
- query -> retriever
现在我们将有：
- (query, conversation history) -> LLM -> rephrased query -> retriever

上下文化问题

首先，我们需要定义一个子链，它接收历史消息和最新用户问题，并在它引用历史信息中的任何信息时重新表述问题。

我们将使用一个提示，其中包含一个名为“chat_history”的MessagesPlaceholder变量。这允许我们使用“chat_history”输入键将消息列表传递给提示，并且这些消息将在系统消息之后和包含最新问题的人类消息之前插入。

注意我们利用了一个辅助函数create_history_aware_retriever来进行这一步，它管理chat_history为空的情况，否则按顺序应用prompt | llm | StrOutputParser() | retriever。

create_history_aware_retriever构建了一个接受input和chat_history作为输入的链，并具有与检索器相同的输出模式。

from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

contextualize_q_system_prompt =
    ("Given a chat history and the latest user question "
     "which might reference context in the chat history, "
     "formulate a standalone question which can be understood "
     "without the chat history. Do NOT answer the question, "
     "just reformulate it if needed and otherwise return it as is."
)

contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)

这个链将输入查询的重述预处理我们的检索器，以便检索过程融入了对话的上下文。

现在我们可以构建我们完整的问答链。这就像更新检索器为我们新的history_aware_retriever一样简单。

同样，我们将使用create_stuff_documents_chain来生成一个question_answer_chain，输入键context、chat_history和input—它接受检索到的上下文以及对话历史和查询来生成答案。

我们使用create_retrieval_chain构建我们最后的rag_chain。这个链按顺序应用history_aware_retriever和question_answer_chain，保留检索到的上下文等中间输出，以方便使用。它有输入键input和chat_history，并在其输出中包括input、chat_history、context和answer。

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
   ]
)

question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

让我们尝试一下。下面我们问一个问题和一个需要上下文化才能返回合理响应的后续问题。因为我们的链包括一个"chat_history"输入，调用者需要管理聊天历史。我们可以通过将输入和输出消息追加到列表中来实现这一点：

from langchain_core.messages import AIMessage, HumanMessage

chat_history = []

question = "What is Task Decomposition?"
ai_msg_1 = rag_chain.invoke({"input": question, "chat_history": chat_history})
chat_history.extend(
    [
        HumanMessage(content=question),
        AIMessage(content=ai_msg_1["answer"]),
    ]
)

second_question = "What are common ways of doing it?"
ai_msg_2 = rag_chain.invoke({"input": second_question, "chat_history": chat_history)

print(ai_msg_2["answer"])

Task decomposition can be done in several common ways, such as using Language Model (LLM) with simple prompting like "Steps for XYZ" or asking for subgoals to achieve a specific task. Task-specific instructions can also be provided, like requesting a story outline for writing a novel. Additionally, human inputs can be utilized to decompose tasks into smaller components effectively.

聊天历史的有状态管理

我们已经讨论了如何添加应用程序逻辑来整合历史输出，但我们仍然在手动更新聊天历史并在每个输入中插入它。在真正的问答应用中，我们将需要一种持久化聊天历史的方式，以及一种自动插入和更新它的方式。

为此，我们可以使用：

BaseChatMessageHistory：存储聊天历史。
RunnableWithMessageHistory：LCEL链和BaseChatMessageHistory的包装器，它处理将聊天历史注入输入并在每次调用后更新它。

有关如何使用这些类共同创建有状态的对话链的详细演示，请访问如何添加消息历史(记忆)LCEL页面。

下面，我们实现了第二个选项的简单示例，其中聊天历史存储在简单的字典中。LangChain通过Redis和其他技术管理记忆集成，以提供更强大持久化。

RunnableWithMessageHistory的实例为您管理聊天历史。它们接受一个配置，其中包含一个键(默认为"session_id")，该键指定要获取和预处理到输入中的对话历史，并将输出追加到相同的对话历史。下面是一个示例：

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)

conversational_rag_chain.invoke(
    {"input": "What is Task Decomposition?"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # 在`store`中构建一个键"abc123"。
)["answer"]

'Task decomposition involves breaking down complex tasks into smaller and simpler steps to make them more manageable for an agent or model. This process helps in guiding the agent through the various subgoals required to achieve the overall task efficiently. Different techniques like Chain of Thought and Tree of Thoughts can be used to decompose tasks into manageable components.'

conversational_rag_chain.invoke(
    {"input": "What are common ways of doing it?"},
    config={"configurable": {"session_id": "abc123"}},
)["answer"]

'Task decomposition can be achieved through various methods such as using prompting techniques like "Steps for XYZ" to guide the model through subgoals, providing task-specific instructions like "Write a story outline" for specific tasks, or incorporating human inputs to break down complex tasks. These approaches help in dividing a large task into smaller, more manageable components for better understanding and execution.'

可以在store字典中检查对话历史：

for message in store["abc123"].messages:
    if isinstance(message, AIMessage):
        prefix = "AI"
    else:
        prefix = "User"

    print(f"{prefix}: {message.content}\n")

User: What is Task Decomposition?

AI: Task decomposition involves breaking down complex tasks into smaller and simpler steps to make them more manageable for an agent or model. This process helps in guiding the agent through the various subgoals required to achieve the overall task efficiently. Different techniques like Chain of Thought and Tree of Thoughts can be used to decompose tasks into manageable components.

User: What are common ways of doing it?

AI: Task decomposition can be achieved through various methods such as using prompting techniques like "Steps for XYZ" to guide the model through subgoals, providing task-specific instructions like "Write a story outline" for specific tasks, or incorporating human inputs to break down complex tasks. These approaches help in dividing a large task into smaller, more manageable components for better understanding and execution.

整合在一起

在这里插入图片描述

为了方便，我们将所有必要的步骤整合在一个代码单元格中：

import bs4
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

### Construct retriever ###
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

### Contextualize question ###
contextualize_q_system_prompt =
    ("Given a chat history and the latest user question "
     "which might reference context in the chat history, "
     "formulate a standalone question which can be understood "
     "without the chat history. Do NOT answer the question, "
     "just reformulate it if needed and otherwise return it as is."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)

### Answer question ###
system_prompt =
    ("You are an assistant for question-answering tasks. "
     "Use the following pieces of retrieved context to answer "
     "the question. If you don't know the answer, say that you "
     "don't know. Use three sentences maximum and keep the "
     "answer concise."
     "\n\n"

     "{context}"
)
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

### Statefully manage chat history ###
store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)

conversational_rag_chain.invoke(
    {"input": "What is Task Decomposition?"},
    config={
        "configurable": {"session_id": "abc123"}
    },  # 在`store`中构建一个键"abc123"。
)["answer"]

'Task decomposition involves breaking down a complex task into smaller and simpler steps to make it more manageable. This process helps agents or models tackle difficult tasks by dividing them into more easily achievable subgoals. Task decomposition can be done through techniques like Chain of Thought or Tree of Thoughts, which guide the model in thinking step by step or exploring multiple reasoning possibilities at each step.'

conversational_rag_chain.invoke(
    {"input": "What are common ways of doing it?"},
    config={"configurable": {"session_id": "abc123"}},
)["answer"]

"Common ways of task decomposition include using techniques like Chain of Thought (CoT) or Tree of Thoughts to guide models in breaking down complex tasks into smaller steps. This can be achieved through simple prompting with LLMs, task-specific instructions, or human inputs to help the model understand and navigate the task effectively. Task decomposition aims to enhance model performance on complex tasks by utilizing more test-time computation and shedding light on the model's thinking process."

代理

代理利用LLM的推理能力在执行期间做出决策。使用代理允许您将检索过程的一些自由裁量权转移出去。尽管它们的行为不如链式调用可预测，但在此背景下它们提供了一些优势：

代理直接为检索器生成输入，而无需我们明确构建上下文化，就像我们上面所做的那样；
代理可以执行多个检索步骤来服务于查询，或者根本不执行检索步骤（例如，回应用户的一般问候）。

检索工具

代理可以访问“工具”并管理其执行。在这种情况下，我们将把我们的检索器转换为LangChain工具，由代理来使用：

from langchain.tools.retriever import create_retriever_tool

tool = create_retriever_tool(
    retriever,
    "blog_post_retriever",
    "Searches and returns excerpts from the Autonomous Agents blog post.",
)
tools = [tool]

工具是LangChain Runnables，并实现通常的接口：

tool.invoke("task decomposition")

'Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.

Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Fig. 1. Overview of a LLM-powered autonomous agent system.

Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.

Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

(3) Task execution: Expert models execute on the specific tasks and log results.

Instruction:
With the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.

Fig. 11. Illustration of how HuggingGPT works. (Image source: Shen et al. 2023)

The system comprises of 4 stages:
(1) Task planning: LLM works as the brain and parses the user requests into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning.

Instruction:'

代理构造器

现在我们已经定义了工具和LLM，我们可以创建代理。我们将使用LangGraph来构建代理。
目前我们正在使用一个高级接口来构建代理，但LangGraph的好处在于这个高级接口背后有一个低级、高度可控的API，以防你想要修改代理逻辑。

from langgraph.prebuilt import chat_agent_executor

agent_executor = chat_agent_executor.create_tool_calling_executor(llm, tools)

我们现在可以尝试一下。注意，到目前为止它还不是有状态的（我们仍然需要添加内存）

query = "What is Task Decomposition?"

for s in agent_executor.stream(
    {"messages": [HumanMessage(content=query)]},
):
    print(s)
    print("----")

{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_demTlnha4vYA1IH6CByYupBQ', 'function': {'arguments': '{"query":"Task Decomposition"}', 'name': 'blog_post_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 68, 'total_tokens': 87}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d1c3f3da-be18-46a5-b3a8-4621ba1f7f2a-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'Task Decomposition'}, 'id': 'call_demTlnha4vYA1IH6CByYupBQ'}])]}}]}

----


{'action': {'messages': [ToolMessage(content='Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#

A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.

Task Decomposition#

Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.

Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

(3) Task execution: Expert models execute on the specific tasks and log results.

Instruction:

With the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.

Fig. 11. Illustration of how HuggingGPT works. (Image source: Shen et al. 2023)

The system comprises of 4 stages:

(1) Task planning: LLM works as the brain and parses the user requests
into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning.

Instruction:', name='blog_post_retriever', id='e83e4002-33d2-46ff-82f4-fddb3035fb6a', tool_call_id='call_demTlnha4vYA1IH6CByYupBQ')]}]}

----

{'agent': {'messages': [AIMessage(content='Task decomposition is a technique used in autonomous agent systems to break down complex tasks into smaller and simpler steps. This approach helps agents better understand and plan for the various steps involved in completing a task. One common method for task decomposition is the Chain of Thought (CoT) technique, where models are prompted to "think step by step" to decompose hard tasks into manageable steps. Another approach, known as Tree of Thoughts, extends CoT by exploring multiple reasoning possibilities at each step and creating a tree structure of tasks.

Task decomposition can be achieved through various methods, such as using simple prompts for language models, task-specific instructions, or human inputs. By breaking down tasks into smaller components, agents can effectively plan and execute tasks with greater efficiency.

In summary, task decomposition is a valuable strategy for autonomous agents to tackle complex tasks by breaking them down into smaller, more manageable steps.', response_metadata={'token_usage': {'completion_tokens': 177, 'prompt_tokens': 588, 'total_tokens': 765}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None}, id='run-808f32b9-ae61-4f31-a55a-f30643594282-0')]}]}

----

LangGraph自带持久性，所以我们不需要使用ChatMessageHistory！相反，我们可以直接将一个检查点器传递给我们的LangGraph代理

from langgraph.checkpoint.sqlite import SqliteSaver

memory = SqliteSaver.from_conn_string(":memory:")

agent_executor = chat_agent_executor.create_tool_calling_executor(
    llm, tools, checkpointer=memory
)

这就是构建一个对话式RAG代理所需的一切。

让我们观察它的行为。注意，如果我们输入一个不需要检索步骤的查询，代理就不会执行检索步骤：

config = {"configurable": {"thread_id": "abc123"}}

for s in agent_executor.stream(
    {"messages": [HumanMessage(content="Hi! I'm bob")]}, config=config
):
    print(s)
    print("----")

{'agent': {'messages': [AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 67, 'total_tokens': 78}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None}, id='run-1451e59b-b135-4776-985d-4759338ffee5-0')]}]}

----

此外，如果我们输入一个需要检索步骤的查询，代理会为工具生成输入：

query = "What is Task Decomposition?"

for s in agent_executor.stream(
    {"messages": [HumanMessage(content=query)]}, config=config
):
    print(s)
    print("----")

{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ab2x4iUPSWDAHS5txL7PspSK', 'function': {'arguments': '{"query":"Task Decomposition"}', 'name': 'blog_post_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 91, 'total_tokens': 110}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-f76b5813-b41c-4d0d-9ed2-667b988d885e-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'Task Decomposition'}, 'id': 'call_ab2x4iUPSWDAHS5txL7PspSK'}])]}}]}

----


{'action': {'messages': [ToolMessage(content='Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#

A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.

Task Decomposition#

Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.

Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

(3) Task execution: Expert models execute on the specific tasks and log results.

Instruction:

With the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.', name='blog_post_retriever', id='e0895fa5-5d41-4be0-98db-10a83d42fc2f', tool_call_id='call_ab2x4iUPSWDAHS5txL7PspSK')]}]}

----


{'agent': {'messages': [AIMessage(content='Task decomposition is a technique used in complex tasks where the task is broken down into smaller and simpler steps. This approach helps in managing and solving difficult tasks by dividing them into more manageable components. One common method for task decomposition is the Chain of Thought (CoT) technique, which prompts the model to think step by step and decompose hard tasks into smaller steps. Another extension of CoT is the Tree of Thoughts, which explores multiple reasoning possibilities at each step by creating a tree structure of thought steps.

Task decomposition can be achieved through various methods, such as using language models with simple prompting, task-specific instructions, or human inputs. By breaking down tasks into smaller components, agents can better plan and execute complex tasks effectively.

If you would like more detailed information or examples related to task decomposition, feel free to ask!', response_metadata={'token_usage': {'completion_tokens': 165, 'prompt_tokens': 611, 'total_tokens': 776}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None}, id='run-13296566-8577-4d65-982b-a39718988ca3-0')]}]}

----

上面，代理没有把我们的查询逐字插入工具中，而是删除了“什么”和“是”等不必要的词。

同样的原理允许代理在必要时使用对话的上下文：

query = "What according to the blog post are common ways of doing it? redo the search"

for s in agent_executor.stream(
    {"messages": [HumanMessage(content=query)]}, config=config
):
    print(s)
    print("----")

{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_KvoiamnLfGEzMeEMlV3u0TJ7', 'function': {'arguments': '{"query":"common ways of task decomposition"}', 'name': 'blog_post_retriever'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 930, 'total_tokens': 951}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-dd842071-6dbd-4b68-8657-892eaca58638-0', tool_calls=[{'name': 'blog_post_retriever', 'args': {'query': 'common ways of task decomposition'}, 'id': 'call_KvoiamnLfGEzMeEMlV3u0TJ7'}])]}}]}

----


{'action': {'messages': [ToolMessage(content='Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.

Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.

Fig. 1. Overview of a LLM-powered autonomous agent system.

Component One: Planning#

A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.

Task Decomposition#

Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.

Resources:

1. Internet access for searches and information gathering.

2. Long Term memory management.

3. GPT-3.5 powered Agents for delegation of simple tasks.

4. File output.

Performance Evaluation:

1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.

2. Constructively self-criticize your big-picture behavior constantly.

3. Reflect on past decisions and strategies to refine your approach.

4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.

(3) Task execution: Expert models execute on the specific tasks and log results.

Instruction:

With the input and the inference results, the AI assistant needs to describe the process and results. The previous stages can be formed as - User Input: {{ User Input }}, Task Planning: {{ Tasks }}, Model Selection: {{ Model Assignment }}, Task Execution: {{ Predictions }}. You must first answer the user's request in a straightforward manner. Then describe the task process and show your analysis and model inference results to the user in the first person. If inference results contain a file path, must tell the user the complete file path.', name='blog_post_retriever', id='c749bb8e-c8e0-4fa3-bc11-3e2e0651880b', tool_call_id='call_KvoiamnLfGEzMeEMlV3u0TJ7')]}]}

----


{'agent': {'messages': [AIMessage(content='According to the blog post, common ways of task decomposition include:

1. Using language models with simple prompting like "Steps for XYZ" or "What are the subgoals for achieving XYZ?"

2. Utilizing task-specific instructions, for example, using "Write a story outline" for writing a novel.

3. Involving human inputs in the task decomposition process.

These methods help in breaking down complex tasks into smaller and more manageable steps, facilitating better planning and execution of the overall task.', response_metadata={'token_usage': {'completion_tokens': 100, 'prompt_tokens': 1475, 'total_tokens': 1575}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None}, id='run-98b765b3-f1a6-4c9a-ad0f-2db7950b900f-0')]}]}

----

请注意，代理能够推断出我们查询中的“它”指的是“任务分解”，并生成了一个合理的搜索查询 - 在本例中为“任务分解的常见方法”。

整合在一起

为了方便，我们将所有必要的步骤整合在一个代码单元格中：

import bs4
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools.retriever import create_retriever_tool
from langchain_chroma import Chroma
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langgraph.checkpoint.sqlite import SqliteSaver

memory = SqliteSaver.from_conn_string(":memory:")
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

### Construct retriever ###
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

### Build retriever tool ###
tool = create_retriever_tool(
    retriever,
    "blog_post_retriever",
    "Searches and returns excerpts from the Autonomous Agents blog post.",
)
tools = [tool]

agent_executor = chat_agent_executor.create_tool_calling_executor(
    llm, tools, checkpointer=memory
)

下一步

我们已经介绍了构建基本对话式问答应用的步骤：

我们使用链式调用构建了一个可预测的应用，它为每个用户输入生成搜索查询；
我们使用代理构建了一个“决定”何时以及如何生成搜索查询的应用。

要探索不同类型的检索器和检索策略，请访问指南中的检索器部分。

有关LangChain对话记忆抽象的详细演示，请访问如何添加消息历史(记忆)LCEL页面。

要了解更多关于代理的信息，请前往代理模块。

Hugo_Hoo

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
LangChain-v0.2文档翻译：2.6、教程-构建一个会话式RAG应用程序

在许多问答应用中，我们希望允许用户进行来回的对话，这意味着应用程序需要对过去的问题和答案有一些“记忆”，并且需要一些逻辑来将这些记忆融入到当前的思考中。本文将重点介绍。关于聊天历史管理的更多细节在这里有介绍。对于外部知识源，我们将使用由Lilian Weng在RAG教程中写的关于LLM驱动的自治代理的博客文章。
复制链接

扫一扫