A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:
- Simply stuffing previous messages into a chat model prompt.
- The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.
- More complex modifications, such as synthesizing summaries for long-running conversations.
Setup
```python
%pip install --upgrade --quiet langchain==0.1.20 langchain-google-genai
```
```python
from langchain_google_genai import ChatGoogleGenerativeAI

# API_KEY is assumed to be set to your Google AI API key
chat = ChatGoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=API_KEY)
```
Message Passing
The simplest form of memory is just passing chat history messages into a chain. Here's an example:
```python
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat

response = chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)
response
```
```
AIMessage(content='I said "J\'adore la programmation" which is the French translation for "I love programming". \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-88045024-xxx-xxx-xxx-35fe7f321bbe-0')
```
As you can see, by passing the previous conversation into the chain, the chatbot can use it as context when answering questions. This is the basic concept underlying chatbot memory.
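To carry a conversation forward with this pattern, append each exchange to the message list before the next call. A minimal sketch (the follow-up question here is our own):

```python
messages = [
    HumanMessage(
        content="Translate this sentence from English to French: I love programming."
    ),
]
response = chain.invoke({"messages": messages})

# Append the model's reply, then the next user turn, and invoke again
messages.append(response)
messages.append(HumanMessage(content="Now translate it into German."))
response = chain.invoke({"messages": messages})
```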
Chat Message History
Storing and passing messages around directly as an array works perfectly well, but we can also use LangChain's built-in message history classes to store and load messages.
```python
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)
demo_ephemeral_chat_history.add_ai_message("J'adore la programmation.")

demo_ephemeral_chat_history
```
```
InMemoryChatMessageHistory(messages=[HumanMessage(content='Translate this sentence from English to French: I love programming.'), AIMessage(content="J'adore la programmation.")])
```
We can use it directly to store conversation turns for our chain:
```python
demo_ephemeral_chat_history = ChatMessageHistory()

input1 = "Translate this sentence from English to French: I love programming."
demo_ephemeral_chat_history.add_user_message(input1)

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)
response
```
```
AIMessage(content="J'adore programmer. \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-5b427786-xxx-xxx-xxx-c6e6137c4df5-0')
```
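From here we can record the reply in the history and keep the conversation going; a short sketch continuing the example above (the follow-up question is our own):

```python
# Store the AI's reply, then add a follow-up question
demo_ephemeral_chat_history.add_ai_message(response)
demo_ephemeral_chat_history.add_user_message("What did you just say?")

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)
response
```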
Memory Types
LangChain includes the following memory types, which are usually used in combination with LLMs.
Conversation Buffer
ConversationBufferMemory is an extremely simple form of memory: all it does is keep chat messages in memory and feed them into a prompt template.
```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

memory
```
```
ConversationBufferMemory(chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='hi!'), AIMessage(content="what's up?")]))
```
Another way to use it:
```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.load_memory_variables({})
```
```
{'history': 'Human: hi\nAI: whats up'}
```
Notice in this example that load_memory_variables returns a single key named history. This means your chain (and likely your prompt) expects an input named history. Generally, you can control this variable via parameters on the memory class. For example, if you want the memory variables to be returned under the key chat_history, you can do:
```python
memory = ConversationBufferMemory(memory_key="chat_history")
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")
memory.load_memory_variables({})
```
```
{'chat_history': "Human: hi!\nAI: what's up?"}
```
Beyond that, one of the most common ways memory is consumed is as a list of chat messages. The messages can be returned either condensed into a single string (useful when they will be passed into LLMs) or as a list of chat messages (useful when they will be passed into ChatModels). By default they are returned as a single string; to get a list of messages instead, set return_messages=True.
```python
memory = ConversationBufferMemory(return_messages=True)
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")
memory.load_memory_variables({})
```
```
{'history': [HumanMessage(content='hi!'), AIMessage(content="what's up?")]}
```
Here is an example of using it together with a chain:
```python
from langchain_google_genai import GoogleGenerativeAI
from langchain.chains import ConversationChain

llm = GoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=API_KEY)

conversation = ConversationChain(
    llm=llm,
    verbose=True,
    memory=ConversationBufferMemory()
)
conversation.predict(input="Hi there!")
```
Output omitted.
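ConversationChain's default prompt expects the memory under the history key. To drive a chat-style prompt instead, a memory created with return_messages=True and a matching memory_key can feed a MessagesPlaceholder. A sketch under those assumptions (the prompt text is our own):

```python
from langchain.chains import LLMChain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# memory_key must match the placeholder's variable_name
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)
chain_with_memory = LLMChain(llm=chat, prompt=prompt, memory=memory)
chain_with_memory.invoke({"input": "Hi there!"})
```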
Conversation Buffer Window
ConversationBufferWindowMemory keeps a list of the conversation's interactions over time, retaining only the last K. This yields a sliding window over the most recent interactions, which keeps the buffer from growing too large.
```python
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})
```
```
{'history': 'Human: not much you\nAI: not much'}
```
```python
memory = ConversationBufferWindowMemory(k=1, return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})
```
```
{'history': [HumanMessage(content='not much you'), AIMessage(content='not much')]}
```
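Using it in a chain works the same as with the plain buffer; only the memory class changes. A sketch reusing the llm defined earlier:

```python
conversation_with_window = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=2),  # keep only the last 2 exchanges
    verbose=True,
)
conversation_with_window.predict(input="Hi there!")
```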
Entity
Entity memory remembers established facts about specific entities in a conversation. It extracts information about entities (using an LLM) and builds up its knowledge about those entities over time (also using an LLM).
```python
from langchain.memory import ConversationEntityMemory

memory = ConversationEntityMemory(llm=llm)
_input = {"input": "Deven & Sam are working on a hackathon project"}
memory.load_memory_variables(_input)
memory
```
```
ConversationEntityMemory(llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), entity_cache=['Deven', 'Sam'])
```
```python
memory.save_context(
    _input,
    {"output": " That sounds like a great project! What kind of project are they working on?"}
)
memory
```
```
ConversationEntityMemory(chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='Deven & Sam are working on a hackathon project'), AIMessage(content=' That sounds like a great project! What kind of project are they working on?')]), llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), entity_cache=['Deven', 'Sam'], entity_store=InMemoryEntityStore(store={'Deven': 'Updated summary: Deven is working on a hackathon project with Sam.', 'Sam': 'Updated summary: Sam is working on a hackathon project with Deven.'}))
```
```python
memory.load_memory_variables({"input": 'who is Sam'})
```
```
{'history': 'Human: Deven & Sam are working on a hackathon project\nAI: That sounds like a great project! What kind of project are they working on?',
 'entities': {'Sam': 'Updated summary: Sam is working on a hackathon project with Deven.'}}
```
```python
memory = ConversationEntityMemory(llm=llm, return_messages=True)
_input = {"input": "Deven & Sam are working on a hackathon project"}
memory.load_memory_variables(_input)
memory.save_context(
    _input,
    {"output": " That sounds like a great project! What kind of project are they working on?"}
)
memory.load_memory_variables({"input": 'who is Sam'})
```
```
WARNING:langchain_core.language_models.llms:Retrying langchain_google_genai.llms._completion_with_retry.<locals>._completion_with_retry in 10.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..

{'history': [HumanMessage(content='Deven & Sam are working on a hackathon project'),
  AIMessage(content=' That sounds like a great project! What kind of project are they working on?')],
 'entities': {'Sam': 'Updated summary:\nSam is working on a hackathon project with Deven.'}}
```
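To use entity memory inside a chain, LangChain provides a ready-made prompt that exposes both the history and entities variables. A sketch reusing the llm from above:

```python
from langchain.chains import ConversationChain
from langchain.chains.conversation.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE

conversation = ConversationChain(
    llm=llm,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
    memory=ConversationEntityMemory(llm=llm),
)
conversation.predict(input="Deven & Sam are working on a hackathon project")
```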
Conversation Knowledge Graph
This type of memory uses a knowledge graph to recreate memory.
```python
from langchain.memory import ConversationKGMemory

memory = ConversationKGMemory(llm=llm)
memory.save_context({"input": "say hi to sam"}, {"output": "who is sam"})
memory.save_context({"input": "sam is a friend"}, {"output": "okay"})
memory.load_memory_variables({"input": "who is sam"})
```
```
{'history': 'On Sam: Sam is a friend.'}
```
We can also, in a more modular fashion, get the current entities from a new message (using previous messages as context).
```python
memory.get_current_entities("what's Sams favorite color?")
```

```
['Sams']
```
Likewise, we can extract knowledge triplets from a new message in a modular fashion (again using previous messages as context).
```python
memory.get_knowledge_triplets("her favorite color is red")
```

```
[KnowledgeTriple(subject='her', predicate='favorite color', object_='red')]
```
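ConversationKGMemory can also back a chain. Since it returns its graph facts under the history key, any prompt with {history} and {input} slots will do; the prompt text below is our own:

```python
from langchain.chains import ConversationChain
from langchain_core.prompts import PromptTemplate

# Hypothetical prompt; the memory fills {history} with facts from the graph
template = """You are a helpful assistant. If relevant, use the following
information extracted from the conversation so far:

{history}

Human: {input}
AI:"""

conversation = ConversationChain(
    llm=llm,
    prompt=PromptTemplate(input_variables=["history", "input"], template=template),
    memory=ConversationKGMemory(llm=llm),
)
conversation.predict(input="My friend Sam likes red. Say hi to Sam!")
```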
Conversation Summary
Now let's look at a slightly more complex memory type: ConversationSummaryMemory. This type of memory creates a summary of the conversation over time, which is useful for condensing information as the conversation accumulates. It summarizes the conversation as it happens and stores the current summary in memory, which can then be injected into a prompt/chain as a summary of the conversation so far. This memory is most useful for long conversations, where keeping the raw message history in the prompt verbatim would consume too many tokens.
```python
from langchain.memory import ConversationSummaryMemory, ChatMessageHistory

memory = ConversationSummaryMemory(llm=llm)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.load_memory_variables({})
```
We can also use the predict_new_summary method directly:
```python
messages = memory.chat_memory.messages
messages
```
```
[HumanMessage(content='hi'), AIMessage(content='whats up')]
```
```python
previous_summary = ""
memory.predict_new_summary(messages, previous_summary)
```
```
'Current summary:\nThe human greeted the AI. The AI responded by asking what was happening. \n'
```
You can easily initialize a ConversationSummaryMemory from a ChatMessageHistory; the summary is generated automatically at load time.
```python
history = ChatMessageHistory()
history.add_user_message("hi")
history.add_ai_message("hi there!")

memory = ConversationSummaryMemory.from_messages(
    llm=llm,
    chat_memory=history,
    return_messages=True
)
memory
```
```
ConversationSummaryMemory(llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='hi'), AIMessage(content='hi there!')]), return_messages=True, buffer='Current summary: \nThe human greeted the AI. The AI returned the greeting. \n')
```
You can speed up initialization by reusing a previously generated summary, avoiding regeneration by passing the summary in directly:
```python
memory = ConversationSummaryMemory(
    llm=llm,
    buffer="The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.",
    chat_memory=history,
    return_messages=True
)
memory
```
```
ConversationSummaryMemory(llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='hi'), AIMessage(content='hi there!')]), return_messages=True, buffer='The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.')
```
Conversation Token Buffer
ConversationTokenBufferMemory keeps a buffer of recent interactions in memory, using token length rather than the number of interactions to decide when to flush old interactions.
```python
from langchain.memory import ConversationTokenBufferMemory

memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=10)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})
```
```
{'history': 'Human: not much you\nAI: not much'}
```
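As with the other buffer types, setting return_messages=True returns the buffer as a list of messages rather than a single string; a short sketch:

```python
memory = ConversationTokenBufferMemory(
    llm=llm, max_token_limit=10, return_messages=True
)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})
```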
Conversation Summary Buffer
ConversationSummaryBufferMemory combines the two approaches: it keeps a buffer of recent interactions in memory, but instead of simply discarding older interactions it compiles them into a summary and uses both the buffer and the summary. Like the token buffer, it uses token length rather than interaction count to decide when to flush old interactions from the buffer.
```python
from langchain.memory import ConversationSummaryBufferMemory

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=10)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
memory.load_memory_variables({})
```
```
{'history': 'System: Current summary:\nThe human greets the AI. The AI responds with an informal greeting. ...
```
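As with ConversationSummaryMemory, the summarization step can also be driven by hand via predict_new_summary, which this class inherits from the same summarizer mixin; a sketch assuming the memory above:

```python
messages = memory.chat_memory.messages
previous_summary = ""
memory.predict_new_summary(messages, previous_summary)
```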