4. Conversational Memory for LLMs with LangChain

Conversational memory is how a chatbot can respond to multiple queries in a chat-like manner. It enables a coherent conversation; without it, every query would be treated as an entirely independent input, without regard for past interactions.

The figure above compares an LLM with and without conversational memory. The blue boxes are user prompts and the grey boxes are the LLM's responses. Without conversational memory (right-hand side), the LLM cannot respond using knowledge of previous interactions.

Memory allows a Large Language Model (LLM) to remember previous interactions with the user. By default, LLMs are stateless, meaning each incoming query is processed independently of other interactions. The only thing that exists for a stateless agent is the current input, nothing else.
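
To illustrate this statelessness, here is a quick sketch (an illustrative example of ours, assuming an OpenAI API key is set in the OPENAI_API_KEY environment variable and using the same OpenAI wrapper we initialize below); the second call has no access to the first:

from langchain import OpenAI

llm = OpenAI(temperature=0, model_name="text-davinci-003")
print(llm("Hi there! My name is Alice."))
print(llm("What is my name?"))  # stateless: the model cannot know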

There are many applications where remembering previous interactions is very important, such as chatbots. Conversational memory allows us to do that.

There are several ways of implementing conversational memory. In LangChain, they are all built on top of the ConversationChain.

ConversationChain

We can start by initializing the ConversationChain. We will use OpenAI's text-davinci-003 as the LLM, but other models like gpt-3.5-turbo can also be used.

from langchain import OpenAI
from langchain.chains import ConversationChain

# first initialize the large language model
llm = OpenAI(
    temperature=0,
    openai_api_key="OPENAI_API_KEY",  # placeholder: replace with your key
    model_name="text-davinci-003"
)

# now initialize the conversation chain
conversation = ConversationChain(llm=llm)

We can see the prompt template used by the ConversationChain like so:

print(conversation.prompt.template)
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:

Here, the prompt primes the model by telling it that the following is a conversation between a human and an AI (text-davinci-003). The prompt attempts to reduce hallucinations (where a model makes things up) by stating:

"If the AI does not know the answer to a question, it truthfully says it does not know."

This can help reduce hallucinations but does not solve the problem entirely; we will save that topic for a later chapter.

Following the initial prompt, we see two parameters: {history} and {input}. {input} is where the user's latest query is placed; it is the text entered into a chatbot's input box.

{history} is where conversational memory is used. Here, we feed in information about the conversation history between the human and the AI.

These two parameters, {history} and {input}, are passed to the LLM via the prompt template we just saw, and the output we (hopefully) get back is simply the predicted continuation of the conversation.
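
To make this concrete, here is a minimal sketch of how the chain substitutes the two variables into the template (illustrative only; the chain performs this step internally, and the example history text is ours):

filled = conversation.prompt.format(
    history="Human: Hi there!\nAI: Hello! How can I help you?",
    input="What did I just say?"
)
print(filled)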

Forms of Conversational Memory

We can use several kinds of conversational memory with the ConversationChain. They modify the text passed to the {history} parameter.

ConversationBufferMemory

ConversationBufferMemory is the simplest form of conversational memory in LangChain. As described above, the past conversation between the human and the AI is passed, in its raw form, to the {history} parameter.

from langchain.chains.conversation.memory import ConversationBufferMemory

conversation_buf = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)
conversation_buf("Good morning AI!")
{'input': 'Good morning AI!',
 'history': '',
 'response': " Good morning! It's a beautiful day today, isn't it? How can I help you?"}

We return the first response from the conversational agent. Let's continue the conversation, writing prompts that the LLM can only answer if it considers the conversation history. We also add a count_tokens function so we can see how many tokens each interaction uses.

from langchain.callbacks import get_openai_callback

def count_tokens(chain, query):
    with get_openai_callback() as cb:
        result = chain.run(query)
        print(f'Spent a total of {cb.total_tokens} tokens')

    return result
count_tokens(
    conversation_buf, 
    "My interest here is to explore the potential of integrating Large Language Models with external knowledge"
)
Spent a total of 179 tokens
' Interesting! Large Language Models are a type of artificial intelligence that can process natural language and generate text. They can be used to generate text from a given context, or to answer questions about a given context. Integrating them with external knowledge can help them to better understand the context and generate more accurate results. Is there anything else I can help you with?'
count_tokens(
    conversation_buf,
    "I just want to analyze the different possibilities. What can you think of?"
)
Spent a total of 268 tokens
' Well, integrating Large Language Models with external knowledge can open up a lot of possibilities. For example, you could use them to generate more accurate and detailed summaries of text, or to answer questions about a given context more accurately. You could also use them to generate more accurate translations, or to generate more accurate predictions about future events.'
count_tokens(
    conversation_buf, 
    "Which data source types could be used to give context to the model?"
)
Spent a total of 360 tokens
'  There are a variety of data sources that could be used to give context to a Large Language Model. These include structured data sources such as databases, unstructured data sources such as text documents, and even audio and video data sources. Additionally, you could use external knowledge sources such as Wikipedia or other online encyclopedias to provide additional context.'
count_tokens(
    conversation_buf, 
    "What is my aim again?"
)
Spent a total of 388 tokens
' Your aim is to explore the potential of integrating Large Language Models with external knowledge.'

The LLM can clearly remember the history of the conversation. Let's take a look at how this conversation history is stored by the ConversationBufferMemory:

print(conversation_buf.memory.buffer)
Human: Good morning AI!
AI:  Good morning! It's a beautiful day today, isn't it? How can I help you?
Human: My interest here is to explore the potential of integrating Large Language Models with external knowledge
AI:  Interesting! Large Language Models are a type of artificial intelligence that can process natural language and generate text. They can be used to generate text from a given context, or to answer questions about a given context. Integrating them with external knowledge can help them to better understand the context and generate more accurate results. Is there anything else I can help you with?
Human: I just want to analyze the different possibilities. What can you think of?
AI:  Well, integrating Large Language Models with external knowledge can open up a lot of possibilities. For example, you could use them to generate more accurate and detailed summaries of text, or to answer questions about a given context more accurately. You could also use them to generate more accurate translations, or to generate more accurate predictions about future events.
Human: Which data source types could be used to give context to the model?
AI:   There are a variety of data sources that could be used to give context to a Large Language Model. These include structured data sources such as databases, unstructured data sources such as text documents, and even audio and video data sources. Additionally, you could use external knowledge sources such as Wikipedia or other online encyclopedias to provide additional context.
Human: What is my aim again?
AI:  Your aim is to explore the potential of integrating Large Language Models with external knowledge.

We can see that the buffer saves every interaction in the chat history directly. There are a few pros and cons to this approach. In short, they are:

Pros:

- Storing everything gives the LLM the maximum amount of information.
- Storing everything is simple and intuitive.

Cons:

- More tokens mean slower responses and higher costs.
- Long conversations cannot be remembered, as we hit the token limit (4096 tokens for both text-davinci-003 and gpt-3.5-turbo).

The ConversationBufferMemory is an excellent option to get started with but is limited by having to store every interaction. Let's take a look at other options that help remedy this.

ConversationSummaryMemory

Using ConversationBufferMemory, we very quickly use a lot of tokens and can even exceed the context window limit of the most advanced LLMs available today.

To avoid excessive token usage, we can use ConversationSummaryMemory. As the name would suggest, this form of memory summarizes the conversation history before it is passed to the {history} parameter.

We initialize the ConversationChain with the summary memory like so:

from langchain.chains.conversation.memory import ConversationSummaryMemory

conversation_sum = ConversationChain(
    llm=llm,
    memory=ConversationSummaryMemory(llm=llm)
)

When using ConversationSummaryMemory, we need to pass an LLM to the object because the summarization is powered by an LLM. We can see the prompt used to do this here:

print(conversation_sum.memory.prompt.template)
Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:
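
To see how the chain applies this template, here is a small illustrative sketch (the example summary and new_lines values are ours; the chain runs this step internally after every exchange):

print(conversation_sum.memory.prompt.format(
    summary="The human greets the AI.",
    new_lines="Human: What can you do?\nAI: Lots of things!"
))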

Using this, we can summarize every new interaction and append it to a "running summary" of all past interactions. Let's have another conversation using this approach.

# without count_tokens we'd call `conversation_sum("Good morning AI!")`
# but let's keep track of our tokens:
count_tokens(
    conversation_sum, 
    "Good morning AI!"
)
Spent a total of 290 tokens
" Good morning! It's a beautiful day today, isn't it? How can I help you?"
count_tokens(
    conversation_sum, 
    "My interest here is to explore the potential of integrating Large Language Models with external knowledge"
)
Spent a total of 440 tokens
" That sounds like an interesting project! I'm familiar with Large Language Models, but I'm not sure how they could be integrated with external knowledge. Could you tell me more about what you have in mind?"
count_tokens(
    conversation_sum, 
    "I just want to analyze the different possibilities. What can you think of?"
)
Spent a total of 664 tokens
' I can think of a few possibilities. One option is to use a large language model to generate a set of candidate answers to a given query, and then use external knowledge to filter out the most relevant answers. Another option is to use the large language model to generate a set of candidate answers, and then use external knowledge to score and rank the answers. Finally, you could use the large language model to generate a set of candidate answers, and then use external knowledge to refine the answers.'
count_tokens(
    conversation_sum, 
    "Which data source types could be used to give context to the model?"
)
Spent a total of 799 tokens
' There are many different types of data sources that could be used to give context to the model. These could include structured data sources such as databases, unstructured data sources such as text documents, or even external APIs that provide access to external knowledge. Additionally, the model could be trained on a combination of these data sources to provide a more comprehensive understanding of the context.'
count_tokens(
    conversation_sum, 
    "What is my aim again?"
)
Spent a total of 853 tokens
' Your aim is to explore the potential of integrating Large Language Models with external knowledge.'

In this case, the summary contains enough information for the LLM to "remember" our original aim. We can see this summary in its raw form like so:

print(conversation_sum.memory.buffer)
The human greeted the AI with a good morning, to which the AI responded with a good morning and asked how it could help. The human expressed interest in exploring the potential of integrating Large Language Models with external knowledge, to which the AI responded positively and asked for more information. The human asked the AI to think of different possibilities, and the AI suggested three options: using the large language model to generate a set of candidate answers and then using external knowledge to filter out the most relevant answers, score and rank the answers, or refine the answers. The human then asked which data source types could be used to give context to the model, to which the AI responded that there are many different types of data sources that could be used, such as structured data sources, unstructured data sources, or external APIs. Additionally, the model could be trained on a combination of these data sources to provide a more comprehensive understanding of the context. The human then asked what their aim was again, to which the AI responded that their aim was to explore the potential of integrating Large Language Models with external knowledge.

This conversation used more tokens than the ConversationBufferMemory, so is there any advantage to using ConversationSummaryMemory over the buffer memory?

Figure: token count (y-axis) for buffer memory vs. summary memory as the number of interactions (x-axis) increases.

For longer conversations, yes. As shown above, the summary memory initially uses far more tokens. However, as the conversation progresses, the summarization approach grows more slowly. In contrast, the buffer memory continues to grow linearly with the number of tokens in the chat.
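
We can verify this ourselves. Here is a rough sketch (assuming the tiktoken library is installed) comparing how many tokens each memory's {history} text currently consumes:

import tiktoken

encoder = tiktoken.encoding_for_model("text-davinci-003")

buffer_tokens = len(encoder.encode(conversation_buf.memory.buffer))
summary_tokens = len(encoder.encode(conversation_sum.memory.buffer))

print(f"Buffer memory history: {buffer_tokens} tokens")
print(f"Summary memory history: {summary_tokens} tokens")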

We can summarize the pros and cons of ConversationSummaryMemory as follows:

Pros:

- Shortens the number of tokens for long conversations.
- Enables much longer conversations.
- Relatively straightforward implementation, intuitively simple to understand.

Cons:

- Can result in higher token usage for smaller conversations.
- Memorization of the conversation history relies wholly on the summarization ability of the intermediate summarization LLM.
- The summarization LLM itself also requires tokens, which increases costs (but does not limit conversation length).

Conversation summarization is a good approach for cases where long conversations are expected. Yet, it is still fundamentally limited by token limits. After a certain number of interactions, we still exceed the context window.

ConversationBufferWindowMemory

The ConversationBufferWindowMemory acts in the same way as our earlier "buffer memory", but adds a window to the memory. Meaning that we only keep a given number of past interactions before "forgetting" them. We use it like so:

from langchain.chains.conversation.memory import ConversationBufferWindowMemory

conversation_bufw = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=1)
)

In this instance, we set k=1, meaning the window will remember only the single latest interaction between the human and the AI: the latest human response and the latest AI response. We can see the effect of this below:

count_tokens(
    conversation_bufw, 
    "Good morning AI!"
)
Spent a total of 85 tokens
" Good morning! It's a beautiful day today, isn't it? How can I help you?"
count_tokens(
    conversation_bufw, 
    "My interest here is to explore the potential of integrating Large Language Models with external knowledge"
)
Spent a total of 178 tokens
' Interesting! Large Language Models are a type of artificial intelligence that can process natural language and generate text. They can be used to generate text from a given context, or to answer questions about a given context. Integrating them with external knowledge can help them to better understand the context and generate more accurate results. Do you have any specific questions about this integration?'
count_tokens(
    conversation_bufw, 
    "I just want to analyze the different possibilities. What can you think of?"
)
Spent a total of 233 tokens
' There are many possibilities for integrating Large Language Models with external knowledge. For example, you could use external knowledge to provide additional context to the model, or to provide additional training data. You could also use external knowledge to help the model better understand the context of a given text, or to help it generate more accurate results.'
count_tokens(
    conversation_bufw, 
    "Which data source types could be used to give context to the model?"
)
Spent a total of 245 tokens
' Data sources that could be used to give context to the model include text corpora, structured databases, and ontologies. Text corpora provide a large amount of text data that can be used to train the model and provide additional context. Structured databases provide structured data that can be used to provide additional context to the model. Ontologies provide a structured representation of knowledge that can be used to provide additional context to the model.'
count_tokens(
    conversation_bufw, 
    "What is my aim again?"
)
Spent a total of 186 tokens
' Your aim is to use data sources to give context to the model.'

By the end of the conversation, when we ask "What is my aim again?", the answer was contained in the human response three interactions ago. As we only kept the most recent interaction (k=1), the model had forgotten and could not give the correct answer.

We can see the effective "memory" of the model like so:

bufw_history = conversation_bufw.memory.load_memory_variables(
    inputs=[]
)['history']
print(bufw_history)
Human: What is my aim again?
AI:  Your aim is to use data sources to give context to the model.
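
To make the windowing behavior explicit, here is a standalone sketch (with hypothetical example messages) showing that with k=2 only the two most recent exchanges survive:

from langchain.chains.conversation.memory import ConversationBufferWindowMemory

window_memory = ConversationBufferWindowMemory(k=2)
window_memory.save_context({"input": "Hi, I'm Alice"}, {"output": "Hello Alice!"})
window_memory.save_context({"input": "I live in Paris"}, {"output": "Nice!"})
window_memory.save_context({"input": "I like tea"}, {"output": "Good choice!"})

# the first exchange ("Hi, I'm Alice") has been dropped from the window
print(window_memory.load_memory_variables({})['history'])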

Although this method isn't suitable for remembering distant interactions, it is good at limiting the number of tokens being used, a number that we can increase or decrease as required. For the longer conversation used in our earlier comparison, we can set k=6 and reach ~1.5K tokens per interaction after 27 total interactions:

Figure: token count used by ConversationBufferWindowMemory with k=6 and k=12.

If we only need memory of recent interactions, this is a great option. However, for a mix of both distant and recent interactions, there are other options.

ConversationSummaryBufferMemory

ConversationSummaryBufferMemoryConversationSummaryMemoryConversationBufferWindowMemory的结合体。他在总结对话中最早的交互的同时,保持对话中最接近max_token_limit个数的令牌。他的初始化方式如下:

from langchain.chains.conversation.memory import ConversationSummaryBufferMemory

conversation_sum_bufw = ConversationChain(
    llm=llm,
    memory=ConversationSummaryBufferMemory(
        llm=llm,
        max_token_limit=650
    )
)

When applying this to our earlier conversation, we can set max_token_limit to a small number and the LLM will still remember our earlier aim.

This is because that information is captured by the "summarization" component of the memory, despite being missed by the "buffer window" component.
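
We can inspect both components directly. A sketch, with the caveat that the attribute names below (moving_summary_buffer for the running summary, chat_memory.messages for the raw recent messages) are those used by early LangChain releases and may differ in later versions:

# the running summary of older, evicted interactions
print(conversation_sum_bufw.memory.moving_summary_buffer)

# the most recent interactions, kept in raw form
print(conversation_sum_bufw.memory.chat_memory.messages)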

Naturally, the pros and cons of this component are a mix of the earlier components on which it is based.

Pros:

- The summarizer means we can remember distant interactions.
- The buffer prevents us from missing information from the most recent interactions.

Cons:

- The summarizer increases the token count for shorter conversations.
- Storing raw interactions, even if only the most recent ones, increases the token count.

Although it requires more tweaking on what to summarize and what to maintain within the buffer window, the ConversationSummaryBufferMemory does give us plenty of flexibility and is, so far, the only memory type that allows us to remember distant interactions while storing the most recent interactions in their raw, most information-rich, form.


Figure: token count comparison for ConversationSummaryBufferMemory with max_token_limit values of 650 and 1300.

We can also see that, despite including a summary of past interactions and the raw form of recent interactions, the token count growth of ConversationSummaryBufferMemory remains competitive with the other methods.

Other Memory Types

The memory types we have covered here are great to get started with and give a good balance between remembering as much as possible and minimizing tokens.

However, we have other options, notably ConversationKnowledgeGraphMemory and ConversationEntityMemory. We will give these different forms of memory the attention they deserve in upcoming chapters.

That's it for this introduction to conversational memory for LLMs using LangChain. As we've seen, there are plenty of options to help stateless LLMs interact as if in a stateful environment, able to consider and refer back to past interactions.

As mentioned above, there are other forms of memory we can cover as well.

Conversational Memory for LLMs with Langchain | Pinecone

https://www.cnblogs.com/bonelee/p/17406692.html
