LLMs in Practice: Chat Memory in LangChain4j

LangChain4j offers a ChatMemory abstraction for managing and maintaining conversations, including eviction policies and persistence. Memory is not the same as history: memory modifies the conversation content according to an algorithm so the model appears to remember it. Simple last-N (message-window) and token-based (token-window) implementations are currently provided, along with support for custom persistent stores.

Chat Memory

Maintaining and managing ChatMessages manually is cumbersome.
Therefore, LangChain4j offers a ChatMemory abstraction along with multiple out-of-the-box implementations.

ChatMemory can be used as a standalone low-level component,
or as a part of a high-level component like AI Services.
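
For standalone usage, a minimal sketch looks like the following (the window size of 10 is an arbitrary choice):

```java
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

public class StandaloneMemoryExample {

    public static void main(String[] args) {
        // Keep at most the 10 most recent messages; older ones are evicted.
        ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);

        memory.add(UserMessage.from("Hello, my name is Klaus"));
        memory.add(AiMessage.from("Hi Klaus, how can I help you?"));

        // messages() returns what the memory currently holds, ready to be
        // passed to a chat model on the next call.
        memory.messages().forEach(System.out::println);
    }
}
```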

ChatMemory acts as a container for ChatMessages (backed by a List), with additional features like:

  • Eviction policy
  • Persistence (a sketch follows this list)
  • Special treatment of SystemMessage
  • Special treatment of tool messages
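
Persistence works by backing a memory with a ChatMemoryStore. The sketch below is illustrative only: the InMemoryStore class and the "user-1" memory id are hypothetical names, and a real store would read and write a database rather than a map.

```java
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.ChatMessageDeserializer;
import dev.langchain4j.data.message.ChatMessageSerializer;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.store.memory.chat.ChatMemoryStore;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical store that keeps serialized messages in a map, keyed by
// memory id; a real implementation would persist to a database instead.
class InMemoryStore implements ChatMemoryStore {

    private final Map<Object, String> db = new HashMap<>();

    @Override
    public List<ChatMessage> getMessages(Object memoryId) {
        String json = db.get(memoryId);
        return json == null
                ? new ArrayList<>()
                : ChatMessageDeserializer.messagesFromJson(json);
    }

    @Override
    public void updateMessages(Object memoryId, List<ChatMessage> messages) {
        db.put(memoryId, ChatMessageSerializer.messagesToJson(messages));
    }

    @Override
    public void deleteMessages(Object memoryId) {
        db.remove(memoryId);
    }
}

public class PersistentMemoryExample {

    public static void main(String[] args) {
        ChatMemory memory = MessageWindowChatMemory.builder()
                .id("user-1")                         // one memory per conversation
                .maxMessages(10)
                .chatMemoryStore(new InMemoryStore()) // plug in the custom store
                .build();

        memory.add(UserMessage.from("Hi, remember me across restarts"));
    }
}
```

The memory calls updateMessages() on the store whenever its contents change, including when messages are evicted, so the stored conversation stays in sync with what is sent to the LLM.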

Memory vs History

Please note that “memory” and “history” are similar, yet distinct concepts.

  • History keeps all messages between the user and AI intact. History is what the user sees in the UI. It represents what was actually said.
  • Memory keeps some information, which is presented to the LLM to make it behave as if it “remembers” the conversation.
    Memory is quite different from history. Depending on the memory algorithm used, it can modify history in various ways:
    evict some messages, summarize multiple messages, summarize separate messages, remove unimportant details from messages,
    inject extra information (e.g., for RAG) or instructions (e.g., for structured outputs) into messages, and so on.

LangChain4j currently offers only “memory”, not “history”. If you need to keep an entire history, please do so manually.

Eviction policy

An eviction policy is necessary for several reasons (a sketch of the built-in policies follows this list):

  • To fit within the LLM’s context window. There is a cap on the number of tokens the LLM can process at once.
    At some point, the conversation might exceed this limit. In such cases, some message(s) should be evicted.
    Usually, the oldest message(s) are evicted, but more sophisticated algorithms can be implemented if needed.
  • To control the cost. Each token has a cost, making each call to the LLM progressively more expensive.
    Evicting unnecessary messages reduces the cost.
  • To control the latency. The more tokens are sent to the LLM, the longer it takes to process them.
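
Both built-in policies are simple sliding windows, as sketched below. The token-based variant needs a tokenizer to estimate counts; OpenAiTokenizer (from the langchain4j-open-ai module) and the model name are example choices here, and note that newer LangChain4j releases rename these types (e.g., to TokenCountEstimator).

```java
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.memory.chat.TokenWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiTokenizer;

public class EvictionPolicies {

    public static void main(String[] args) {
        // Sliding window over message count: keeps only the N newest messages.
        ChatMemory lastN = MessageWindowChatMemory.withMaxMessages(20);

        // Sliding window over tokens: evicts the oldest messages once the
        // estimated total token count would exceed the limit.
        ChatMemory tokenBased = TokenWindowChatMemory.withMaxTokens(
                1000, new OpenAiTokenizer("gpt-3.5-turbo"));
    }
}
```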