# Chat Memory
Maintaining and managing `ChatMessage`s manually is cumbersome.
Therefore, LangChain4j offers a `ChatMemory` abstraction along with multiple out-of-the-box implementations.

`ChatMemory` can be used as a standalone low-level component,
or as a part of a high-level component like AI Services.
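For example, a `ChatMemory` can be plugged into an AI Service via its builder. The sketch below assumes an OpenAI chat model with the `gpt-4o-mini` model name; note that the builder method for the model is `chatModel` in recent LangChain4j versions, while older releases call it `chatLanguageModel`:

```java
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.service.AiServices;

public class AiServiceWithMemoryExample {

    interface Assistant {
        String chat(String userMessage);
    }

    public static void main(String[] args) {
        OpenAiChatModel model = OpenAiChatModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .modelName("gpt-4o-mini")
                .build();

        Assistant assistant = AiServices.builder(Assistant.class)
                .chatModel(model) // named chatLanguageModel in pre-1.0 versions
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .build();

        System.out.println(assistant.chat("Hello! My name is Klaus."));
        // The follow-up call "remembers" the name, because earlier messages
        // are stored in the ChatMemory and sent to the LLM on each call.
        System.out.println(assistant.chat("What is my name?"));
    }
}
```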
`ChatMemory` acts as a container for `ChatMessage`s (backed by a `List`), with additional features like:
- Eviction policy
- Persistence
- Special treatment of `SystemMessage`
- Special treatment of tool messages
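A minimal sketch of standalone, low-level usage with the `MessageWindowChatMemory` implementation, illustrating the container behavior and the special treatment of `SystemMessage` (once added, it is retained even when older messages are evicted):

```java
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

public class StandaloneChatMemoryExample {

    public static void main(String[] args) {
        // A simple implementation that keeps at most 3 messages.
        ChatMemory memory = MessageWindowChatMemory.withMaxMessages(3);

        memory.add(SystemMessage.from("You are a polite assistant"));
        memory.add(UserMessage.from("Hi"));
        memory.add(AiMessage.from("Hello, how can I help?"));
        memory.add(UserMessage.from("Tell me a joke"));

        // The window is full, so the oldest non-system message ("Hi")
        // is evicted, while the SystemMessage is kept.
        memory.messages().forEach(System.out::println);
    }
}
```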
## Memory vs History
Please note that “memory” and “history” are similar, yet distinct concepts.
- History keeps all messages between the user and AI intact. History is what the user sees in the UI. It represents what was actually said.
- Memory keeps some information, which is presented to the LLM to make it behave as if it “remembers” the conversation.
Memory is quite different from history. Depending on the memory algorithm used, it can modify the history in various ways:
evict some messages, summarize multiple messages into one, summarize individual messages, remove unimportant details from messages,
inject extra information (e.g., for RAG) or instructions (e.g., for structured outputs) into messages, and so on.
LangChain4j currently offers only “memory”, not “history”. If you need to keep an entire history, please do so manually.
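If you do need the full history, one option is to keep it in an application-side list alongside the memory. A minimal sketch; the `history` list here is hypothetical application code, not a LangChain4j API:

```java
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

import java.util.ArrayList;
import java.util.List;

public class ManualHistoryExample {

    public static void main(String[] args) {
        ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);

        // Hypothetical application-side history: append-only, never evicted.
        List<ChatMessage> history = new ArrayList<>();

        UserMessage userMessage = UserMessage.from("Hello!");
        memory.add(userMessage);   // may be evicted later by the memory
        history.add(userMessage);  // kept intact, e.g., for display in the UI
    }
}
```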
## Eviction policy
An eviction policy is necessary for several reasons:
- To fit within the LLM's context window. There is a cap on the number of tokens an LLM can process at once.
At some point, the conversation might exceed this limit. In such cases, some message(s) should be evicted.
Usually, the oldest message(s) are evicted (see the sketch after this list), but more sophisticated algorithms can be implemented if needed.
- To control the cost. Each token has a cost, making each call to the LLM progressively more expensive.
Evicting unnecessary messages reduces the cost.
- To control the latency. The more tokens are sent to the LLM, the more time it takes to process them.
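As a sketch of token-based eviction, `TokenWindowChatMemory` evicts the oldest messages once a token budget would be exceeded, which directly addresses all three concerns above. The token-counting class below follows recent LangChain4j versions (`OpenAiTokenCountEstimator` from the OpenAI module; pre-1.0 releases named it `OpenAiTokenizer`), so treat the exact names as version-dependent:

```java
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.TokenWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiTokenCountEstimator;

public class TokenEvictionExample {

    public static void main(String[] args) {
        // Oldest messages are evicted once the stored messages would
        // exceed the 1000-token budget, keeping each prompt within the
        // context window (and therefore cheaper and faster to process).
        ChatMemory memory = TokenWindowChatMemory.withMaxTokens(
                1000, new OpenAiTokenCountEstimator("gpt-4o-mini"));

        memory.add(UserMessage.from("A very long message..."));
    }
}
```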