高级RAG特性之一 - 查询压缩

最新推荐文章于 2025-04-02 09:34:23 发布

MultiArrow

最新推荐文章于 2025-04-02 09:34:23 发布

阅读量592

点赞数 3

文章标签： langchain 语言模型

本文链接：https://blog.csdn.net/MultiArrow/article/details/139119176

版权

一、背景

不管是在RAG还是AI对话的场景，为了能让AI更好的理解当前问题，往往会把历史对话跟当前问题一并发送。

携带历史记录的好处很明显，就是可以让AI充分理解上下文，回答更准确。

缺点也很明显：

请求内容大，消耗token多；
可能有很多跟当次请求无关的内容，影响AI对问题的理解速度；

二、例子

USER：详细描述孙悟空的生平及主要事迹
AI：孙悟空是。。。。（很长的内容）
USER：他是哪一年出生的？

AI回复的内容很长，如果在用户提问“他是哪一年出生的？”这个问题时，把历史记录跟问题一并发送给AI，除了可能会超出AI接口的最大请求token数，还会影响AI的回复速度。但是如果把历史记录去掉，不跟随用户问题一起发送，则AI会无法理解用户问题中的“他”指的是什么。

三、解决方法

将历史记录与用户问题提炼压缩，将长文本内容压缩成直指问题核心的AI可以理解的短文本。

整个过程其实就是将历史记录与用户提问交给AI，让AI来提炼压缩。此时执行压缩工作的AI可以选用支持长文本的AI，不一定是真正回答问题的AI。

最终发送给AI的问题可能是：

USER：孙悟空是哪一年出生的？

整体流程如下：

四、代码实现

langchain4j的示例：

ChatLanguageModel chatLanguageModel = OpenAiChatModel.builder()
        .apiKey(OPENAI_API_KEY)
        .build();

// We will create a CompressingQueryTransformer, which is responsible for compressing
// the user's query and the preceding conversation into a single, stand-alone query.
// This should significantly improve the quality of the retrieval process.
QueryTransformer queryTransformer = new CompressingQueryTransformer(chatLanguageModel);

ContentRetriever contentRetriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .maxResults(2)
        .minScore(0.6)
        .build();

// The RetrievalAugmentor serves as the entry point into the RAG flow in LangChain4j.
// It can be configured to customize the RAG behavior according to your requirements.
// In subsequent examples, we will explore more customizations.
RetrievalAugmentor retrievalAugmentor = DefaultRetrievalAugmentor.builder()
        .queryTransformer(queryTransformer)
        .contentRetriever(contentRetriever)
        .build();

AiServices.builder(Assistant.class)
        .chatLanguageModel(chatLanguageModel)
        .retrievalAugmentor(retrievalAugmentor)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .build();

更多实践代码可以到GitHub - langchain4j-aideepin上查看