LangChain4j: RAG Basics

Box_clf

Last modified 2024-07-27 22:59:18

License: CC 4.0 BY-SA

Category: AI Agent. Tags: langchain, LangChain4j, java, artificial intelligence

First published 2024-07-27 22:58:09

Article link: https://blog.csdn.net/Box_clf/article/details/140742838

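What is RAG

In short, RAG is a way to find relevant pieces of information in your data and inject them into the prompt before sending it to the LLM. The LLM then receives (hopefully) relevant information and can use it in its reply, which should reduce the likelihood of hallucinations.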

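Approaches:

Full-text (keyword) search. This method uses techniques such as TF-IDF and BM25 to search documents by matching the keywords in the query (for example, what the user is asking) against a database of documents. It ranks the results by the frequency and relevance of these keywords in each document.
Vector search, also known as "semantic search". Text documents are converted into vectors of numbers using an embedding model. Documents are then found and ranked by the cosine similarity, or another similarity/distance measure, between the query vector and the document vectors, which captures deeper semantic meaning.
Combining several search methods (for example, full-text + vector) usually improves the quality of the search results.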
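
To make the keyword approach concrete, here is a minimal plain-Java sketch of TF-IDF ranking. It is not LangChain4j code: the class name KeywordSearchSketch, the naive tokenizer, and the particular smoothed IDF formula are assumptions chosen for illustration, and a real search engine would add stemming, stop words, and BM25-style length normalization.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;

public class KeywordSearchSketch {

    // Lowercases the text and splits on anything that is not a letter or digit (deliberately naive).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // TF-IDF score of one document for a query: sum over distinct query terms of tf * idf.
    static double score(String query, String doc, List<String> corpus) {
        List<String> docTokens = tokenize(doc);
        double total = 0.0;
        for (String term : new HashSet<>(tokenize(query))) {
            long tf = docTokens.stream().filter(term::equals).count();
            long df = corpus.stream().filter(d -> tokenize(d).contains(term)).count();
            if (tf == 0 || df == 0) {
                continue;
            }
            // Smoothed inverse document frequency (one of several common variants).
            double idf = Math.log(1.0 + (double) corpus.size() / df);
            total += tf * idf;
        }
        return total;
    }

    // Returns a copy of the corpus sorted by descending score.
    static List<String> rank(String query, List<String> corpus) {
        List<String> ranked = new ArrayList<>(corpus);
        ranked.sort(Comparator.comparingDouble((String d) -> score(query, d, corpus)).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        List<String> corpus = List.of(
                "Reservations can be cancelled up to 7 days before pickup.",
                "Our office is open from Monday to Friday.",
                "Cancelled reservations are refunded within 10 days.");
        rank("reservations refunded", corpus).forEach(System.out::println);
    }
}
```

Because matching is exact, a query for "refund" would not match "refunded" here. Vector search is meant to close that kind of gap.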

The two stages of RAG

The RAG process has two distinct stages: indexing and retrieval.

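Indexing

This process can vary depending on the information retrieval method used. For vector search, it typically involves cleaning up the documents, enriching them with additional data and metadata, splitting them into smaller segments (also known as chunking), embedding these segments, and finally storing them in an embedding store (also known as a vector database).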
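
The chunking step can be sketched in plain Java as follows. This is only an illustration under simple assumptions: the class ChunkingSketch and its parameters maxChars and overlap are made up for this example, and the split is purely character-based. LangChain4j ships its own document splitters, which are more careful about paragraph and sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkingSketch {

    // Splits text into segments of at most maxChars characters,
    // where consecutive segments share `overlap` characters.
    static List<String> split(String text, int maxChars, int overlap) {
        if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlap < maxChars");
        }
        List<String> segments = new ArrayList<>();
        int step = maxChars - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            segments.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last segment reached the end of the text
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        // Prints "abcd", "defg", "ghij": each segment repeats the last character of the previous one.
        split("abcdefghij", 4, 1).forEach(System.out::println);
    }
}
```

The overlap keeps a sentence that straddles a boundary at least partly visible in both neighbouring segments, at the cost of storing some text twice.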

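The indexing stage usually happens offline, meaning end users do not have to wait for it to finish. For example, a cron job could re-index the company's internal documentation once a week, over the weekend. The code responsible for indexing can also be a separate application that only handles indexing tasks.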
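
If you prefer to keep the schedule inside a Java process rather than in cron, the weekly run could be sketched as below. Everything here is an assumption for illustration: the class WeeklyReindexSketch, the choice of Saturday at 02:00, and the placeholder method reindexDocuments, which stands in for the real indexing job.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.temporal.TemporalAdjusters;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WeeklyReindexSketch {

    // Next Saturday at 02:00 that is strictly after `now` (day and hour are arbitrary choices).
    static LocalDateTime nextRun(LocalDateTime now) {
        LocalDateTime candidate = now
                .with(TemporalAdjusters.nextOrSame(DayOfWeek.SATURDAY))
                .with(LocalTime.of(2, 0));
        return candidate.isAfter(now) ? candidate : candidate.plusWeeks(1);
    }

    // Hypothetical placeholder for the real indexing job.
    static void reindexDocuments() {
        System.out.println("Re-indexing documents...");
    }

    // Starts a background scheduler that runs the job at the next slot and every 7 days after.
    // The caller is responsible for shutting the returned scheduler down.
    static ScheduledExecutorService start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        LocalDateTime now = LocalDateTime.now();
        long initialDelayMillis = Duration.between(now, nextRun(now)).toMillis();
        scheduler.scheduleAtFixedRate(WeeklyReindexSketch::reindexDocuments,
                initialDelayMillis, TimeUnit.DAYS.toMillis(7), TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // Only prints the next slot; call start() in a long-running application.
        System.out.println("Next re-index at: " + nextRun(LocalDateTime.now()));
    }
}
```

A fixed 7-day period ignores daylight-saving shifts, which is usually acceptable for an offline job like this.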

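In some scenarios, however, end users may want to upload their own documents so that the LLM can access them. In this case, indexing should be performed online and be part of the main application.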

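Retrieval

The retrieval stage usually happens online, when a user submits a question that should be answered using the indexed documents.

This process can vary depending on the information retrieval method used. For vector search, it typically involves embedding the user's query (the question) and performing a similarity search in the embedding store. The relevant segments (pieces of the original documents) are then injected into the prompt and sent to the LLM.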
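
The retrieval steps above can be sketched in plain Java. Note the assumptions: the embed method below is a toy stand-in that hashes words into a fixed-size count vector, not a real embedding model, so it captures no semantics. The class RetrievalSketch, the prompt wording, and the in-memory list of segments are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RetrievalSketch {

    static final int DIMENSIONS = 64;

    // Toy stand-in for an embedding model: hashes each word into a fixed-size count vector.
    static double[] embed(String text) {
        double[] v = new double[DIMENSIONS];
        for (String w : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                v[Math.floorMod(w.hashCode(), DIMENSIONS)] += 1.0;
            }
        }
        return v;
    }

    // Cosine similarity between two vectors of equal length; 0 if either is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0.0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Returns up to maxResults segments, most similar to the query first.
    static List<String> retrieve(String query, List<String> segments, int maxResults) {
        double[] q = embed(query);
        List<String> ranked = new ArrayList<>(segments);
        ranked.sort(Comparator.comparingDouble((String s) -> cosine(q, embed(s))).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(maxResults, ranked.size())));
    }

    // Injects the retrieved segments into the prompt ahead of the user's question.
    static String buildPrompt(String question, List<String> relevant) {
        return "Answer using the following information:\n"
                + String.join("\n", relevant)
                + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> segments = List.of(
                "Reservations can be cancelled up to 7 days before pickup.",
                "Our office is open from Monday to Friday.");
        String question = "Can reservations be cancelled?";
        System.out.println(buildPrompt(question, retrieve(question, segments, 1)));
    }
}
```

A real setup would embed the segments once during indexing and store the vectors, rather than re-embedding every segment for each query as this sketch does.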

Easy RAG

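LangChain4j has an "Easy RAG" feature that makes getting started with RAG as easy as possible. You do not have to learn about embeddings, choose a vector store, find the right embedding model, or figure out how to parse and split documents. Just point to your documents, and LangChain4j will do its magic.

Add the langchain4j-easy-rag dependency: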

<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-easy-rag</artifactId>
    <version>0.33.0</version>
</dependency>

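Official example:

import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.rag.content.retriever.ContentRetriever;
import dev.langchain4j.rag.content.retriever.EmbeddingStoreContentRetriever;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.store.embedding.EmbeddingStoreIngestor;
import dev.langchain4j.store.embedding.inmemory.InMemoryEmbeddingStore;

import java.util.List;

import static dev.langchain4j.data.document.loader.FileSystemDocumentLoader.loadDocuments;

// Note: toPath, glob, startConversationWith and the Assistant interface are helpers from the
// examples project, not part of the library; OPENAI_API_URL and OPENAI_API_KEY are assumed
// to be constants defined elsewhere.
public class Easy_RAG_Example {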

    /**
     * This example demonstrates how to implement an "Easy RAG" (Retrieval-Augmented Generation) application.
     * By "easy" we mean that we won't dive into all the details about parsing, splitting, embedding, etc.
     * All the "magic" is hidden inside the "langchain4j-easy-rag" module.
     * <p>
     * If you want to learn how to do RAG without the "magic" of an "Easy RAG", see {@link Naive_RAG_Example}.
     */

    public static void main(String[] args) {

        // First, let's load documents that we want to use for RAG
        List<Document> documents = loadDocuments(toPath("documents/"), glob("*.txt"));

        // Second, let's create an assistant that will have access to our documents
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(OpenAiChatModel.builder().baseUrl(OPENAI_API_URL).apiKey(OPENAI_API_KEY).build()) // it should use OpenAI LLM
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10)) // it should remember 10 latest messages
                .contentRetriever(createContentRetriever(documents)) // it should have access to our documents
                .build();

        // Lastly, let's start the conversation with the assistant. We can ask questions like:
        // - Can I cancel my reservation?
        // - I had an accident, should I pay extra?
        startConversationWith(assistant);
    }

    private static ContentRetriever createContentRetriever(List<Document> documents) {

        // Here, we create an empty in-memory store for our documents and their embeddings.
        InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();

        // Here, we are ingesting our documents into the store.
        // Under the hood, a lot of "magic" is happening, but we can ignore it for now.
        EmbeddingStoreIngestor.ingest(documents, embeddingStore);

        // Lastly, let's create a content retriever from an embedding store.
        return EmbeddingStoreContentRetriever.from(embeddingStore);
    }
}

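Step by step, the first thing to do is load the documents:

List<Document> documents = FileSystemDocumentLoader.loadDocuments("/home/langchain4j/documentation");

LangChain4j supports 15 embedding (vector) stores. To keep things simple, Easy RAG here uses the in-memory store and ingests the documents into it: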

InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>();
EmbeddingStoreIngestor.ingest(documents, embeddingStore);

Final step: create an AI Service that calls the LLM API:

interface Assistant {

    String chat(String userMessage);
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatLanguageModel(OpenAiChatModel.withApiKey(OPENAI_API_KEY))
    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
    .contentRetriever(EmbeddingStoreContentRetriever.from(embeddingStore))
    .build();

Now you can chat with it:

String answer = assistant.chat("How to do Easy RAG with LangChain4j?");

Accessing Sources

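If you want access to the sources (the retrieved Content used to augment the message), you can get them easily by wrapping the return type in the Result class: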

interface Assistant {

    Result<String> chat(String userMessage);
}

Result<String> result = assistant.chat("How to do Easy RAG with LangChain4j?");

String answer = result.content();
List<Content> sources = result.sources();