The Ultimate Guide to Mastering Retrieval-Augmented Generation (RAG) with LangChain: 2. Query Transformation

Hugo_Hoo

Published 2024-07-16 17:22:29

Article link: https://blog.csdn.net/wangjiansui/article/details/140471736

Query Transformation

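The core idea of query transformation is to translate or rewrite the user's query so that the large language model (LLM) can answer it correctly. For example, if a user asks an ambiguous question, the RAG retriever may fetch wrong (or ambiguous) documents because their embeddings are only loosely related to the question, and the LLM then generates a wrong answer. There are several ways to address this: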

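Step-back prompting: encourages the LLM to step back from a given question and pose a more abstract, higher-level question that captures the essence of the original query.
Least-to-most prompting: breaks a complex question into a series of simpler sub-questions and solves them in order. The answers to earlier sub-questions help solve each later one.
Query rewriting (Multi-Query or RAG Fusion): generates several questions from the original one, with different wording and perspectives. Documents are then retrieved for each generated question by similarity search against the vector store and used to answer the original question.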
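
To make the least-to-most idea concrete, here is a minimal sketch in plain Python. It is not the author's code: the function least_to_most, the prompt layout, and the stand-in fake_llm are all hypothetical, and a real setup would replace fake_llm with an actual LLM call. The point it illustrates is only the control flow: each sub-question is answered with the earlier question/answer pairs prepended to its prompt.

```python
from typing import Callable, List, Tuple


def least_to_most(sub_questions: List[str],
                  answer_fn: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Answer sub-questions in order, feeding earlier Q&A pairs into later prompts."""
    solved: List[Tuple[str, str]] = []
    for sub_q in sub_questions:
        # Build the context from every sub-question answered so far.
        context = "\n".join(f"Q: {q}\nA: {a}" for q, a in solved)
        prompt = f"{context}\nQ: {sub_q}\nA:" if context else f"Q: {sub_q}\nA:"
        solved.append((sub_q, answer_fn(prompt)))
    return solved


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call: reports how many earlier answers it saw.
    return f"answer (saw {prompt.count('A: ')} earlier answers)"


steps = least_to_most(
    ["What is LoRA?", "What does quantization add?", "What are the benefits of QLoRA?"],
    fake_llm,
)
for question, answer in steps:
    print(question, "->", answer)
```

The last sub-question sees both earlier answers in its prompt, which is what lets the simpler steps support the harder one.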

Now let's implement these techniques with LangChain!

%load_ext dotenv
%dotenv secrets/secrets.env

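As in the previous article, we first import the libraries, load the documents, split them, generate embeddings, store them in a vector store, and create a retriever from that vector store.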

from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain import hub
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.load import loads, dumps
from typing import List

# Load every PDF in the data/ directory
loader = DirectoryLoader('data/', glob="*.pdf", loader_cls=PyPDFLoader)
documents = loader.load()

# Split text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
text_chunks = text_splitter.split_documents(documents)

# Embed the chunks and store them in a Chroma vector store on disk
vectorstore = Chroma.from_documents(documents=text_chunks,
                                    embedding=OpenAIEmbeddings(),
                                    persist_directory="data/vectorstore")
vectorstore.persist()

# Retrieve the 5 most similar chunks for each query
retriever = vectorstore.as_retriever(search_kwargs={'k': 5})
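
To give a feel for what the chunk_size and chunk_overlap settings mean, here is a deliberately simplified sketch. It is an assumption-laden illustration, not how RecursiveCharacterTextSplitter works internally: the real splitter cuts on separators such as paragraph and line breaks before merging pieces, whereas the hypothetical sliding_chunks below just slides a fixed-size character window. The small numbers are chosen so the overlap is easy to see.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=2)
print(chunks)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz']
```

Each chunk starts with the last two characters of the previous one, so a sentence that straddles a boundary still appears partly in both chunks.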

2.1 Query Translation

2.1.1 Multi-Query

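In the multi-query approach, we first use an LLM (here, a GPT-4 instance) to generate 5 different questions from the original question. To do this, we write a prompt and wrap it in a ChatPromptTemplate. We then build a chain with LCEL (LangChain Expression Language) that reads the user input and assigns it to the question placeholder in the prompt, sends the prompt to the LLM, and parses the output, which contains 5 questions separated by newline characters.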

from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    """
    You are an intelligent assistant. Your task is to generate 5 questions based on the provided question in different wording and different perspectives to retrieve relevant documents from a vector database. By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search. Provide these alternative questions separated by newlines. Original question: {question}
    """
)

generate_queries = (
    {"question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model='gpt-4', temperature=0.7)
    | StrOutputParser()
    # One query per line; drop blank lines the model may emit
    | (lambda x: [q.strip() for q in x.split("\n") if q.strip()])
)

We can check that query generation works by invoking the chain with a query.

generate_queries.invoke("What are the benefits of QLoRA?")

# Output
['1. Can you list the advantages of using QLoRA?',
 '2. What positive outcomes can be expected from using QLoRA?',
 '3. In what ways is QLoRA beneficial?',
 '4. How can QLoRA be advantageous?',
 '5. What are the positive impacts of using QLoRA?']
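
Note that the generated questions keep their "1.", "2." numbering, which then becomes part of the text that is embedded for retrieval. A small post-processing step can strip it. The helper below, clean_queries, is a hypothetical addition and not part of the original chain; it uses only the standard library and assumes the model numbers or bullets its lines in one of a few common styles.

```python
import re
from typing import List


def clean_queries(raw: str) -> List[str]:
    """Split LLM output into one query per line, dropping blanks and leading numbering."""
    queries = []
    for line in raw.split("\n"):
        # Remove prefixes such as "1. ", "2) " or "- " that the model may add.
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            queries.append(cleaned)
    return queries


raw_output = (
    "\n1. Can you list the advantages of using QLoRA?"
    "\n2. In what ways is QLoRA beneficial?\n"
)
queries = clean_queries(raw_output)
print(queries)
```

Under these assumptions, the function could be piped in as the final step of the query-generation chain in place of the plain newline split.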

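Once we have the 5 questions, we retrieve the 5 most relevant documents for each of them in parallel (producing a list of lists), then build a new document list by taking the unique documents from the union of all retrieved documents. To do this, we create another chain, retrieval_chain, with LCEL.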

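def get_context_union(docs: List[List]):
    # Serialize each document so duplicates can be detected by string equality
    all_docs = [dumps(d) for doc in docs for d in doc]
    # dict.fromkeys removes duplicates while keeping retrieval order (set() would not)
    unique_docs = list(dict.fromkeys(all_docs))

    return [loads(doc).page_content for doc in unique_docs] # We only return page contents


# generate_queries already maps the raw input string to the question placeholder,
# so the chain can be fed the question directly
retrieval_chain = (
    generate_queries
    | retriever.map()
    | get_context_union
)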

retrieval_chain.invoke("What are the benefits of QLoRA?")

# Output
['trade-off exactly lies for QLoRA tuning, which we leave to future work to explore.\nWe proceed to investigate instruction tuning at scales that would be impossible to explore with full\n16-bit finetuning on academic research hardware.\n5 Pushing the Chatbot State-of-the-art with QLoRA\nHaving established that 4-bit QLORAmatches 16-bit performance across scales, tasks, and datasets\nwe conduct an in-depth study of instruction finetuning up to the largest open-source language models',
 'technology. QLORAcan be seen as an equalizing factor that helps to close the resource gap between\nlarge corporations and small tea
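
The union step treats every retrieved document equally, regardless of how highly or how often it was ranked. RAG Fusion, named earlier as the other query-rewriting variant, instead merges the per-query rankings into one ranked list. One common way to do this is reciprocal rank fusion (RRF). The sketch below is an assumption about how that could look, not the author's code: documents are represented by hypothetical string IDs, and k = 60 is the constant conventionally used with RRF.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: each document scores sum(1 / (k + rank)) over the lists."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order.
    return sorted(scores, key=scores.get, reverse=True)


# One ranked result list per generated query (hypothetical document IDs).
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)  # ['doc_b', 'doc_a', 'doc_c', 'doc_d']
```

Documents that rank well across several generated queries rise to the top, while a document retrieved once at a low rank falls to the bottom instead of counting the same as the rest.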
