How to Build Useful RAGs with Query Routing

数云界

Article link: https://blog.csdn.net/2401_85233349/article/details/141276794

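Welcome to 雲闪世界. An LLM can handle general routing. Semantic search handles private data better. Which one would you choose?

A single prompt can't handle everything, and a single data source may not suit all of your data.

Here is something you often see in production but rarely in demos:

You need more than one data source to retrieve information: multiple vector stores, graph databases, even SQL databases. You also need different prompts to handle different tasks.

If so, we have a problem. Given that user input is unstructured, often ambiguous, and poorly formatted, how do we decide which database to retrieve data from?

If for some reason you still think this is too simple, here is an example.

Suppose you have a tour-guide chatbot, and a traveler asks for the best travel plan between five places. Letting the LLM answer directly may produce hallucinations, because LLMs are not good at location-based calculations.

Instead, if you store this information in a graph database, the LLM can generate a query that fetches the shortest travel path between the points. Executing this query gives the LLM the correct information, which it can then present with helpful commentary.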
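
As a rough illustration of the kind of calculation that is better delegated to code (or a graph query) than to the LLM itself, here is a minimal sketch that brute-forces the cheapest visiting order for five places. The place names and travel times are made up, and a real system would run an equivalent query against the graph database instead.

```python
from itertools import permutations
from typing import List, Tuple

# Hypothetical symmetric travel times (in minutes) between five made-up places.
TRAVEL_MINUTES = {
    ("Museum", "Harbor"): 10,
    ("Museum", "Castle"): 25,
    ("Museum", "Market"): 15,
    ("Museum", "Park"): 30,
    ("Harbor", "Castle"): 20,
    ("Harbor", "Market"): 12,
    ("Harbor", "Park"): 35,
    ("Castle", "Market"): 18,
    ("Castle", "Park"): 14,
    ("Market", "Park"): 22,
}


def travel_time(a: str, b: str) -> int:
    """Look up the travel time between two places in either direction."""
    if (a, b) in TRAVEL_MINUTES:
        return TRAVEL_MINUTES[(a, b)]
    return TRAVEL_MINUTES[(b, a)]


def best_plan(start: str, places: List[str]) -> Tuple[List[str], int]:
    """Try every visiting order that begins at `start` and keep the cheapest."""
    others = [p for p in places if p != start]
    best_route: List[str] = []
    best_cost = float("inf")
    for order in permutations(others):
        route = [start, *order]
        cost = sum(travel_time(a, b) for a, b in zip(route, route[1:]))
        if cost < best_cost:
            best_route, best_cost = route, cost
    return best_route, int(best_cost)


places = ["Museum", "Harbor", "Castle", "Market", "Park"]
route, minutes = best_plan("Museum", places)
print(route, minutes)
# ['Museum', 'Harbor', 'Market', 'Castle', 'Park'] 54
```

Brute force is only workable for a handful of places; the point is simply that the answer comes from a deterministic computation, and the LLM's job is reduced to explaining it.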

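That example is a complex one, but production applications may also simply need multiple vector stores. For example, your application may be a multimodal RAG. You may handle different data types (text, images, audio) and use a different vector store for each.

I hope I have convinced you that multiple data sources and routing are essential. This article discusses two basic techniques commonly used to route queries.

In real applications, query routing is often combined with query transformation techniques such as query decomposition.

A starting example

Before we go further, let's set up a fictional example. Suppose you have built a chatbot that answers employees' administrative questions, for example about their pay or their performance.

If a query concerns employee benefits, performance evaluations, leave policies, or any topic directly related to human resources, we need to route it to the HR vector store. If, on the other hand, the query concerns salary, payroll details, expense reimbursements, or other financial matters, it should be directed to the accounts vector store.

Here is the setup we will use to test the rest of our work.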

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

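def create_retriever_from_file(file_name, collection_name):
    # Load the file and split it into small, slightly overlapping chunks
    data = TextLoader(file_name).load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
    splits = text_splitter.split_documents(data)
    # Give each source its own collection; without a name, both in-memory
    # stores may end up sharing Chroma's default collection.
    vectorstore = Chroma.from_documents(
        splits, embedding=OpenAIEmbeddings(), collection_name=collection_name
    )
    return vectorstore.as_retriever()

hr_retriever = create_retriever_from_file("HR_Docs.txt", "hr_docs")
accounts_retriever = create_retriever_from_file("Accounts_Docs.txt", "accounts_docs")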

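In the code above, we create two vector stores, one for HR and one for accounts (finance). Since we don't use the vector stores directly but as retrievers, the function returns each vector store as a retriever object. I have used text files in this example, but in a real application this could just as well be a data pipeline.

The layman's approach to query routing

A simple routing approach is keyword filtering. You could also use a pre-trained SVM to predict the right vector store for a query. But you get the idea: we look for words that, based on what we already know, tell us where the query should go.

Here is the code implementation.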

import re

# Define keywords for HR and accounts queries.
# All keywords are lowercase because the query is lowercased before matching.
HR_KEYWORDS = [
    "benefits",
    "performance",
    "evaluations",
    "leave",
    "policies",
    "human resources",
    "hr",
]
ACCOUNTS_KEYWORDS = [
    "salary",
    "payroll",
    "expense",
    "reimbursements",
    "finance",
    "financial",
    "pay",
]


def contains_keyword(text: str, keywords: list) -> bool:
    # Match whole words only, so that "hr" does not match inside "three"
    return any(re.search(rf"\b{re.escape(keyword)}\b", text) for keyword in keywords)


# Function to route a query; returns a retriever, or None if nothing matches
def route_query(query: str):

    # Convert query to lowercase for case-insensitive matching
    query_lower = query.lower()

    # Check if any HR keywords are in the query
    if contains_keyword(query_lower, HR_KEYWORDS):
        return hr_retriever

    # Check if any accounts keywords are in the query
    elif contains_keyword(query_lower, ACCOUNTS_KEYWORDS):
        return accounts_retriever

    # If no keywords are matched, signal that the category is unknown
    else:
        return None

# Example queries
queries = [
    "What are the leave policies?",
    "How do I apply for performance evaluations?",
    "Can I get a breakdown of my salary?",
    "Where do I submit expense reimbursements?",
    "Tell me about the HR benefits available.",
]

# Route each query and retrieve the response
for query in queries:
    retriever = route_query(query)
    if retriever is None:
        response = "Unknown category, please refine your query."
    else:
        response = retriever.invoke(query)[0].page_content

    print(f"Query: {query}" + "\n" + f"Response: {response}" + "\n")

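The code above gets the job done, but it falls short in several ways.

First, it relies on keyword matching. What if users describe their concerns in different words? Second, if we use an ML model to predict the route instead, we need a sufficiently large training dataset.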
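
To make the first point concrete, here is a small self-contained sketch of the same keyword idea. It returns a label instead of a retriever so that it runs without any vector store; the shortened keyword lists, the `route_label` helper, and the paraphrased query are illustrative assumptions.

```python
import re

# Shortened keyword lists, for illustration only.
HR_KEYWORDS = {"benefits", "performance", "evaluations", "leave", "policies", "hr"}
ACCOUNTS_KEYWORDS = {"salary", "payroll", "expense", "reimbursements", "pay"}


def route_label(query: str) -> str:
    """Return "hr", "accounts", or "unknown" based on whole-word keyword hits."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    if words & HR_KEYWORDS:
        return "hr"
    if words & ACCOUNTS_KEYWORDS:
        return "accounts"
    return "unknown"


# A query that uses the expected vocabulary is routed correctly...
print(route_label("What are the leave policies?"))           # hr
# ...but a paraphrase of the same concern matches no keyword at all.
print(route_label("How many days off do I get each year?"))  # unknown
```

Both questions are about leave, yet only the one that happens to contain a listed word is routed. Extending the keyword lists helps only until the next unexpected phrasing.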

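That is why we turn to more advanced techniques, such as LLM-based routing and semantic similarity search, which we discuss in this article.

Let the LLM decide the route

By replacing the keyword search or the ML model with an LLM, we gain a big advantage over the approaches above.

An LLM's general knowledge is usually enough to direct a query to the right retriever. It should handle differently worded queries, typos, and ambiguity well.

Here is a diagram that summarizes the idea:

Figure: Using an LLM to logically route a query so that information is retrieved from the right data store

One best practice to consider here is structured output. Structured output gives us unambiguous answers and tells the LLM what its options are.

Let's look at a code implementation.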

from pydantic import BaseModel, Field
from typing import Optional, Literal

from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough



# Section 1: Setup LLM and Configure Structured Output
class DataSource(BaseModel):
    datasource: Optional[Literal["hr", "accounts"]] = Field(
        title="Organization data source",
        description="Our organization bot has two data sources: HR and accounts",
    )

llm = ChatOpenAI()
structured_routing_llm = llm.with_structured_output(DataSource)



# Section 2: Routing Prompt Template
routing_prompt_template = ChatPromptTemplate.from_template("""
    You are good at routing questions to either accounts or HR departments.
    Which is the best department to answer the following question?
    If you can't determine the best department, respond with "I don't know".
    question: {question}
    department:
""")
routing_chain = routing_prompt_template | structured_routing_llm



# Section 3: Define Retriever Based on the Routed Department
def get_retriever(datasource):
    # Takes the routed label ("hr", "accounts", or None) rather than the
    # question, so the router is invoked only once per question.
    hr_prompt_template = ChatPromptTemplate.from_template("""
        You are a human resources professional at Fictional, Inc.
        Respond to the following employee question in the context provided.
        If you can't answer the question with the given context, please say so.
        context: {context}
        question: {question}
    """)

    accounts_prompt_template = ChatPromptTemplate.from_template("""
        You are an accounts professional at Fictional, Inc.
        Respond to the following employee question in the context provided.
        If you can't answer the question with the given context, please say so.
        context: {context}
        question: {question}
    """)

    # Anything other than "hr" (including None) falls through to accounts
    if datasource == "hr":
        print("HR")
        return hr_retriever, hr_prompt_template
    else:
        print("Accounts")
        return accounts_retriever, accounts_prompt_template

# Section 4: Answer the Question Using the Appropriate Chain
def answer_the_question(question: str) -> str:
    # Route the question, then pass only the chosen data source on
    routing_output = routing_chain.invoke({"question": question})
    retriever, prompt_template = get_retriever(routing_output.datasource)

    chain = (
        {"question": RunnablePassthrough(), "context": retriever}
        | prompt_template
        | llm
        | StrOutputParser()
    )

    return chain.invoke(question)

# Example usage
answer_the_question("How do I change my salary deposit information?")


>> 'Accounts'
>> 'You can change your salary deposit information by logging into the accounts portal and navigating to the payroll section. From there, you can enter your new bank account information and save the changes. Your salary will then be deposited into the new account each pay period.'

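The code above has four sections and a usage example. The first section defines a Pydantic object that tells the LLM the required output structure. This time, the output will be a DataSource object rather than a regular response.

The second section is where we define the router. In the prompt, we ask the model to say "I don't know" so that it doesn't try to answer just any random question.

The third and fourth sections pick the right retriever object, fetch the relevant documents, and answer the user's question in the retrieved context.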
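
Note that the code above sends anything that is not "hr" to accounts, including the case where the router returns no data source. One possible way to surface the undecided case instead is sketched below. The `RETRIEVERS` mapping, the helper names, and the clarification message are hypothetical and stand in for the real retriever objects.

```python
from typing import Optional

# Hypothetical mapping from the router's label to a retriever name.
# In the real chain, the values would be the retriever objects themselves.
RETRIEVERS = {"hr": "hr_retriever", "accounts": "accounts_retriever"}


def resolve_datasource(datasource: Optional[str]) -> Optional[str]:
    """Return the retriever for a routed label, or None if the router could not decide."""
    if datasource is None:
        return None
    return RETRIEVERS.get(datasource)


def answer_or_clarify(datasource: Optional[str]) -> str:
    """Either continue to retrieval or ask the user to clarify."""
    target = resolve_datasource(datasource)
    if target is None:
        return "I'm not sure which department can help. Could you rephrase your question?"
    return f"Routing to {target}"


print(answer_or_clarify("hr"))   # Routing to hr_retriever
print(answer_or_clarify(None))   # asks the user to rephrase
```

Whether to ask for clarification or to fall back to a default department is a product decision; the sketch only shows where that decision would live.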

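Drawbacks of LLM-based logical query routing

LLM-based logical routing is dependable when the user's question is ambiguous. However, we also need to address its drawbacks.

The biggest problem with using an LLM for routing is that the LLM's prior knowledge may not help in niche use cases. Most publicly available LLMs are trained on general knowledge. They may not know organization-specific acronyms, proprietary software, and so on.

In addition, an LLM's output can be inconsistent. Although LLMs route ambiguous queries more effectively, they sometimes get confused too. An LLM may also route the same query to different sources on different runs, which calls its reliability into question.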
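
One simple way to observe this inconsistency is to call the router several times with the same question and see how often the answers agree. The sketch below assumes the router can be wrapped as a function from a question to a label; `flaky_router` is a made-up stand-in for a real LLM call, not part of any library.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple


def routing_agreement(
    route_fn: Callable[[str], str], question: str, runs: int = 5
) -> Tuple[str, float]:
    """Call the router `runs` times; return the most common route and its share of the runs."""
    votes = Counter(route_fn(question) for _ in range(runs))
    route, count = votes.most_common(1)[0]
    return route, count / runs


# Stand-in for an LLM router that occasionally changes its mind (illustration only).
_fake_answers = cycle(["accounts", "accounts", "hr", "accounts", "accounts"])


def flaky_router(question: str) -> str:
    return next(_fake_answers)


print(routing_agreement(flaky_router, "Who approves my travel budget?"))
# ('accounts', 0.8)
```

An agreement share well below 1.0 for a given question is a hint that the routing prompt or the data source descriptions need to be more specific.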

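Semantic query routing

This approach is quite simple. We have one passage that represents each data source. Using a distance-based method, we compare the user's input against the passages and pick the data source whose passage is most similar.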
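
The comparison itself can be sketched in a few lines. The three-dimensional vectors below are made up purely for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings" of the passage that represents each data source.
SOURCE_VECTORS: Dict[str, List[float]] = {
    "hr": [0.9, 0.1, 0.2],
    "accounts": [0.1, 0.9, 0.3],
}


def most_similar_source(query_vector: List[float]) -> str:
    """Pick the data source whose passage vector is closest to the query vector."""
    return max(SOURCE_VECTORS, key=lambda name: cosine(query_vector, SOURCE_VECTORS[name]))


print(most_similar_source([0.2, 0.8, 0.1]))  # accounts
```

The routing decision is just an argmax over similarity scores, which is why the quality of the representative passages matters so much.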

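As you may have guessed, each passage must accurately represent its data source for this to work. We often use the prompts themselves as the passages to compare against.

However, the most interesting thing about semantic routing is that we can use organization-specific terms in the prompts. That makes semantic routing a good fit for private chatbots.

For example, suppose you have proprietary software called "MySecret" that lets employees raise their concerns privately. An LLM-based approach would not know what that name means, but semantic similarity can route it correctly.

Here is an example workflow:

Figure: Semantic query routing workflow

As the figure shows, the similarity-based prompt picker compares the question with the prompts and selects the prompt closest to the question. Based on the selected prompt, the vector store to retrieve from is chosen.

Here is the complete code implementation of semantic query routing for the same scenario.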

# Section 1: Define the prompts for each data source
from langchain_community.utils.math import cosine_similarity
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

hr_template = """You're a human resources professional at Fictional, Inc.
Use the context below to answer the question that follows. 
If you need more information, ask for it.
If you don't have enough information in the context to answer the question, say so.
context: {context}
question: {query}
Answer:
"""

accounts_template = """You're an accounts manager at Fictional, Inc.
Use the context below to answer the question that follows. 
If you need more information, ask for it.
If you don't have enough information in the context to answer the question, say so.
context: {context}
question: {query}
Answer:
"""


# Section 2: Embed the prompts once, up front
openai_embeddings = OpenAIEmbeddings()
prompt_templates = [hr_template, accounts_template]
prompt_embeddings = openai_embeddings.embed_documents(prompt_templates)


# Section 3: Create the similarity-based prompt picker
def find_most_similar_prompt(input):
    # Embed the question
    query_embedding = openai_embeddings.embed_query(input["query"])

    # Pick the most similar prompt
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    best_match = prompt_templates[similarity.argmax()]

    print(
        "Directing to the Accounts Department"
        if best_match == accounts_template
        else "Directing to the HR Department"
    )

    # Also pick the retriever
    retriever = accounts_retriever if best_match == accounts_template else hr_retriever

    # Fetch the documents first: a retriever object passed as a partial
    # variable is not run, so the prompt would not contain the retrieved text.
    docs = retriever.invoke(input["query"])
    context = "\n\n".join(doc.page_content for doc in docs)

    # Create the prompt template with the chosen prompt and retrieved context
    prompt_template = PromptTemplate.from_template(
        best_match, partial_variables={"context": context}
    )

    return prompt_template


# Section 4: Define the full RAG chain
chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(find_most_similar_prompt)
    | ChatOpenAI()
    | StrOutputParser()
)


# Execute the chain
print(
    chain.invoke(
        """
        I need more budget to buy the software we need. 
        What should I do?
        """
    )
)




>> Directing to the Accounts Department
>> As an accounts manager at Fictional, Inc., you should create a budget proposal outlining the software needed, its cost, and the potential benefits to the company. Present this proposal to the appropriate department or upper management for their review and approval. Additionally, you can also explore cost-saving options or negotiate with the software provider for a better deal.

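In the code above, we define a function that performs the routing. We keep a separate copy of the embedded prompts. When a new user input arrives, we embed it too and compute the cosine similarity between it and each prompt embedding. The most similar prompt and its retriever are then used to build the prompt template.

Drawbacks of semantic routing

The main drawback of semantic routing is that the maximum token size limits us. For OpenAI's embedding models, this is 8192 tokens. That is not an issue for smaller tasks, but large organizations may have many private acronyms. So if we have more private applications like the "MySecret" example we discussed, describing them will take up more tokens in the prompt.

Beyond the token limit, larger prompts have another problem. Because we compute the similarity between the user input and the prompt, the similarity score may be inaccurate if the prompt is too large.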
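
The dilution effect can be illustrated with a toy model. The sketch below uses simple word-count vectors as a stand-in for real embeddings, which is a deliberate simplification: embedding models behave differently in detail, and the example texts are made up. The point is only that adding unrelated material to a prompt lowers its similarity to a query it should still match.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy 'embedding': count how often each lowercase word occurs."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


query = "how do I claim an expense reimbursement"
short_prompt = "accounts team handles salary payroll expense reimbursement"
# The same prompt, padded with many additional topics.
long_prompt = (
    short_prompt
    + " also covers vendor invoices tax filings audits budgets"
    + " procurement travel cards petty cash"
)

short_score = cosine(bag_of_words(query), bag_of_words(short_prompt))
long_score = cosine(bag_of_words(query), bag_of_words(long_prompt))
print(round(short_score, 2), round(long_score, 2))
# 0.29 0.17
```

Both prompts share the same two words with the query, but the longer prompt scores lower because the extra words increase its vector length without adding any overlap.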

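In addition, semantic routing's ability to route complex private queries is questionable. A query related to the MySecret app should normally go to HR, because the app is a channel for listening to employee concerns. But if someone asks why the MySecret app loads slowly, the query should go to the IT team. A semantic similarity approach may fail to route such queries correctly.

Final thoughts

A single prompt can't handle everything, and a single data source may not suit all of your data.

RAG applications often need different vector stores and prompts. They also need a router that sends each query to the right vector store.

Logical and semantic routing are two commonly used approaches. We discussed both with code examples, along with the drawbacks of each.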

Thanks for following 雲闪世界. (AWS solutions architect vs. developer & GCP solutions architect vs. developer)
