How to Handle Complex Queries with SubQuestionQueryEngine

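In this post, we show how to use SubQuestionQueryEngine to answer a complex query: the engine breaks the query into several sub-questions, answers each one against the relevant data source, and then synthesizes the partial answers into a final response.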
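
To make the decompose-then-synthesize idea concrete, here is a minimal sketch in plain Python. It is not how LlamaIndex implements the engine: `answer_complex_query`, `toy_decompose`, `toy_synthesize`, and `toy_tools` are hypothetical names used only for illustration, and the real engine relies on an LLM for both the decomposition and the synthesis step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubQuestion:
    """One sub-question plus the name of the tool that should answer it."""
    sub_question: str
    tool_name: str


def answer_complex_query(
    query: str,
    decompose: Callable[[str], List[SubQuestion]],
    tools: Dict[str, Callable[[str], str]],
    synthesize: Callable[[str, List[str]], str],
) -> str:
    """Split the query, answer each part with its tool, then combine the answers."""
    sub_questions = decompose(query)
    answers = [tools[sq.tool_name](sq.sub_question) for sq in sub_questions]
    return synthesize(query, answers)


# Hypothetical stand-ins: an LLM would normally do both of these jobs.
def toy_decompose(query: str) -> List[SubQuestion]:
    return [
        SubQuestion(f"{query} ({phase})", "pg_essay")
        for phase in ("before", "during", "after")
    ]


def toy_synthesize(query: str, answers: List[str]) -> str:
    return " | ".join(answers)


# One "data source", keyed by the same tool name used later in this post.
toy_tools = {"pg_essay": lambda q: f"answer to: {q}"}

result = answer_complex_query(
    "What did he work on?", toy_decompose, toy_tools, toy_synthesize
)
print(result)
```

With several entries in `toy_tools`, each sub-question could be routed to a different source, which is the situation the engine is designed for.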

Setup

First, if you are opening this notebook in Colab, you will probably need to install LlamaIndex.

!pip install llama-index
import os
os.environ["OPENAI_API_KEY"] = "sk-..."  # Replace with your own OpenAI API key
import nest_asyncio
nest_asyncio.apply()
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
from llama_index.core import Settings

# Use LlamaDebugHandler to print a trace of the captured sub-questions
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])
Settings.callback_manager = callback_manager

Download the data

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
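
The `!mkdir` and `!wget` lines above only work in a notebook with shell access. Outside a notebook, a rough standard-library equivalent could look like the sketch below; `download_if_missing` is a hypothetical helper name, not part of LlamaIndex, and it skips the download when a non-empty file is already present.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download_if_missing(url: str, dest: str) -> Path:
    """Download url to dest unless a non-empty file already exists there."""
    target = Path(dest)
    if target.exists() and target.stat().st_size > 0:
        return target  # Already downloaded; do nothing.
    # Create the parent directories, like `mkdir -p` does.
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(target))
    return target
```

It would be called with the same URL and destination path as the `wget` command, for example `download_if_missing(url, "data/paul_graham/paul_graham_essay.txt")`.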

Load the data

# Load the essay from the data directory
pg_essay = SimpleDirectoryReader(input_dir="./data/paul_graham/").load_data()

# Build the index and the query engine
vector_query_engine = VectorStoreIndex.from_documents(
    pg_essay,
    use_async=True,
).as_query_engine()

Set up SubQuestionQueryEngine

# Wrap the base query engine as a tool
query_engine_tools = [
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name="pg_essay",
            description="Paul Graham essay on What I Worked On",
        ),
    ),
]

query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    use_async=True,
)

Run the query

response = query_engine.query(
    "How was Paul Grahams life different before, during, and after YC?"
)
print(response)

Capture the sub-questions

# Iterate over the sub-question items captured in SUB_QUESTION events
from llama_index.core.callbacks import CBEventType, EventPayload

for i, (start_event, end_event) in enumerate(
    llama_debug.get_event_pairs(CBEventType.SUB_QUESTION)
):
    qa_pair = end_event.payload[EventPayload.SUB_QUESTION]
    print("Sub Question " + str(i) + ": " + qa_pair.sub_q.sub_question.strip())
    print("Answer: " + qa_pair.answer.strip())
    print("====================================")

Sample output

Sub Question 0: What did Paul Graham work on before YC?
Answer: Paul Graham worked on writing essays and working on YC before YC.
====================================
Sub Question 1: What did Paul Graham work on during YC?
Answer: During his time at YC, Paul Graham worked on various projects. He wrote all of YC's internal software in Arc and also worked on Hacker News (HN), which was a news aggregator initially meant for startup founders but later changed to engage intellectual curiosity. Additionally, he wrote essays and worked on helping the startups in the YC program with their problems.
====================================
Sub Question 2: What did Paul Graham work on after YC?
Answer: After YC, Paul Graham worked on starting his own investment firm with Jessica.
====================================

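Possible errors

  1. Invalid API key: make sure your API key is correct and has not expired.
  2. Network problems: downloading the data or calling the API can fail on an unstable connection, so make sure your network is stable.
  3. Wrong data file path: make sure the path to the data file is correct and that the file was downloaded successfully.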
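
Two of the problems listed above can be checked locally before running any query. The sketch below is one possible way to do that; `preflight_problems` is a hypothetical helper, it assumes the data directory used earlier in this post, and it cannot tell whether a key has expired or whether the network is reachable.

```python
import os
from pathlib import Path
from typing import List


def preflight_problems(data_dir: str = "./data/paul_graham/") -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Check 1: the key must be set and must not be the "sk-..." placeholder.
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key or key == "sk-...":
        problems.append("OPENAI_API_KEY is missing or still the placeholder")

    # Check 2: the data directory must contain at least one non-empty file.
    directory = Path(data_dir)
    files = [p for p in directory.glob("*") if p.is_file()] if directory.is_dir() else []
    if not files:
        problems.append(f"no data files found in {data_dir}")
    elif any(p.stat().st_size == 0 for p in files):
        problems.append(f"an empty file in {data_dir} suggests a failed download")

    return problems
```

Printing the returned list before building the index gives a quicker hint than a stack trace from deep inside the library.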

If you found this post helpful, please give it a like and follow my blog. Thanks!
