Evaluating RAG Pipeline Performance with LabelledRagDataset

Latest recommended article published on 2024-10-12 12:26:23

qq_37836323

Tags: python

Article link: https://blog.csdn.net/qq_29929123/article/details/139798628

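In this article we show how to use LabelledRagDataset to evaluate an arbitrary RAG (Retrieval-Augmented Generation) pipeline. A LabelledRagDataset is a dataset of labelled examples for measuring RAG pipeline performance, and the pipeline under evaluation can be built in many configurations (for example, the choice of LLM, the similarity_top_k value, or the chunk_size). We show how to build a LabelledRagDataset from scratch and provide demo code.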
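Because the same dataset can be used to compare several pipeline configurations, it can help to enumerate those configurations explicitly before building anything. The sketch below is only an illustration: `PipelineConfig` is a hypothetical container, not a LlamaIndex class, and the model names and numeric values are placeholder assumptions.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical container for the knobs named above; not a LlamaIndex class."""
    llm: str
    similarity_top_k: int
    chunk_size: int


# Illustrative values only; choose the ones relevant to your own pipeline.
llms = ["gpt-3.5-turbo", "gpt-4"]
top_ks = [2, 4]
chunk_sizes = [512, 1024]

# One configuration per combination of LLM, top-k and chunk size.
configs = [
    PipelineConfig(llm=m, similarity_top_k=k, chunk_size=c)
    for m, k, c in product(llms, top_ks, chunk_sizes)
]

for cfg in configs:
    print(cfg)
```

Each configuration would then be turned into a pipeline and scored against the same labelled dataset, so that the results are directly comparable.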

The LabelledRagDataExample class

We start with the LabelledRagDataExample class, which is used to build each individual example in a LabelledRagDataset.

Install the required libraries:

%pip install llama-index-llms-openai
%pip install llama-index-readers-wikipedia

Create a LabelledRagDataExample:

from llama_index.core.llama_dataset import (
    LabelledRagDataExample,
    CreatedByType,
    CreatedBy,
)

# Build a LabelledRagDataExample
query = "This is a test query, is it not?"
query_by = CreatedBy(type=CreatedByType.AI, model_name="gpt-4")
reference_answer = "Yes it is."
reference_answer_by = CreatedBy(type=CreatedByType.HUMAN)
reference_contexts = ["This is a sample context"]

rag_example = LabelledRagDataExample(
    query=query,
    query_by=query_by,
    reference_contexts=reference_contexts,
    reference_answer=reference_answer,
    reference_answer_by=reference_answer_by,
)

Serialize the example to JSON and parse it back:

print(rag_example.json())

# Output:
# {"query": "This is a test query, is it not?", "query_by": {"model_name": "gpt-4", "type": "ai"}, "reference_contexts": ["This is a sample context"], "reference_answer": "Yes it is.", "reference_answer_by": {"model_name": "", "type": "human"}}

LabelledRagDataExample.parse_raw(rag_example.json())
# Output:
# LabelledRagDataExample(query='This is a test query, is it not?', query_by=CreatedBy(model_name='gpt-4', type=<CreatedByType.AI: 'ai'>), reference_contexts=['This is a sample context'], reference_answer='Yes it is.', reference_answer_by=CreatedBy(model_name='', type=<CreatedByType.HUMAN: 'human'>))
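Since the serialized form is plain JSON, it can also be inspected without LlamaIndex installed. The following sketch copies the JSON string from the output shown in this article and reads it with the standard library; it assumes your installed version produces the same fields, which may not hold for every release.

```python
import json

# The JSON string from the output of rag_example.json() shown in this article.
raw = (
    '{"query": "This is a test query, is it not?", '
    '"query_by": {"model_name": "gpt-4", "type": "ai"}, '
    '"reference_contexts": ["This is a sample context"], '
    '"reference_answer": "Yes it is.", '
    '"reference_answer_by": {"model_name": "", "type": "human"}}'
)

record = json.loads(raw)

# Top-level keys mirror the constructor arguments of LabelledRagDataExample.
print(sorted(record))

# The creator type is stored as a plain lowercase string.
print(record["query_by"]["type"], record["reference_answer_by"]["type"])
```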

Building a LabelledRagDataset

Next, we create a LabelledRagDataset instance and show how to save it to a JSON file:

from llama_index.core.llama_dataset import LabelledRagDataset

rag_example_2 = LabelledRagDataExample(
    query="This is a test query, is it so?",
    query_by=query_by,
    reference_contexts=["This is a second sample context"],
    reference_answer="I think yes, it is.",
    reference_answer_by=reference_answer_by,
)

rag_dataset = LabelledRagDataset(examples=[rag_example, rag_example_2])

# Save to a JSON file
rag_dataset.save_json("rag_dataset.json")

# Load from the JSON file
reload_rag_dataset = LabelledRagDataset.from_json("rag_dataset.json")

# Convert to a pandas DataFrame for inspection
print(reload_rag_dataset.to_pandas())

Building a synthetic LabelledRagDataset

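We can also use an LLM such as GPT-4 to generate a synthetic LabelledRagDataset. The following example generates a synthetic dataset from Wikipedia (note that the generation code in this article uses gpt-3.5-turbo):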

Install the required libraries and load the Wikipedia data:

# Allow nested event loops so that async calls work inside a notebook
import nest_asyncio
nest_asyncio.apply()
!pip install wikipedia -q

from llama_index.readers.wikipedia import WikipediaReader
from llama_index.core import VectorStoreIndex

cities = ["San Francisco"]

documents = WikipediaReader().load_data(
    pages=[f"History of {x}" for x in cities]
)
index = VectorStoreIndex.from_documents(documents)

Use RagDatasetGenerator to generate questions and answers:

from llama_index.core.llama_dataset.generator import RagDatasetGenerator
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.3)

dataset_generator = RagDatasetGenerator.from_documents(
    documents,
    llm=llm,
    num_questions_per_chunk=2,
    show_progress=True,
)

rag_dataset = dataset_generator.generate_dataset_from_nodes()

print(rag_dataset.to_pandas())
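Once a labelled dataset exists, evaluating a pipeline amounts to running every query through it and comparing each answer with the reference answer. The sketch below illustrates that loop with the standard library only. It is not LlamaIndex's evaluator: `Example`, `token_f1`, `evaluate` and `dummy_pipeline` are hypothetical names introduced here, and token-overlap F1 is a deliberately simple stand-in for the LLM-based metrics a real evaluation would normally use.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """Stand-in for a labelled example: a query plus its reference answer."""
    query: str
    reference_answer: str


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred, ref = tokens(prediction), tokens(reference)
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(pipeline: Callable[[str], str], examples: List[Example]) -> float:
    """Mean token F1 of the pipeline's answers over all examples."""
    scores = [token_f1(pipeline(ex.query), ex.reference_answer) for ex in examples]
    return sum(scores) / len(scores) if scores else 0.0


# The two hand-written examples from earlier in this article.
examples = [
    Example("This is a test query, is it not?", "Yes it is."),
    Example("This is a test query, is it so?", "I think yes, it is."),
]


def dummy_pipeline(query: str) -> str:
    # Placeholder for a real RAG pipeline; always returns the same answer.
    return "Yes it is."


print(round(evaluate(dummy_pipeline, examples), 3))
```

In practice `dummy_pipeline` would be replaced by a call into the query engine under test, and the same loop would be repeated for each pipeline configuration being compared.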