How to Evaluate Large Language Models with UpTrain and LlamaIndex

Latest recommended article published on 2024-08-02 18:00:55

ppoojjj

Tags: language models, artificial intelligence, natural language processing, python

Article link: https://blog.csdn.net/ppoojjj/article/details/140654445

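Overview

In this article we look at how to use UpTrain and LlamaIndex to evaluate and improve the performance of generative AI applications. UpTrain is an open-source platform that evaluates generated responses, offers more than 20 preconfigured checks, helps with root-cause analysis, and gives insight into possible fixes. LlamaIndex lets us run retrieval-augmented generation (RAG) over our own data, which addresses the mismatch between what the model was trained on and the data the application actually needs.

Installing UpTrain and LlamaIndex

First, install UpTrain and LlamaIndex: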

%pip install -q uptrain llama-index

Note: after installing, you may need to restart the kernel so that the updated packages are picked up.
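
To confirm that the installation took effect after the restart, a quick version check can help. The sketch below uses only the standard library's importlib.metadata. The helper name installed_versions is made up for illustration, and the distribution names are assumed to be the same ones passed to pip above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names as passed to pip above.
for name, ver in installed_versions(["uptrain", "llama-index"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

A value of "not installed" right after a successful pip run usually means the kernel is still using the old environment.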

Importing the required libraries

import httpx
import os
import openai
import pandas as pd

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from uptrain import Evals, EvalLlamaIndex, Settings as UpTrainSettings

Creating the dataset folder

Any document will work for this step. For this tutorial we use text about New York City taken from Wikipedia.

url = "https://uptrain-assets.s3.ap-south-1.amazonaws.com/data/nyc_text.txt"
if not os.path.exists("nyc_wikipedia"):
    os.makedirs("nyc_wikipedia")
dataset_path = os.path.join("./nyc_wikipedia", "nyc_text.txt")

if not os.path.exists(dataset_path):
    r = httpx.get(url)
    r.raise_for_status()  # fail early instead of saving an error page as the dataset
    with open(dataset_path, "wb") as f:
        f.write(r.content)
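
Before indexing, it may be worth checking that the downloaded file is not empty, since an empty file would only surface later as poor retrieval. The following is a minimal sketch; describe_dataset is a hypothetical helper name, and the path is the one used above.

```python
import os


def describe_dataset(path):
    """Return (size_in_bytes, first_line) for a text file, raising if it is empty."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError(f"{path} is empty; the download may have failed")
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        first_line = f.readline().strip()
    return size, first_line


dataset_path = os.path.join("./nyc_wikipedia", "nyc_text.txt")
if os.path.exists(dataset_path):
    size, first_line = describe_dataset(dataset_path)
    print(f"{size} bytes; starts with: {first_line[:80]!r}")
```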

Creating the list of queries

Before generating responses, we need a list of queries related to New York City.

data = [
    {"question": "What is the population of New York City?"},
    {"question": "What is the area of New York City?"},
    {"question": "What is the largest borough in New York City?"},
    {"question": "What is the average temperature in New York City?"},
    {"question": "What is the main airport in New York City?"},
    {"question": "What is the famous landmark in New York City?"},
    {"question": "What is the official language of New York City?"},
    {"question": "What is the currency used in New York City?"},
    {"question": "What is the time zone of New York City?"},
    {"question": "What is the famous sports team in New York City?"},
]
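
If the questions come from somewhere else (a text file, a spreadsheet column), they need to be converted into the same list-of-dicts shape with a "question" key. A small sketch of that conversion follows; build_queries is an illustrative name, not part of UpTrain or LlamaIndex.

```python
def build_queries(questions):
    """Turn raw question strings into [{"question": ...}] rows, skipping blanks and duplicates."""
    rows = []
    seen = set()
    for q in questions:
        text = q.strip()
        if not text or text in seen:
            continue
        seen.add(text)
        rows.append({"question": text})
    return rows


sample = build_queries([
    "What is the population of New York City?",
    "  What is the area of New York City?  ",
    "",
    "What is the population of New York City?",
])
print(sample)
```

Dropping duplicates and blank entries up front avoids paying for evaluation calls that add no information.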

Creating the query engine

Next, we use LlamaIndex to build a vector store index and use it as a query engine for retrieving relevant documents.

openai.api_key = "sk-************************"  # your OpenAI API key

# Build the vector index
documents = SimpleDirectoryReader("./nyc_wikipedia/").load_data()
vector_index = VectorStoreIndex.from_documents(documents)
query_engine = vector_index.as_query_engine()
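
Hard-coding the key as shown above is convenient for a demo but easy to leak when a notebook is shared. One common alternative is to read it from the OPENAI_API_KEY environment variable. The sketch below assumes that variable has been set outside the notebook; load_api_key is a hypothetical helper.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in `env`, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running")
    return key


# Usage (commented out so the sketch runs without a key configured):
# openai.api_key = load_api_key()
```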

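Running the evaluation

We use an EvalLlamaIndex object to generate responses and evaluate them. We pick the two checks that are most relevant here:

Context relevance: checks whether the retrieved context is relevant to the query.
Response conciseness: checks whether the response is concise and free of unnecessary information.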

settings = UpTrainSettings(openai_api_key=openai.api_key)

llamaindex_object = EvalLlamaIndex(settings=settings, query_engine=query_engine)

results = llamaindex_object.evaluate(data=data, checks=[Evals.CONTEXT_RELEVANCE, Evals.RESPONSE_CONCISENESS])
pd.DataFrame(results)
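
To compare runs, it can help to reduce the per-question table to one average per check. The sketch below assumes that results is a list of dicts and that score columns are named with a "score_" prefix (for example score_context_relevance); that naming is an assumption worth confirming against your own output. The sample rows are made up purely to show the shape.

```python
from statistics import mean


def average_scores(rows, prefix="score_"):
    """Average every numeric column whose name starts with `prefix`, ignoring missing values."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            if key.startswith(prefix) and isinstance(value, (int, float)):
                columns.setdefault(key, []).append(value)
    return {key: mean(values) for key, values in columns.items()}


# Made-up rows for illustration only; real rows come from llamaindex_object.evaluate(...).
sample_rows = [
    {"question": "q1", "score_context_relevance": 1.0, "score_response_conciseness": 0.5},
    {"question": "q2", "score_context_relevance": 0.0, "score_response_conciseness": None},
]
print(average_scores(sample_rows))
```

With real output, the call would be average_scores(results).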

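Errors you may encounter

Invalid API key: if your OpenAI API key is invalid, you will get an authentication error. Make sure you enter the correct key, and keep it private.
Failed network request: if the dataset download fails, check your network connection and whether the URL is still valid.
Package not found: if you get an import error when running the code, make sure the required libraries are installed correctly.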
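
The three situations above can be checked up front, before any paid API call is made. The following is a rough sketch: preflight is a hypothetical helper, and the placeholder test (an asterisk in the key) simply mirrors the masked key shown earlier. It cannot tell whether a key is actually valid; only a real request can.

```python
import os
from importlib.util import find_spec


def preflight(api_key, dataset_path, modules):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not api_key or "*" in api_key:
        problems.append("OpenAI API key is missing or still a placeholder")
    if not os.path.isfile(dataset_path):
        problems.append(f"dataset file not found: {dataset_path}")
    for name in modules:
        if find_spec(name) is None:
            problems.append(f"package not importable: {name}")
    return problems


issues = preflight(
    api_key=os.environ.get("OPENAI_API_KEY", ""),
    dataset_path=os.path.join("./nyc_wikipedia", "nyc_text.txt"),
    modules=["httpx", "openai", "pandas", "llama_index", "uptrain"],
)
print("\n".join(issues) or "all checks passed")
```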

If you found this article helpful, please like it and follow my blog. Thanks!
