[Translation] Evaluation Metrics for Multi-Label Classification

surrender2u

Category: NLP. Tags: natural language processing

Original link: https://medium.com/analytics-vidhya/metrics-for-multi-label-classification-49cc5aeba1c3

Translation date: 2020-05-15
Translated from:
Lohithmunakala, Aug 28, 2020
Metrics for Multi-Label Classification
Original URL: https://medium.com/analytics-vidhya/metrics-for-multi-label-classification-49cc5aeba1c3

Abridged text:
The most common metrics for multi-label classification are:

Precision at k
Avg precision at k
Mean avg precision at k
Sampled F1 Score

Let us look at each of these metrics in detail.

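`Precision at k (P@k)`:

Given a list of actual classes and a list of predicted classes, precision at k is the number of correct predictions among the top k predicted elements, divided by k. Its value ranges from 0 to 1.
Here is an example that expresses this in code: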

def patk(actual, pred, k):
    # Return 0 when k is 0 because we cannot divide
    # the number of common values by 0.
    if k == 0:
        return 0

    # Take only the top k predictions.
    k_pred = pred[:k]

    # Take the set of the actual values.
    actual_set = set(actual)

    # Take the set of the predicted values.
    pred_set = set(k_pred)

    # Intersect the actual set and the predicted set
    # to find the common values.
    common_values = actual_set.intersection(pred_set)

    # Divide by the number of predictions considered
    # (equal to k when pred has at least k elements).
    return len(common_values) / len(k_pred)

# Define the actual and the predicted classes.
y_true = [1, 2, 0]
y_pred = [1, 1, 0]

if __name__ == "__main__":
    print(patk(y_true, y_pred, 3))

Running the code gives the following result.

0.6666666666666666

In this case the value 2 was predicted as 1, so only two of the three actual classes are matched and the score drops.
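One detail is worth making explicit: the definition divides by k, while the function above divides by the number of predictions actually taken from the list. The two agree whenever there are at least k predictions and differ otherwise. The sketch below contrasts them; `patk_strict` and `patk_truncated` are hypothetical names introduced only for this illustration, and the guard for an empty prediction list is an added assumption.

```python
def patk_strict(actual, pred, k):
    """Hypothetical variant: hits in the top k predictions divided by k itself."""
    if k == 0:
        return 0.0
    hits = set(actual) & set(pred[:k])
    return len(hits) / k


def patk_truncated(actual, pred, k):
    """Hits divided by the number of predictions actually available in the top k."""
    top_k = pred[:k]
    if not top_k:  # assumption: an empty prediction list scores 0
        return 0.0
    hits = set(actual) & set(top_k)
    return len(hits) / len(top_k)


if __name__ == "__main__":
    y_true = [1, 2]
    y_pred = [1, 3]  # only two predictions are available
    for k in (1, 2, 3):
        print(k, patk_strict(y_true, y_pred, k), patk_truncated(y_true, y_pred, k))
```

With only two predictions, the two variants agree for k = 1 and k = 2 and diverge at k = 3 (one third versus one half), which is why scores reported for a k larger than the prediction list depend on the convention chosen.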

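`Avg precision at k (AP@k)`:

It is defined as the average of the precision at i for every i from 1 to k. To make this clearer, let us look at some code. Its value ranges from 0 to 1.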

import numpy as np
import pk  # module holding patk from above (assumes it was saved as pk.py)

def apatk(actual, pred, k):
    # List for storing the precision at each cutoff.
    precision_ = []
    for i in range(1, k + 1):
        # Compute the precision at cutoff i
        # and append it to the list.
        precision_.append(pk.patk(actual, pred, i))

    # Return 0 if there are no values in the list.
    if len(precision_) == 0:
        return 0

    # Return the average of all the precision values.
    return np.mean(precision_)

#defining the values of the actual and the predicted class
y_true = [[1,2,0,1], [0,4], [3], [1,2]]
y_pred = [[1,1,0,1], [1,4], [2], [1,3]]

if __name__ == "__main__":
	for i in range(len(y_true)):
		for j in range(1, 4):
			print(
				f"""
				y_true = {y_true[i]}
				y_pred = {y_pred[i]}
				AP@{j} = {apatk(y_true[i], y_pred[i], k=j)}
				"""
			)

Here we check AP@k for k from 1 to 3. We get the following output.


				y_true = [1, 2, 0, 1]
				y_pred = [1, 1, 0, 1]
				AP@1 = 1.0
				

				y_true = [1, 2, 0, 1]
				y_pred = [1, 1, 0, 1]
				AP@2 = 0.75
				

				y_true = [1, 2, 0, 1]
				y_pred = [1, 1, 0, 1]
				AP@3 = 0.7222222222222222
				

				y_true = [0, 4]
				y_pred = [1, 4]
				AP@1 = 0.0
				

				y_true = [0, 4]
				y_pred = [1, 4]
				AP@2 = 0.25
				

				y_true = [0, 4]
				y_pred = [1, 4]
				AP@3 = 0.3333333333333333
				

				y_true = [3]
				y_pred = [2]
				AP@1 = 0.0
				

				y_true = [3]
				y_pred = [2]
				AP@2 = 0.0
				

				y_true = [3]
				y_pred = [2]
				AP@3 = 0.0
				

				y_true = [1, 2]
				y_pred = [1, 3]
				AP@1 = 1.0
				

				y_true = [1, 2]
				y_pred = [1, 3]
				AP@2 = 0.75
				

				y_true = [1, 2]
				y_pred = [1, 3]
				AP@3 = 0.6666666666666666

This gives us a clear picture of how the code works.
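To see where a single AP@k value comes from, it can help to print the individual precisions that are averaged. The following is a minimal, self-contained sketch; `apatk_steps` is a hypothetical helper name, and `patk` is re-declared here (with an added guard for empty predictions) so the block runs on its own.

```python
def patk(actual, pred, k):
    """P@k as in the article: common values divided by the predictions considered."""
    top_k = pred[:k]
    if k == 0 or not top_k:  # assumption: nothing to score yields 0
        return 0.0
    return len(set(actual) & set(top_k)) / len(top_k)


def apatk_steps(actual, pred, k):
    """Return the list [P@1, ..., P@k] together with its mean, AP@k."""
    precisions = [patk(actual, pred, i) for i in range(1, k + 1)]
    ap = sum(precisions) / len(precisions) if precisions else 0.0
    return precisions, ap


if __name__ == "__main__":
    steps, ap = apatk_steps([1, 2, 0, 1], [1, 1, 0, 1], 3)
    print("P@1..P@3:", steps)
    print("AP@3:", ap)
```

For the first sample the three precisions are 1, 1/2 and 2/3, and their mean is the AP@3 of roughly 0.722 shown in the output above.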

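`Mean avg precision at k (MAP@k)`:

The mean of the AP@k values over all samples in the training data is called MAP@k. It summarizes the accuracy of the predictions over the whole dataset in a single number. Here is some code for it.
Its value ranges from 0 to 1.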

import numpy as np
import apk  # module holding apatk from above (assumes it was saved as apk.py)

def mapk(actual, pred, k):
    # List for storing the average precision of each sample.
    average_precision = []
    # Iterate through the whole data and compute AP@k for each sample.
    for i in range(len(actual)):
        average_precision.append(apk.apatk(actual[i], pred[i], k))

    # Return the mean over all samples.
    return np.mean(average_precision)

# Define the actual and the predicted classes.
y_true = [[1, 2, 0, 1], [0, 4], [3], [1, 2]]
y_pred = [[1, 1, 0, 1], [1, 4], [2], [1, 3]]

if __name__ == "__main__":
    print(mapk(y_true, y_pred, 3))

Running the code above gives the following output:

0.4305555555555556

Here the score is poor because the prediction set contains many errors.
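A per-sample breakdown shows which samples pull the mean down. The sketch below is self-contained and does not rely on the `pk` or `apk` modules; `mapk_with_breakdown` is a hypothetical name, and the helper functions are re-declared under the same conventions as the code above.

```python
def patk(actual, pred, k):
    """P@k: common values divided by the number of predictions considered."""
    top_k = pred[:k]
    if k == 0 or not top_k:  # assumption: nothing to score yields 0
        return 0.0
    return len(set(actual) & set(top_k)) / len(top_k)


def apatk(actual, pred, k):
    """AP@k: mean of P@1 .. P@k."""
    precisions = [patk(actual, pred, i) for i in range(1, k + 1)]
    return sum(precisions) / len(precisions) if precisions else 0.0


def mapk_with_breakdown(actual, pred, k):
    """Return the per-sample AP@k values and their mean, MAP@k."""
    scores = [apatk(a, p, k) for a, p in zip(actual, pred)]
    mean_score = sum(scores) / len(scores) if scores else 0.0
    return scores, mean_score


if __name__ == "__main__":
    y_true = [[1, 2, 0, 1], [0, 4], [3], [1, 2]]
    y_pred = [[1, 1, 0, 1], [1, 4], [2], [1, 3]]
    per_sample, overall = mapk_with_breakdown(y_true, y_pred, 3)
    for truth, guess, score in zip(y_true, y_pred, per_sample):
        print(truth, guess, round(score, 4))
    print("MAP@3:", overall)
```

The third sample contributes 0 and the second only about 0.33, which accounts for most of the gap between the overall score and 1.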
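
`F1 - Samples`:

This metric computes the F1 score for each instance in the data and then averages those F1 scores. We use the sklearn implementation of it in the code.
The F1 score is described in the sklearn documentation. Its value ranges from 0 to 1.
We first convert the data to a binary format and then compute the F1 score on it. This gives us the required value.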

from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer

def f1_sampled(actual, pred):
    # Convert the multi-label data to a binary indicator format.
    # Fit once on both lists so that actual and pred share the same columns;
    # fitting separately can misalign the columns when the label sets differ.
    mlb = MultiLabelBinarizer()
    mlb.fit(actual + pred)
    actual = mlb.transform(actual)
    pred = mlb.transform(pred)

    # Compute the F1 score of each sample and average over the samples.
    f1 = f1_score(actual, pred, average="samples")
    return f1

# Define the actual and the predicted classes.
y_true = [[1, 2, 0, 1], [0, 4], [3], [1, 2]]
y_pred = [[1, 1, 0, 1], [1, 4], [2], [1, 3]]

if __name__ == "__main__":
    print(f1_sampled(y_true, y_pred))
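To make the per-instance averaging concrete, here is a minimal pure-Python sketch of the same idea using sets instead of a binarized matrix. It is intended to mirror the sample-averaged F1 for this data rather than to replace the sklearn call; `f1_per_sample` and `f1_samples_manual` are hypothetical names, and returning 0 for a sample with no common labels is an assumption of this sketch.

```python
def f1_per_sample(actual, pred):
    """F1 of one instance, treating its actual and predicted labels as sets."""
    actual_set, pred_set = set(actual), set(pred)
    true_positives = len(actual_set & pred_set)
    if true_positives == 0:  # assumption: no overlap scores 0
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(actual_set)
    return 2 * precision * recall / (precision + recall)


def f1_samples_manual(actual, pred):
    """Average the per-instance F1 scores over all instances."""
    scores = [f1_per_sample(a, p) for a, p in zip(actual, pred)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    y_true = [[1, 2, 0, 1], [0, 4], [3], [1, 2]]
    y_pred = [[1, 1, 0, 1], [1, 4], [2], [1, 3]]
    for truth, guess in zip(y_true, y_pred):
        print(truth, guess, round(f1_per_sample(truth, guess), 4))
    print("F1 (samples):", round(f1_samples_manual(y_true, y_pred), 4))
```

Duplicate labels within an instance, such as the repeated 1 in the first sample, collapse when converted to a set, which matches what binarizing the labels does.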