Several measures can be used to evaluate ordinal regression models; the most common are the Mean Zero-one Error (MZE) and the Mean Absolute Error (MAE).
1. MZE
MZE is the error rate of the classifier:

$$MZE = \frac{1}{N}\sum_{i=1}^{N} [\![\hat{y}_i \neq y_i]\!] = 1 - Acc$$

where $N$ is the number of test samples, $y_i$ and $\hat{y}_i$ are the true and predicted ranks of the i-th sample, and $[\![\cdot]\!]$ equals 1 when its argument is true and 0 otherwise.
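Since MZE is simply one minus the usual accuracy, it can be computed directly with scikit-learn. A minimal sketch (the toy arrays below are made up for illustration):

from sklearn.metrics import accuracy_score

y_true = [0, 1, 2, 2, 3]   # illustrative true ranks
y_pred = [0, 1, 1, 2, 4]   # illustrative predicted ranks

mze = 1 - accuracy_score(y_true, y_pred)   # MZE = 1 - Acc
print(mze)   # 0.4: two of the five predictions are wrong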
2. MAE
MAE is the average deviation in absolute value of the predicted rank ($\hat{y}_i$) from the true one ($y_i$):

$$MAE = \frac{1}{N}\sum_{i=1}^{N} \left|\hat{y}_i - y_i\right|$$
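A matching sketch for MAE (same made-up toy arrays, with the last prediction changed to be off by two ranks) shows how MAE, unlike MZE, penalizes a two-rank error twice as heavily as a one-rank error:

from sklearn.metrics import mean_absolute_error

y_true = [0, 1, 2, 2, 3]   # illustrative true ranks
y_pred = [0, 1, 1, 2, 5]   # one error of size 1, one of size 2

mae = mean_absolute_error(y_true, y_pred)
print(mae)   # 0.6 = (1 + 2) / 5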
3. SD (Standard Deviation)
Meanwhile, in order to measure the performance imbalance among classes for the CNN model, we adopt the standard deviation [10]; it is a measure of the dispersion of a data distribution. A smaller standard deviation implies less deviation of the values from the average, and vice versa.
We express it as:

$$SD = \sqrt{\frac{1}{M}\sum_{m=1}^{M}\left(Acc_m - \overline{Acc}\right)^2}$$

where $\overline{Acc}$ is the average accuracy over all $M$ classes and $Acc_m$ is the classification accuracy of the m-th class achieved by the CNN model on the test set.
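A minimal sketch of this computation in Python (the helper name class_accuracy_sd is my own, not a library function):

import numpy as np

def class_accuracy_sd(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    # Acc_m: classification accuracy of each class on the test set
    accs = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    # np.std divides by M by default, matching the formula above
    return np.std(accs)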
4. C-index
The value of MAE ranges from 0 to r-1 (the maximum absolute error between classes, with r being the number of classes). Because the real distances among the class labels are unknown, the numerical representation of the class labels has a strong impact on the MAE performance.
In order to avoid the above-mentioned impact, a more suitable approach is to consider the relation between the observed class label and the predicted class label.
Here we use the concordance index, or C-index, to represent these relations. A pair of samples is comparable when their true labels differ, and concordant when the predicted labels preserve the order of the true labels; the C-index is computed as the proportion of concordant pairs among the comparable pairs, with tied predictions conventionally counted as half concordant [1].
[1] W. Waegeman, Learning to rank: a ROC-based graph-theoretic approach, Ph.D. thesis, Springer (2009).
I could not find a corresponding library function in Python, so I wrote one myself (see the code below).
Working through the code makes it very clear that the C-index greatly relaxes the constraints placed on the predictions, substantially lowering the bar for what counts as a correct result.
As a metric for evaluating ordinal classification performance, my own view is: far too lenient!
'''Daniel He'''
import numpy as np
from sklearn.metrics import accuracy_score, mean_absolute_error
from collections import OrderedDict
from itertools import combinations

y_true = [0,0,0,1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6]
y_pred = [0,0,1,0,1,2,1,2,2,2,3,5,2,4,4,5,5,6,6,6,6]

def C_index(y_true, y_pred):
    # Group every comparable pair of samples (different true labels)
    # under its (lower label, higher label) key, storing the indices so
    # that the first index always carries the lower true label.
    labels = np.sort(np.unique(y_true))
    pair_set = OrderedDict()
    for label_pair in combinations(labels, 2):
        pair_set[label_pair] = []
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] < y_true[j]:
            pair_set[(y_true[i], y_true[j])].append((i, j))
        elif y_true[i] > y_true[j]:
            pair_set[(y_true[j], y_true[i])].append((j, i))
        # samples with equal true labels are not comparable: skip them

    # A comparable pair is concordant when the predictions preserve the
    # true order; a tied prediction counts as half a concordant pair.
    nPairs = 0
    nResult = 0
    for label_pair, idx_pair_list in pair_set.items():
        nPairs += len(idx_pair_list)
        for lo, hi in idx_pair_list:
            if y_pred[lo] < y_pred[hi]:
                nResult += 1
            elif y_pred[lo] == y_pred[hi]:
                nResult += 0.5
    return nResult / nPairs

r = C_index(y_true=y_true, y_pred=y_pred)
print(r)
# print(accuracy_score(y_true=y_true, y_pred=y_pred))
# print(mean_absolute_error(y_true=y_true, y_pred=y_pred))
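For the sample arrays above, this script prints roughly 0.918 (173.5 of 189 comparable pairs counted as concordant, by my hand count), even though the plain accuracy is only about 0.62 and the MAE about 0.48; these numbers back up the complaint above about how lenient the C-index is.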