1.Kappa statistic
The Kappa statistic measures how much a classifier's results differ from those of random classification. (The Kappa statistic is used to measure the agreement between predicted and observed categorizations of a dataset, while correcting for agreement that occurs by chance.)
Source: http://wenku.baidu.com/view/f1061c165f0e7cd18425361d
http://blog.sina.com.cn/emilysasworld
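The chance correction described above is usually written as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The following is a minimal sketch, assuming the results are available as a square confusion matrix of counts; the function name `cohen_kappa` and the example counts are made up for illustration and do not come from the sources above.

```python
def cohen_kappa(confusion):
    """Kappa for a square confusion matrix.

    confusion[i][j] is the number of instances whose actual class is i
    and whose predicted class is j (an assumed layout for this sketch).
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Chance agreement: product of the marginal proportions per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical two-class example: 35 of 50 instances classified correctly.
kappa = cohen_kappa([[20, 5], [10, 15]])
print(round(kappa, 3))  # 0.4
```

A value of 0 means the classifier agrees with the true labels no more often than chance would, and 1 means perfect agreement.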
2.Cost-Sensitive Classification && Cost-Sensitive Learning (test cost and classification cost)
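Cost-Sensitive Classification: Given a cost matrix, you can calculate the cost of a particular learned model on a given test set just by summing the relevant elements of the cost matrix for the model's prediction for each test instance. (Given a cost matrix and an already trained classifier, use the classifier to predict the classes of the test set, then use the cost matrix to obtain the loss for each test instance. The cost matrix is considered only when evaluating test instances, not when training the classifier.) // http://www.docin.com/p-379037057.html is a good paper on this.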
Example: the minimum-risk Bayes decision.
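As a rough illustration of cost-sensitive classification, the sketch below sums cost-matrix entries over a test set and then makes a minimum-risk decision from class probabilities. The names `total_cost` and `min_risk_class`, the `cost_matrix[actual][predicted]` layout, and all numbers are assumptions chosen for this example, not taken from the quoted text.

```python
def total_cost(cost_matrix, actual, predicted):
    """Sum the cost-matrix entry for each (actual, predicted) pair.

    cost_matrix[a][p] is the cost of predicting class p when the
    actual class is a (an assumed layout for this sketch).
    """
    return sum(cost_matrix[a][p] for a, p in zip(actual, predicted))


def min_risk_class(cost_matrix, class_probs):
    """Return the prediction with the lowest expected cost.

    class_probs[a] is the model's estimated probability of class a.
    """
    n = len(cost_matrix)
    expected = [
        sum(class_probs[a] * cost_matrix[a][p] for a in range(n))
        for p in range(n)
    ]
    return min(range(n), key=lambda p: expected[p])


# Hypothetical costs: missing class 1 costs 10, a false alarm costs 1.
costs = [[0, 1],
         [10, 0]]

print(total_cost(costs, actual=[0, 0, 1, 1], predicted=[0, 1, 0, 1]))  # 11

# Class 0 is more probable, but predicting 1 has the lower expected cost
# (0.8 versus 2.0), so the minimum-risk decision is class 1.
print(min_risk_class(costs, [0.8, 0.2]))  # 1
```

The classifier itself is untouched here; the cost matrix only enters when predictions are scored or chosen.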
Cost-Sensitive Learning: Take the cost matrix into account during the training process and ignore costs at prediction time. (The cost matrix is not considered when evaluating test instances; it is considered only when training the classifier.)
Varying the proportion of instances in the training set is a general technique for building cost-sensitive classifiers. Suppose you artificially increase the number of no instances by a factor of 10 and use the resulting dataset for training. If the learning scheme is striving to minimize the number of errors, it will come up with a decision structure that is biased toward avoiding errors on the no instances, because such errors are effectively penalized tenfold. If data with the original proportion of no instances is used for testing, fewer errors will be made on these than on yes instances (that is, there will be fewer false positives than false negatives) because false positives have been weighted 10 times more heavily than false negatives. (Changing the proportion of instances of different classes in the training set is one way to build a cost-sensitive classifier.)
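The replication step described above can be sketched as follows. This is only an illustration under simple assumptions: instances and labels are held in parallel lists, and the helper name `replicate_class` is hypothetical. Many learning schemes reach the same effect with per-instance weights instead of physical copies.

```python
def replicate_class(instances, labels, target_label, factor):
    """Return copies of the data in which every instance of
    target_label appears `factor` times and all others appear once."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    new_instances, new_labels = [], []
    for x, y in zip(instances, labels):
        copies = factor if y == target_label else 1
        new_instances.extend([x] * copies)
        new_labels.extend([y] * copies)
    return new_instances, new_labels


# Hypothetical training set with two "yes" instances and one "no" instance.
X = [[1.0], [2.0], [3.0]]
y = ["yes", "no", "yes"]

X_train, y_train = replicate_class(X, y, target_label="no", factor=10)
print(y_train.count("no"), y_train.count("yes"))  # 10 2
```

Only the training set is resampled; the test set keeps its original class proportions, as the paragraph above describes.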
3.Lift chart && ROC curve && Recall-precision curve
(1)TP Rate = 100 × TP/(TP + FN)
(2)FP Rate = 100 × FP/(FP + TN)
(3)Recall = number of documents retrieved that are relevant/total number of documents that are relevant
(4)Precision = number of documents retrieved that are relevant/total number of documents that are retrieved
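The four formulas above can be computed from the confusion-matrix counts. The sketch below assumes the usual mapping between the two vocabularies: documents retrieved that are relevant correspond to TP, all relevant documents to TP + FN, and all retrieved documents to TP + FP. The function name `rates` and the counts are made up for illustration.

```python
def rates(tp, fp, tn, fn):
    """Compute the four measures from confusion-matrix counts.

    TP rate and FP rate are percentages, matching formulas (1) and (2);
    recall and precision are fractions, matching (3) and (4).
    """
    return {
        "tp_rate": 100 * tp / (tp + fn),
        "fp_rate": 100 * fp / (fp + tn),
        "recall": tp / (tp + fn),      # relevant retrieved / all relevant
        "precision": tp / (tp + fp),   # relevant retrieved / all retrieved
    }


# Hypothetical counts for a two-class problem.
result = rates(tp=40, fp=20, tn=30, fn=10)
print(result["tp_rate"], result["fp_rate"])  # 80.0 40.0
print(result["recall"], round(result["precision"], 3))  # 0.8 0.667
```

Note that recall is the TP rate expressed as a fraction rather than a percentage. An ROC curve plots TP rate against FP rate, while a recall-precision curve plots precision against recall.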