5.7 COUNTING THE COST

    1. Kappa statistic

The Kappa statistic measures the agreement between the predicted and observed categorizations of a dataset while correcting for agreement that occurs by chance. In other words, it indicates how far a classifier's results differ from those of a random classifier.

Excerpted from: http://wenku.baidu.com/view/f1061c165f0e7cd18425361d.html

                http://blog.sina.com.cn/emilysasworld
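To make the definition concrete, here is a minimal sketch that computes kappa from a confusion matrix using the usual form kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `kappa` and the 2x2 counts are hypothetical and only for illustration.

```python
def kappa(confusion):
    """Kappa from a square confusion matrix (rows = actual, columns = predicted)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    # Undefined when expected == 1 (all mass in a single class).
    return (observed - expected) / (1 - expected)


# Hypothetical counts: 35 of 50 instances classified correctly.
example_matrix = [[20, 5],
                  [10, 15]]
example_kappa = kappa(example_matrix)
print(round(example_kappa, 3))
```

Here the observed agreement is 0.7 and the chance agreement is 0.5, so kappa comes out at about 0.4. A value of 0 would mean the classifier does no better than chance, and 1 would mean perfect agreement.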

    2. Cost-Sensitive Classification && Cost-Sensitive Learning (test cost and classification cost)

Cost-Sensitive Classification: Given a cost matrix, you can calculate the cost of a particular learned model on a given test set just by summing the relevant elements of the cost matrix for the model's prediction for each test instance. Here the classifier has already been trained: it predicts the class of each test instance, and the cost matrix gives the loss for each prediction. The cost matrix is used only when classifying test instances, not when training the classifier.

A good paper on this topic: http://www.docin.com/p-379037057.html
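The summation described above can be sketched as follows. The layout `cost_matrix[actual][predicted]`, the class encoding (0 = yes, 1 = no), and the cost values are assumptions chosen for illustration, not taken from the original text.

```python
def total_cost(cost_matrix, actual, predicted):
    """Sum the cost matrix entry selected by each (actual, predicted) pair."""
    return sum(cost_matrix[a][p] for a, p in zip(actual, predicted))


# Hypothetical cost matrix, indexed as cost_matrix[actual][predicted].
# Class 0 = yes, class 1 = no; correct predictions cost nothing.
COST = [[0, 1],    # actual yes: predicting no (false negative) costs 1
        [10, 0]]   # actual no: predicting yes (false positive) costs 10

actual_classes = [0, 0, 1, 1]
predicted_classes = [0, 1, 0, 1]
example_total = total_cost(COST, actual_classes, predicted_classes)
print(example_total)  # 0 + 1 + 10 + 0 = 11
```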

Example: Bayes decision based on minimum risk.
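A minimum-risk decision can be sketched like this, assuming the classifier outputs a probability for each class: for every candidate prediction, compute the expected cost under those probabilities and choose the cheapest one. The function name, the matrix layout `cost_matrix[actual][predicted]`, and the numbers are hypothetical.

```python
def min_risk_prediction(cost_matrix, class_probs):
    """Return (best_class, expected_costs) under the given class probabilities."""
    n = len(cost_matrix)
    # Expected cost of predicting class p, averaged over the possible actual classes.
    expected_costs = [
        sum(class_probs[a] * cost_matrix[a][p] for a in range(n))
        for p in range(n)
    ]
    best_class = min(range(n), key=lambda p: expected_costs[p])
    return best_class, expected_costs


# Hypothetical setup: class 0 = yes, class 1 = no.
RISK_COST = [[0, 1],
             [10, 0]]
probs = [0.8, 0.2]  # the classifier believes "yes" with probability 0.8
chosen, costs = min_risk_prediction(RISK_COST, probs)
print(chosen, costs)
```

With these numbers, predicting yes has an expected cost of about 2.0 and predicting no about 0.8, so the minimum-risk choice is "no" even though "yes" is the more probable class.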

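Cost-Sensitive Learning: Take the cost matrix into account during the training process and ignore costs at prediction time. This is the reverse of cost-sensitive classification: the cost matrix influences how the classifier is built, not how test instances are classified.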

Varying the proportion of instances in the training set is a general technique for building cost-sensitive classifiers. Suppose you artificially increase the number of no instances by a factor of 10 and use the resulting dataset for training. If the learning scheme is striving to minimize the number of errors, it will come up with a decision structure that is biased toward avoiding errors on the no instances because such errors are effectively penalized tenfold. If data with the original proportion of no instances is used for testing, fewer errors will be made on these than on yes instances (that is, there will be fewer false positives than false negatives) because false positives have been weighted 10 times more heavily than false negatives.
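The resampling idea can be sketched as plain duplication of the instances of one class before training. This is only one way to change the class proportions, and the helper name `oversample` and the toy data are hypothetical.

```python
def oversample(instances, labels, target_label, factor):
    """Repeat every instance of target_label `factor` times; keep the others once."""
    new_instances, new_labels = [], []
    for x, y in zip(instances, labels):
        copies = factor if y == target_label else 1
        new_instances.extend([x] * copies)
        new_labels.extend([y] * copies)
    return new_instances, new_labels


# Toy training set: two "yes" instances and one "no" instance.
train_x = [[1.0, 0.2], [0.4, 0.9], [0.7, 0.7]]
train_y = ["yes", "no", "yes"]
boosted_x, boosted_y = oversample(train_x, train_y, target_label="no", factor=10)
print(boosted_y.count("yes"), boosted_y.count("no"))  # 2 10
```

The enlarged set is used only for training. Testing still uses data with the original class proportions, as the paragraph above describes.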

    3. Lift chart && ROC curve && Recall-precision curve

 

(1) TP Rate = 100 × TP / (TP + FN)

(2) FP Rate = 100 × FP / (FP + TN)

(3) Recall = number of documents retrieved that are relevant / total number of documents that are relevant

(4) Precision = number of documents retrieved that are relevant / total number of documents that are retrieved
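The four quantities above can be computed directly from the counts of a two-class confusion matrix. In this sketch, "retrieved" is treated as "predicted positive" and "relevant" as "actually positive", so recall equals the TP rate expressed as a fraction instead of a percentage. The function name and the counts are hypothetical.

```python
def rates(tp, fp, tn, fn):
    """TP/FP rates as percentages, recall and precision as fractions."""
    return {
        "tp_rate": 100 * tp / (tp + fn),
        "fp_rate": 100 * fp / (fp + tn),
        "recall": tp / (tp + fn),      # relevant and retrieved / all relevant
        "precision": tp / (tp + fp),   # relevant and retrieved / all retrieved
    }


# Hypothetical counts for illustration.
example_rates = rates(tp=40, fp=20, tn=30, fn=10)
for name, value in example_rates.items():
    print(name, round(value, 3))
```

Each point on an ROC curve is one (FP rate, TP rate) pair, and each point on a recall-precision curve is one (recall, precision) pair, obtained by changing how many instances are labeled positive.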

 
