Data mining 4_7

Well, I really think you have to take notes here, because whenever I meet a new word, I get lost.

Let's get down to business. A decision tree is rule-based, but it also uses statistical methods, in other words, heuristic rules.

 

Error rate

You can't rely on the error rate on the training data; what we care about is the error rate on the test data. Consequently, you can use the optimistic and pessimistic statistical estimates to approximate that error rate.
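To make the two estimates concrete, here is a minimal sketch. It assumes the common textbook form: the optimistic estimate simply reuses the training error, and the pessimistic estimate adds a fixed penalty per leaf node. The penalty of 0.5 and the example numbers are my own assumptions, not something from the lecture.

```python
def optimistic_error(train_errors, n_records):
    """Optimistic estimate: assume the test error equals the training error."""
    return train_errors / n_records


def pessimistic_error(train_errors, n_leaves, n_records, penalty=0.5):
    """Pessimistic estimate: add a fixed penalty per leaf to the training errors."""
    return (train_errors + penalty * n_leaves) / n_records


# Hypothetical tree: 30 leaves, 10 training errors on 1000 records.
opt = optimistic_error(10, 1000)
pes = pessimistic_error(10, 30, 1000)
print(f"optimistic={opt:.3f} pessimistic={pes:.3f}")
```

The pessimistic number grows with the number of leaves, so a bigger tree has to earn its extra leaves by removing enough training errors.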

Problems from missing values

Firstly, when there isn't much data, the data doesn't have statistical significance, so you can't use statistical methods to make predictions.

Secondly, the decision tree is not unique, so we can use a greedy or heuristic algorithm to search for a better tree.

Thirdly, some points can't be separated by testing a single attribute; perhaps you should use an expression over several attributes, such as x + y = 1.
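The second and third points can be sketched together. The code below runs one greedy step (try every attribute and threshold, keep the split with the lowest weighted Gini impurity) and compares it with a split on the expression x + y < 1. The choice of Gini as the heuristic, the grid data, and the function names are assumptions I made for illustration only.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def weighted_gini(left, right):
    """Impurity of a split, weighted by the size of each side."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def best_single_attribute_split(points, labels):
    """Greedy step: try every (attribute, threshold) pair, keep the lowest score."""
    best = None
    for attr in range(len(points[0])):
        for threshold in sorted({p[attr] for p in points}):
            left = [y for p, y in zip(points, labels) if p[attr] < threshold]
            right = [y for p, y in zip(points, labels) if p[attr] >= threshold]
            score = weighted_gini(left, right)
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best


# Hypothetical data: a 5x5 grid in [0, 1]^2, labelled 1 when x + y < 1.
grid = [i / 4 for i in range(5)]
points = [(x, y) for x in grid for y in grid]
labels = [1 if x + y < 1 else 0 for x, y in points]

single_score, attr, threshold = best_single_attribute_split(points, labels)

# Split on the expression x + y < 1 instead of on a single attribute.
left = [lab for (x, y), lab in zip(points, labels) if x + y < 1]
right = [lab for (x, y), lab in zip(points, labels) if x + y >= 1]
oblique_score = weighted_gini(left, right)

print("best single-attribute impurity:", single_score)
print("impurity of the x + y < 1 split:", oblique_score)
```

On this data no single-attribute threshold gives two pure sides, while the expression over both attributes does, which is why a tree limited to one attribute per test needs many more nodes to approximate the same boundary.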

 

Model Evaluation

You know, when you create a decision tree, you need model evaluation to know whether it's good. So we introduce two matrices: one is the confusion matrix, the other is the cost matrix. You know, every problem has its own environment, so the cost of each kind of mistake differs.
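Here is a small sketch of how the two matrices work together for a binary classifier: the confusion matrix counts each kind of outcome, and the cost matrix assigns a price to each one. The labels and the cost values are hypothetical, picked only to show that a missed positive can be charged more than a false alarm.

```python
def confusion_matrix(actual, predicted):
    """Count TP, FN, FP, TN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1
        elif a == 1 and p == 0:
            counts["FN"] += 1
        elif a == 0 and p == 1:
            counts["FP"] += 1
        else:
            counts["TN"] += 1
    return counts


def total_cost(counts, cost):
    """Multiply each confusion-matrix cell by its cost and add them up."""
    return sum(counts[cell] * cost[cell] for cell in counts)


# Hypothetical labels and predictions.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

# Hypothetical cost matrix: missing a positive is 10 times worse than a false alarm.
cost = {"TP": 0, "FN": 10, "FP": 1, "TN": 0}

counts = confusion_matrix(actual, predicted)
print(counts, "total cost:", total_cost(counts, cost))
```

Two models with the same accuracy can have very different total costs once the cost matrix is applied, so the better model depends on the environment.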

Lastly, we introduce three indices to evaluate the model: precision, recall, and the F-measure.
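The three indices can be computed directly from the confusion-matrix counts. This sketch assumes that "F" means the usual F1 score, the harmonic mean of precision and recall; the counts in the example are made up.

```python
def precision(tp, fp):
    """Of everything predicted positive, the fraction that really is positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Of everything that really is positive, the fraction that was found."""
    return tp / (tp + fn) if tp + fn else 0.0


def f_measure(p, r):
    """F1: harmonic mean of precision and recall (assumed meaning of 'F')."""
    return 2 * p * r / (p + r) if p + r else 0.0


# Hypothetical counts: 40 true positives, 10 false positives, 20 false negatives.
p = precision(40, 10)
r = recall(40, 20)
f = f_measure(p, r)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because it is a harmonic mean, F stays low unless precision and recall are both reasonably high.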

Reposted from: https://www.cnblogs.com/chuanlong/archive/2013/04/08/3006622.html
