AdaBoost

Scouting:

Scouting is done by testing the classifiers in the pool on a training set T of N multidimensional data points $x_i$, each with a label $y_i \in \{-1, +1\}$.


We test and rank all classifiers in the expert pool by charging a cost $e^{\beta}$ any time a classifier fails (a miss), and a cost $e^{-\beta}$ every time a classifier provides the right label (a success or "hit"). We require $\beta > 0$ so that misses are penalized more heavily than hits. It might seem strange to penalize a hit with a non-zero cost, but as long as the penalty for a success is smaller than the penalty for a miss everything is fine. This kind of error function, different from the usual squared Euclidean distance to the classification target, is called an exponential loss function.

AdaBoost uses the exponential loss as its error criterion.
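As a minimal sketch of the scouting step (the pool of threshold classifiers, the data, and the value of $\beta$ below are all made up for illustration), each expert can be ranked by its total exponential cost on the training set:

```python
import numpy as np

# Toy 1-D training set (hypothetical data, for illustration only).
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([-1, -1, -1, 1, 1, 1])           # labels in {-1, +1}

# A tiny, explicit pool of "experts": simple threshold classifiers (assumed pool).
pool = [
    lambda x: np.where(x > 0.0, 1, -1),
    lambda x: np.where(x > 1.5, 1, -1),
    lambda x: np.where(x < 0.0, 1, -1),
]

beta = 1.0                                    # beta > 0: misses cost more than hits

for j, k in enumerate(pool):
    pred = k(X)
    # cost e^{beta} for each miss and e^{-beta} for each hit,
    # i.e. exp(-beta * y_i * k(x_i)) summed over the training set
    cost = np.sum(np.exp(-beta * y * pred))
    print(f"classifier {j}: exponential cost = {cost:.3f}")
```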


The main idea of AdaBoost is to proceed systematically by extracting one classifier from the pool in each of M iterations. The drafting process concentrates on selecting new classifiers for the committee, focusing on those that can help with the still misclassified examples. The best team players are those which can provide new insights to the committee; the classifiers being drafted should complement each other in an optimal way.


Drafting:

In each iteration we need to rank all classifiers, so that we can select the current best out of the pool. At the m-th iteration we have already included m-1 classifiers in the committee and we want to draft the next one. The current linear combination of classifiers is

$$C_{m-1}(x_i) = \alpha_1 k_1(x_i) + \cdots + \alpha_{m-1} k_{m-1}(x_i),$$

and we want to extend it by one more member to

$$C_m(x_i) = C_{m-1}(x_i) + \alpha_m k_m(x_i).$$
We define the total cost, or total error, of the extended classifier as the exponential loss

$$E = \sum_{i=1}^{N} e^{-y_i C_m(x_i)} = \sum_{i=1}^{N} e^{-y_i \left( C_{m-1}(x_i) + \alpha_m k_m(x_i) \right)},$$

where $\alpha_m$ and $k_m$ are yet to be determined in an optimal way. Since our intention is to draft $k_m$ we rewrite the above expression as

$$E = \sum_{i=1}^{N} w_i^{(m)} e^{-y_i \alpha_m k_m(x_i)}, \qquad \text{with } w_i^{(m)} = e^{-y_i C_{m-1}(x_i)}$$
for i = 1, ..., N. In the first iteration $w_i^{(1)} = 1$ for i = 1, ..., N. During later iterations, the vector $w^{(m)} = (w_1^{(m)}, \ldots, w_N^{(m)})$ represents the weights assigned to each data point in the training set at iteration m. We can split the above equation into two sums:

$$E = \sum_{y_i = k_m(x_i)} w_i^{(m)} e^{-\alpha_m} + \sum_{y_i \neq k_m(x_i)} w_i^{(m)} e^{\alpha_m}.$$
This means that the total cost is the weighted cost of all hits plus the weighted cost of all misses.
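As a quick numerical check of this split (all values below are made up for illustration), the weighted exponential cost summed over all points equals the hit sum plus the miss sum:

```python
import numpy as np

# Hypothetical state at iteration m: labels y_i, current committee output
# C_{m-1}(x_i), a candidate's predictions k_m(x_i), and a trial alpha_m.
y      = np.array([-1, -1,  1,  1,  1])
C_prev = np.array([-0.8, 0.3, 0.5, -0.2, 1.1])
k_pred = np.array([-1,  -1,  1,  -1,  1])
alpha  = 0.5

w = np.exp(-y * C_prev)          # w_i^{(m)} = exp(-y_i C_{m-1}(x_i))
hits = (k_pred == y)

total    = np.sum(w * np.exp(-y * alpha * k_pred))   # cost summed over all points
hit_sum  = np.sum(w[hits]) * np.exp(-alpha)          # weighted cost of all hits
miss_sum = np.sum(w[~hits]) * np.exp(alpha)          # weighted cost of all misses
print(total, hit_sum + miss_sum)                     # the two agree
```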

Writing the first summand as $W_c\, e^{-\alpha_m}$ and the second as $W_e\, e^{\alpha_m}$ we simplify the notation to

$$E = W_c\, e^{-\alpha_m} + W_e\, e^{\alpha_m} = (W_c + W_e)\, e^{-\alpha_m} + W_e \left( e^{\alpha_m} - e^{-\alpha_m} \right).$$
Now, $W_c + W_e$ is the total sum $W$ of the weights of all data points, that is, a constant in the current iteration. The right-hand side of the equation is therefore minimized when at the m-th iteration we pick the classifier with the lowest total cost $W_e$ (that is, the lowest weighted error). Intuitively this makes sense: the next draftee, $k_m$, should be the one with the lowest penalty given the current set of weights.
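As a small illustration of this drafting rule (the pool, the data, and the current weights are made up; each pool member is again a simple threshold rule), the next draftee is the pool member with the smallest weighted miss weight $W_e$:

```python
import numpy as np

X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([-1, -1, 1, 1, 1, -1])
w = np.array([0.4, 0.4, 1.6, 0.4, 0.4, 1.6])   # current weights w_i^{(m)} (made up)

pool = [
    lambda x: np.where(x > -0.7, 1, -1),
    lambda x: np.where(x >  0.0, 1, -1),
    lambda x: np.where(x <  1.5, 1, -1),
]

# Weighted miss weight W_e of each candidate; the draftee minimizes it.
W_e = [w[k(X) != y].sum() for k in pool]
best = int(np.argmin(W_e))
print("W_e per classifier:", W_e, "-> draft classifier", best)
```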


Weighting:

Having drafted $k_m$, we still have to assign it a weight $\alpha_m$ in the committee. Taking the derivative of the total cost with respect to $\alpha_m$ and setting it to zero,

$$\frac{\partial E}{\partial \alpha_m} = -W_c\, e^{-\alpha_m} + W_e\, e^{\alpha_m} = 0,$$

we obtain

$$\alpha_m = \frac{1}{2} \ln\!\left( \frac{W_c}{W_e} \right) = \frac{1}{2} \ln\!\left( \frac{1 - e_m}{e_m} \right),$$

where $e_m = W_e / W$ is the weighted error rate of $k_m$: the better the draftee performs on the weighted training set, the larger its say in the committee. From the definition of the weights it also follows that

$$w_i^{(m+1)} = e^{-y_i C_m(x_i)} = w_i^{(m)}\, e^{-y_i \alpha_m k_m(x_i)},$$

so after each iteration the weights of misclassified points grow and the weights of correctly classified points shrink. Notice that at each iteration all we actually need is some classifier with low weighted error on the current weights; it can be trained on demand from the reweighted data. That is, the pool of classifiers does not need to be given in advance, it only needs to exist implicitly.
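To illustrate that last point, here is a compact sketch of the full procedure in which the pool is never enumerated: each iteration simply trains the depth-one threshold "stump" with the lowest weighted error on the current weights. The helper names (`best_stump`, `adaboost_stumps`, `committee_predict`), the 1-D data, and the use of stumps are assumptions for this example, not part of the original text.

```python
import numpy as np

def best_stump(X, y, w):
    """Return (threshold, polarity, W_e) of the 1-D stump with lowest weighted error."""
    best = (None, 1, np.inf)
    for t in np.unique(X):
        for polarity in (1, -1):
            pred = np.where(polarity * (X - t) > 0, 1, -1)
            W_e = w[pred != y].sum()
            if W_e < best[2]:
                best = (t, polarity, W_e)
    return best

def adaboost_stumps(X, y, M=5):
    N = len(y)
    w = np.ones(N)                      # w_i^{(1)} = 1
    committee = []
    for m in range(M):
        t, p, W_e = best_stump(X, y, w)           # drafting: lowest weighted error
        W = w.sum()
        W_e = max(W_e, 1e-12)                     # avoid log(0) on a perfect stump
        alpha = 0.5 * np.log((W - W_e) / W_e)     # weighting: alpha_m = 1/2 ln(W_c / W_e)
        pred = np.where(p * (X - t) > 0, 1, -1)
        w = w * np.exp(-y * alpha * pred)         # w_i^{(m+1)} = w_i^{(m)} e^{-y_i alpha_m k_m(x_i)}
        committee.append((alpha, t, p))
    return committee

def committee_predict(committee, X):
    score = sum(a * np.where(p * (X - t) > 0, 1, -1) for a, t, p in committee)
    return np.where(score > 0, 1, -1)

# Tiny illustrative data set (made up).
X = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([ 1,    1,   -1,  -1,  -1,   1,   1,  -1])
model = adaboost_stumps(X, y, M=10)
print("training accuracy:", np.mean(committee_predict(model, X) == y))
```

The clamp on $W_e$ is only there to avoid a division by zero when a stump happens to classify every weighted point correctly; otherwise the loop is a direct transcription of the drafting and weighting formulas above.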

