The difference between stacking and blending

The common explanation found online:

Stacking uses k-fold cross-validation: the meta-model's training data covers the same samples as the base models' training data. The method generates a meta-feature for every training sample, and the model that produces each meta-feature differs from fold to fold (each base learner is trained k times, once per fold). When generating meta-features for the test set, the predictions of the k fold models (k refers to folds, not to distinct model types) are averaged, possibly with weights.
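Here is a minimal sketch of that procedure for a single base model; the helper name and the scikit-learn-style API are illustrative assumptions on my part, and X/y are assumed to be NumPy arrays:

import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def stacking_meta_feature(base_model, X_train, y_train, X_test, k=5):
    """Build one meta-feature column for the train set and one for the test set."""
    kf = KFold(n_splits=k, shuffle=True, random_state=0)
    oof_train = np.zeros(len(X_train))        # out-of-fold predictions for every training sample
    test_preds = np.zeros((k, len(X_test)))   # one row of test-set predictions per fold model

    for i, (fit_idx, oof_idx) in enumerate(kf.split(X_train)):
        model = clone(base_model)
        model.fit(X_train[fit_idx], y_train[fit_idx])
        # Each training sample gets its meta-feature from the fold model that did NOT see it.
        oof_train[oof_idx] = model.predict(X_train[oof_idx])
        # Every fold model also predicts the full test set.
        test_preds[i] = model.predict(X_test)

    # The k fold models' test-set predictions are averaged into a single meta-feature.
    return oof_train, test_preds.mean(axis=0)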

Blending is a holdout method: the training set is split directly into two parts, and only a small holdout portion (e.g. 10%) is used to train the meta-model.
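A corresponding minimal sketch of blending with a 90/10 split; the concrete base models and meta-model are illustrative choices, not part of the definition, and a binary classification task is assumed:

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def blending_predict(X, y, X_test):
    # Split the original training set: 90% for the base models, 10% held out for the meta-model.
    X_base, X_hold, y_base, y_hold = train_test_split(X, y, test_size=0.1, random_state=0)

    base_models = [RandomForestClassifier(random_state=0),
                   GradientBoostingClassifier(random_state=0)]
    meta_hold, meta_test = [], []
    for model in base_models:
        model.fit(X_base, y_base)                             # base models never see the holdout
        meta_hold.append(model.predict_proba(X_hold)[:, 1])   # holdout predictions become meta-features
        meta_test.append(model.predict_proba(X_test)[:, 1])

    meta_model = LogisticRegression()
    meta_model.fit(np.column_stack(meta_hold), y_hold)        # meta-model trained only on the 10% holdout
    return meta_model.predict(np.column_stack(meta_test))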

For now, let's use this criterion to tell the two apart.

Next, an answer from Quora:


Stacking and blending are two similar approaches to combining classifiers (ensembling).

First of all, let me refer you to the Kaggle Ensembling Guide. I believe it is very simple and easy to understand (easier than the paper).

The difference is that stacking uses out-of-fold predictions for the training set, while blending uses a separate validation set (say, 10% of the training set) to train the next layer.

Ensembling

Ensembling approaches train several classifiers in the hope that combining their predictions will outperform any single classifier (or, in the worst case, at least beat the worst individual classifier). The combination rule can be majority vote, mean, max, min, product, and so on; the averaging rule is the most commonly used.
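As a concrete illustration of these combination rules, applied to made-up probability outputs from three classifiers on four samples:

import numpy as np

# Predicted probabilities of the positive class: one row per classifier, one column per sample.
probs = np.array([[0.9, 0.4, 0.2, 0.7],
                  [0.8, 0.6, 0.1, 0.4],
                  [0.7, 0.3, 0.3, 0.6]])

avg_rule = probs.mean(axis=0)               # averaging rule: mean probability per sample
votes = (probs > 0.5).sum(axis=0)           # hard votes per sample
majority_vote = (votes >= 2).astype(int)    # majority of the 3 classifiers

print(avg_rule)        # [0.8  0.4333  0.2  0.5667] (approximately)
print(majority_vote)   # [1 0 0 1]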

Blending and Stacking

As said before, blending and stacking are two very similar approaches; in fact, some people use the terms as synonyms. Both train a first layer of classifiers and use their outputs (e.g., predicted probabilities) to train a second layer of classifiers. Any number of layers can be used. The final prediction is usually made by the averaging rule or by a final classifier (such as Logistic Regression in binary classification).
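For reference, a two-layer setup of exactly this shape can be sketched with scikit-learn's StackingClassifier; the base learners and the toy dataset below are arbitrary illustrative choices:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

clf = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svc", SVC(probability=True, random_state=0))],  # first layer
    final_estimator=LogisticRegression(),  # second layer: Logistic Regression on the base outputs
    cv=5,                                  # out-of-fold predictions feed the second layer (stacking)
)
clf.fit(X, y)
print(clf.predict(X[:5]))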

Figure from Kaggle Ensembling Guide

You can't (or shouldn't) pass predictions made on the very data a classifier was trained on to the next layer. For this reason, there are schemes such as cross-validation, where the out-of-fold predictions are used to train the next layer (stacking), or a holdout split, where part of the training set is used for the first layer and another part for the second (blending).

Keep in Mind

Keep in mind that, even though the examples from the Kaggle Ensembling Guide show the same base classifier (XGB) several times, the classifiers must be diverse enough for ensembling to produce good results. This can be accomplished by using different base classifiers, training on different features, training on different parts of the training set, or using different hyperparameters.
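A hedged sketch of what such diversity might look like in code; the model families and hyperparameters below are only examples:

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

diverse_ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(C=0.1)),                 # linear model
                ("rf", RandomForestClassifier(n_estimators=300)),  # bagged decision trees
                ("knn", KNeighborsClassifier(n_neighbors=15)),     # instance-based model
                ("nb", GaussianNB())],                             # probabilistic model
    voting="soft",  # soft voting = the averaging rule over predicted probabilities
)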

Next, the explanation from CSDN:

Part of the CSDN content is reposted from: https://blog.csdn.net/maqunfi/article/details/82220115

Finally, take a look at a very nice diagram:

Reposted from: https://blog.csdn.net/weixin_38526306/article/details/81356325
