Ensemble Learning

https://www.analyticsvidhya.com/blog/2018/06/comprehensive-guide-for-ensemble-models/
Key terms covered below: stacking, blending, bagging, boosting, and the bagging meta-estimator.
2.1 Max Voting
In this technique, multiple models are used to make predictions for each data point. The prediction from each model is treated as a 'vote', and the class predicted by the majority of the models is used as the final prediction.
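A minimal sketch with scikit-learn's VotingClassifier on a toy dataset (the three base models are illustrative choices, not fixed by the article):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import VotingClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each base model casts one 'vote'; voting='hard' keeps the majority class.
vote = VotingClassifier(
    estimators=[('dt', DecisionTreeClassifier()),
                ('knn', KNeighborsClassifier()),
                ('lr', LogisticRegression(max_iter=1000))],
    voting='hard')
vote.fit(x_train, y_train)
print(vote.score(x_test, y_test))
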
2.2 Averaging
2.3 Weighted Average
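Averaging takes the plain mean of each model's predictions (or predicted probabilities); a weighted average multiplies each model's prediction by a weight reflecting how much we trust that model. A minimal sketch of both on a toy dataset (the models and weights are illustrative assumptions):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Collect each model's class probabilities on the test set.
probs = []
for model in (DecisionTreeClassifier(), KNeighborsClassifier(),
              LogisticRegression(max_iter=1000)):
    model.fit(x_train, y_train)
    probs.append(model.predict_proba(x_test))

# 2.2 Averaging: plain mean of the predicted probabilities.
avg_pred = np.mean(probs, axis=0).argmax(axis=1)

# 2.3 Weighted average: illustrative weights, chosen to sum to 1.
weights = (0.4, 0.3, 0.3)
wavg_pred = sum(w * p for w, p in zip(weights, probs)).argmax(axis=1)

print((avg_pred == y_test).mean(), (wavg_pred == y_test).mean())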

3.1 Stacking
Stacking is an ensemble learning technique that uses the predictions from multiple models (for example, a decision tree, kNN, or SVM) to build a new model.
3.1.1 First, define a function that makes out-of-fold predictions on n folds of the train set and corresponding predictions on the test set. This function returns the train and test predictions for each model.
3.1.2 Then create two base models: a decision tree and kNN.
3.1.3 Create a third model, logistic regression, on the predictions of the decision tree and kNN models (see the sketch after these steps).
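A minimal sketch of the three steps on a toy dataset; the helper name stacking_oof and the fold count are assumptions, not a fixed API:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, KFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 3.1.1 Out-of-fold predictions on train, averaged predictions on test.
def stacking_oof(model, x_train, y_train, x_test, n_folds=10):
    oof = np.zeros(len(x_train))
    test_preds = []
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    for tr_idx, val_idx in kf.split(x_train):
        model.fit(x_train[tr_idx], y_train[tr_idx])
        oof[val_idx] = model.predict(x_train[val_idx])
        test_preds.append(model.predict(x_test))
    return oof, np.mean(test_preds, axis=0)

# 3.1.2 Two base models: decision tree and kNN.
dt_tr, dt_te = stacking_oof(DecisionTreeClassifier(), x_train, y_train, x_test)
knn_tr, knn_te = stacking_oof(KNeighborsClassifier(), x_train, y_train, x_test)

# 3.1.3 Logistic regression meta-model on the base-model predictions.
meta = LogisticRegression()
meta.fit(np.column_stack([dt_tr, knn_tr]), y_train)
print(meta.score(np.column_stack([dt_te, knn_te]), y_test))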

3.2 Blending
Blending follows the same approach as stacking, but uses only a holdout (validation) set carved out of the train set to make predictions: the base models are fitted on the reduced train set, their predictions on the validation set become the features of the meta-model, and the meta-model then predicts on the test set.
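A minimal sketch, assuming a toy dataset and an illustrative 70/30 train/validation split:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)
# Carve a validation set out of the train set.
x_tr, x_val, y_tr, y_val = train_test_split(x_train, y_train,
                                            test_size=0.3, random_state=0)

val_feats, test_feats = [], []
for model in (DecisionTreeClassifier(), KNeighborsClassifier()):
    model.fit(x_tr, y_tr)                    # fit on the reduced train set
    val_feats.append(model.predict(x_val))   # features for the meta-model
    test_feats.append(model.predict(x_test))

meta = LogisticRegression().fit(np.column_stack(val_feats), y_val)
print(meta.score(np.column_stack(test_feats), y_test))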

3.3 Bagging
The idea behind bagging is to combine the results of multiple models (for instance, all decision trees) to get a generalized result.
Bootstrapping is a sampling technique in which we create subsets of observations from the original dataset, with replacement. The size of each subset is the same as the size of the original set.
The final predictions are determined by combining the predictions from all the models.
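A hand-rolled sketch of bootstrapping plus majority voting (25 trees is an illustrative choice; section 4.1 shows the scikit-learn equivalent):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
preds = []
for _ in range(25):
    # Bootstrap: sample with replacement, same size as the original set.
    idx = rng.integers(0, len(x_train), size=len(x_train))
    tree = DecisionTreeClassifier().fit(x_train[idx], y_train[idx])
    preds.append(tree.predict(x_test))

# Combine by majority vote across the 25 trees (binary labels 0/1).
final = (np.mean(preds, axis=0) > 0.5).astype(int)
print((final == y_test).mean())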

3.4 Boosting
Boosting is a sequential process, where each subsequent model attempts to correct the errors of the previous model, so each succeeding model depends on the one before it.
3.4.1 A subset is created from the original dataset.
3.4.2 Initially, all data points are given equal weights.
3.4.3 A base model is created on this subset.
3.4.4 This model is used to make predictions on the whole dataset.
3.4.5 Errors are calculated using the actual values and predicted values.
3.4.6 The observations that are incorrectly predicted are given higher weights.
3.4.7 Another model is created and predictions are made on the dataset.
(This model tries to correct the errors from the previous model)
3.4.8 Similarly, multiple models are created, each correcting the errors of the previous model.
3.4.9 The final model (strong learner) is the weighted mean of all the models (weak learners); a minimal sketch of these steps follows.
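An AdaBoost-style sketch of steps 3.4.1 through 3.4.9, using decision stumps and explicit sample weights (10 rounds is an illustrative choice):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_tr, y_te = 2 * y_train - 1, 2 * y_test - 1    # labels in {-1, +1}

w = np.full(len(x_train), 1 / len(x_train))     # 3.4.2 equal weights
stumps, alphas = [], []
for _ in range(10):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(x_train, y_tr, sample_weight=w)   # 3.4.3 base model
    pred = stump.predict(x_train)               # 3.4.4 predict everywhere
    err = w[pred != y_tr].sum()                 # 3.4.5 weighted error
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))
    w *= np.exp(-alpha * y_tr * pred)           # 3.4.6 upweight mistakes
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# 3.4.9 Strong learner: weighted vote of the weak learners.
agg = sum(a * s.predict(x_test) for a, s in zip(alphas, stumps))
print((np.sign(agg) == y_te).mean())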

4. Algorithms based on Bagging and Boosting

Bagging algorithms:
Bagging meta-estimator
Random forest

Boosting algorithms:
AdaBoost
GBM
XGBoost
Light GBM
CatBoost

4.1 Bagging meta-estimator
Random subsets are created from the original dataset (Bootstrapping).
The subset of the dataset includes all features.
A user-specified base estimator is fitted on each of these smaller sets.
Predictions from each model are combined to get the final result (see the sketch below).
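A minimal sketch with scikit-learn's BaggingClassifier; the tree base estimator and n_estimators=25 are illustrative choices:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# User-specified base estimator fitted on bootstrapped subsets.
bag = BaggingClassifier(DecisionTreeClassifier(),
                        n_estimators=25, random_state=0)
bag.fit(x_train, y_train)
print(bag.score(x_test, y_test))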

4.2 Random Forest
1. Random subsets are created from the original dataset (bootstrapping).
2. At each node in the decision tree, only a random subset of features is considered to decide the best split.
3. A decision tree model is fitted on each of the subsets.
4. The final prediction is calculated by averaging the predictions from all decision trees (see the sketch below).
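A minimal sketch with scikit-learn's RandomForestClassifier (hyperparameters are illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_features controls the random feature subset tried at each split.
rf = RandomForestClassifier(n_estimators=100, max_features='sqrt',
                            random_state=0)
rf.fit(x_train, y_train)
print(rf.score(x_test, y_test))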

4.3 AdaBoost
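scikit-learn implements the reweighting scheme from section 3.4 as AdaBoostClassifier; a minimal sketch with illustrative hyperparameters:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(x_train, y_train)
print(ada.score(x_test, y_test))
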
4.4 Gradient Boosting (GBM)
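A minimal sketch with scikit-learn's GradientBoostingClassifier (illustrative hyperparameters):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=0)
gbm.fit(x_train, y_train)
print(gbm.score(x_test, y_test))
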
4.5 XGBoost
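A minimal sketch via the scikit-learn wrapper, assuming the xgboost package is installed (hyperparameters are illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

xgb = XGBClassifier(n_estimators=100, learning_rate=0.1)
xgb.fit(x_train, y_train)
print((xgb.predict(x_test) == y_test).mean())
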
4.6 Light GBM
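A minimal sketch via the scikit-learn wrapper, assuming the lightgbm package is installed (hyperparameters are illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier

X, y = make_classification(n_samples=500, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

lgbm = LGBMClassifier(n_estimators=100, learning_rate=0.1)
lgbm.fit(x_train, y_train)
print((lgbm.predict(x_test) == y_test).mean())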

Stacking references:
https://blog.csdn.net/willduan1/article/details/73618677 # includes mlxtend package usage
https://zhuanlan.zhihu.com/p/25836678
https://zhuanlan.zhihu.com/p/26890738
