Ensemble Learning
0.Official Description
The goal of ensemble methods is to combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability / robustness over a single estimator.
Two families of ensemble methods are usually distinguished:
- In averaging methods, the driving principle is to build several estimators independently and then to average their predictions. On average, the combined estimator is usually better than any single base estimator because its variance is reduced.
  Examples: Bagging methods, Forests of randomized trees, …
- By contrast, in boosting methods, base estimators are built sequentially and one tries to reduce the bias of the combined estimator. The motivation is to combine several weak models to produce a powerful ensemble.
  Examples: AdaBoost, Gradient Tree Boosting, …
As bagging methods provide a way to reduce overfitting, they work best with strong and complex models (e.g., fully developed decision trees), in contrast with boosting methods, which usually work best with weak models (e.g., shallow decision trees).
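The contrast above can be seen directly in scikit-learn: bagging over fully grown trees versus AdaBoost over decision stumps. This is a minimal sketch; the synthetic dataset and all parameter values (`n_estimators=50`, etc.) are illustrative choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# A synthetic classification dataset just for illustration.
X, y = make_classification(n_samples=500, random_state=0)

# Averaging family: bagging over strong, fully developed trees (no depth limit).
bagging = BaggingClassifier(DecisionTreeClassifier(max_depth=None),
                            n_estimators=50, random_state=0)

# Boosting family: AdaBoost over weak, shallow trees (decision stumps).
boosting = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                              n_estimators=50, random_state=0)

print(cross_val_score(bagging, X, y, cv=5).mean())
print(cross_val_score(boosting, X, y, cv=5).mean())
```

Note how the base estimator is deliberately strong for bagging and deliberately weak for boosting, matching the rule of thumb stated above.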
1.What Is Ensemble Learning
Ensemble learning completes a learning task by building and combining multiple learners; it is also known as a multi-classifier system or committee-based learning. By combining multiple learners, an ensemble can often achieve generalization performance significantly better than that of any single learner.
# Generalization ability refers to how well a machine learning algorithm adapts to unseen samples. The goal of learning is to capture the regularities hidden behind the data, so that the trained model also gives reasonable outputs on data outside the training set that follow the same regularities; this capability is called generalization ability.
2.Categories of Ensemble Learning
According to how the base learners are generated, current ensemble methods fall roughly into two families:
1. Bagging
A parallel approach: the base learners have no strong dependencies on each other and can be generated simultaneously.
2. Boosting
A sequential approach: the base learners have strong dependencies on each other and must be generated one after another.
# base learner / base classifier / weak learner =====> a model trained on a sub-training-set by a machine learning algorithm
# These all refer to the same thing, just under different names, collectively called a weak learner; sloppy translations are everywhere, so don't let the terminology confuse you.
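The "parallel" style of Bagging can be sketched from scratch: each base learner is trained independently on a bootstrap resample of the training data, so the iterations have no dependency on one another. The toy base learner here (a least-squares slope through the origin) and all names are illustrative assumptions, not part of any library API.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(1, 11, dtype=float)          # toy 1-D training inputs
y = 2.0 * X + rng.normal(0.0, 1.0, size=X.shape)  # noisy targets, true slope 2

def fit_base_learner(Xs, ys):
    # Toy base learner: least-squares slope through the origin.
    return np.dot(Xs, ys) / np.dot(Xs, Xs)

T = 20
slopes = []
for _ in range(T):
    # Bootstrap resample: draw len(X) indices with replacement.
    # Each iteration is independent of the others, so in principle
    # the T base learners could be trained in parallel.
    idx = rng.integers(0, len(X), size=len(X))
    slopes.append(fit_base_learner(X[idx], y[idx]))

H = float(np.mean(slopes))                 # average the T base learners
print(H)                                   # should be close to the true slope 2.0
```

Boosting, by contrast, could not be written this way: each round would need the residuals or sample weights produced by the previous round, forcing a sequential loop.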
3.Combination Strategies
Common strategies for combining the base learners' outputs include the following:
- Averaging
For numerical outputs, the most common combination strategy is averaging:
$$H(x)=\frac{1}{T}\sum_{i=1}^{T}h_{i}(x)$$
where $h_{i}(x)$ is the output of base learner $i$, $H(x)$ is the output of the final ensemble, and $T$ is the number of base learners.
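As a quick numeric sketch of the averaging rule, suppose $T=3$ hypothetical base learners each predict values for the same three inputs (the numbers below are made up for illustration):

```python
import numpy as np

# Row i holds h_i(x) evaluated at three inputs, for T = 3 base learners.
h_outputs = np.array([
    [2.9, 5.1, 7.0],   # h_1(x)
    [3.1, 4.8, 7.2],   # h_2(x)
    [3.0, 5.0, 6.8],   # h_3(x)
])

# H(x) = (1/T) * sum_i h_i(x): average over the T base learners.
H = h_outputs.mean(axis=0)
print(H)
```

Averaging shrinks the idiosyncratic errors of the individual learners, which is exactly the variance-reduction effect described in the official text above.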