Model Ensembles: Introductory Notes on Ensemble Learning

https://www.mql5.com/en/articles/4227

https://www.mql5.com/en/articles/4228

https://www.mql5.com/en/articles/4722

https://blog.csdn.net/zwqjoy/article/details/80431496

https://www.jianshu.com/p/11083abc5738

Ensemble Learning

A research area active since roughly 1995.

1. Classifier ensembles

1.1 How is an ensemble model defined?

What is a classifier ensemble? In short: a set of individually trained base classifiers whose predictions are combined (for example, by voting or averaging) into a single decision, usually more accurate and more stable than that of any single member.

1.2 Is this approach sound?

In “Multiple Classifier Combination: Lessons and Next Steps”, published in 2002, Tin Kam Ho wrote:

“Instead of looking for the best set of features and the best classifier, now we look for the best set of classifiers and then the best combination method. One can imagine that very soon we will be looking for the best set of combination methods and then the best way to use them all. If we do not take the chance to review the fundamental problems arising from this challenge, we are bound to be driven into such an infinite recurrence, dragging along more and more complicated combination schemes and theories, and gradually losing sight of the original problem.”

In short: the danger is being drawn into an endless cycle of ever more complicated combination schemes while gradually losing sight of the original problem.

2. Angles of approach

Idea: the shift toward ensemble models takes place at several levels, considered below.

2.1 Combiner

  • Non-trainable. An example of such a method is simple “majority voting”.

  • Trainable. This group includes “weighted majority voting” and “Naive Bayes”, as well as the “classifier selection” approach, where the decision on a given object is made by a single classifier chosen from the ensemble.

  • Meta classifier. The outputs of the base classifiers are treated as inputs to a new classifier that is trained to act as the combiner. This approach is called “stacked generalization”, “generalization through training”, or simply “stacking”. Building a training set for the meta classifier is one of the main problems with this combiner.
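
To make the first two combiner types concrete, here is a minimal NumPy sketch (the prediction matrix and the weights are illustrative stand-ins, not values from the source articles): simple majority voting needs no training, while weighted majority voting would normally learn its weights, e.g. from validation accuracy.

```python
import numpy as np

# Hypothetical label outputs of 3 base classifiers on 4 objects (classes 0/1).
preds = np.array([[1, 0, 1, 1],
                  [1, 1, 0, 1],
                  [0, 0, 1, 1]])

# Non-trainable combiner: simple majority voting over the classifiers.
majority = (preds.sum(axis=0) > preds.shape[0] / 2).astype(int)

# Trainable combiner: weighted majority voting; the weights would
# normally be learned (illustrative values here).
weights = np.array([0.5, 0.3, 0.2])
weighted = (weights @ preds > weights.sum() / 2).astype(int)

print(majority)   # [1 0 1 1]
print(weighted)   # [1 0 1 1]
```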

2.2 Diversity of the base classifiers

How can differences between the ensemble members be generated? The following options are suggested (a code sketch follows the list).

  • Manipulate the training parameters. Use different approaches and parameters when training the individual base classifiers. For example, the neuron weights in the hidden layers of each base classifier’s neural network can be initialized with different random values; hyperparameters can also be set randomly.

  • Manipulate the samples — take a custom bootstrap sample from the training set for each member of the ensemble.

  • Manipulate the predictors — prepare a custom set of randomly determined predictors for each base classifier. This is the so-called vertical split of the training set.
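
A minimal sketch of the first and third options, assuming scikit-learn decision trees as base classifiers and a synthetic dataset (all sizes and parameter ranges below are illustrative); the bootstrap option is sketched later, under bagging.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

members = []
for _ in range(10):
    # Manipulate the training parameters: randomly vary a hyperparameter.
    depth = int(rng.integers(2, 8))
    # Manipulate the predictors: a random feature subset (vertical split).
    cols = rng.choice(X.shape[1], size=8, replace=False)
    clf = DecisionTreeClassifier(max_depth=depth).fit(X[:, cols], y)
    members.append((clf, cols))

# Each member sees different features and uses a different depth:
# exactly the diversity the ensemble needs.
```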

2.3 Ensemble size

How is the number of classifiers in an ensemble determined? Is the ensemble built by training the required number of classifiers all at once, or iteratively, by adding/removing classifiers? Possible options (a selection sketch follows the list):

  • The number is fixed in advance
  • The number is determined in the course of training (e.g. boosting)
  • Classifiers are overproduced and then a subset is selected
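
A sketch of the overproduce-then-select strategy, assuming scikit-learn trees and a held-out validation set (the pool size of 20 and the selection count of 5 are arbitrary illustrative choices):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, random_state=3)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=3)

# Overproduce: train more classifiers than will be kept.
pool = [DecisionTreeClassifier(max_depth=d).fit(X_tr, y_tr)
        for d in range(1, 21)]

# Select: keep the 5 members with the best validation accuracy.
scores = [m.score(X_val, y_val) for m in pool]
keep = [pool[i] for i in np.argsort(scores)[-5:]]
```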

3. Mainstream ensemble methods: bagging / boosting / stacking

3.1 Motivation

Variance / bias / predictive performance (end to end)

3.2 The resulting ensemble methods

  • reduce variance — bagging;

  • reduce bias — boosting;

  • improve predictions — stacking.

  • parallel ensemble methods, where the base models are generated in parallel (for example, a random forest). The idea is to exploit the independence between the base models and to reduce the error by averaging. Hence the main requirement for the models: low mutual correlation and high diversity (the variance formula after this list makes this precise).

  • sequential ensemble methods, where the base models are generated sequentially (for example, AdaBoost, XGBoost). The main idea here is to exploit the dependence between the base models: the overall quality can be increased by assigning higher weights to examples that were previously misclassified.
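
A standard textbook identity (not taken from the articles above) makes the low-correlation requirement precise. For $n$ base models, each with variance $\sigma^2$ and pairwise correlation $\rho$, the variance of their average is

$$\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} f_i\right) = \rho\,\sigma^2 + \frac{1-\rho}{n}\,\sigma^2 .$$

As $n$ grows, the second term vanishes but the first does not: $\rho\sigma^2$ is the floor on what averaging can achieve, which is why parallel methods demand low mutual correlation between members.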

3.3 Bagging

Classifiers trained in parallel -> outputs combined

  1. a bootstrap sample is drawn from the training set;

  2. each classifier is trained on its own sample;

  3. the individual outputs of the separate classifiers are combined into one class label. If the individual outputs have the form of a class label, simple majority voting is used. If the output of the classifiers is a continuous variable, then either averaging is applied, or the variable is first converted into a class label and simple majority voting follows.
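
A from-scratch sketch of these three steps, assuming scikit-learn decision trees and a synthetic binary dataset (the ensemble size of 25 is arbitrary):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=500, random_state=42)

# Steps 1 and 2: one bootstrap sample and one classifier per member.
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))   # sample with replacement
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Step 3: combine the label outputs by simple majority voting.
votes = np.array([t.predict(X) for t in trees])  # shape (25, 500)
y_pred = (votes.mean(axis=0) > 0.5).astype(int)
```

scikit-learn's BaggingClassifier packages this same procedure as a single estimator.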

3.4 Boosting

Boosting refers to a family of algorithms that combine weak learners into a strong learner.

The main principle of boosting is to train a series of weak learners, i.e. models only slightly better than random guessing (for example, small decision trees), on weighted data: at each stage, the examples misclassified earlier receive larger weights.

  • First, train a base learner from the initial training set;
  • Then, adjust the distribution of the training samples according to this base learner's performance, so that the samples it misclassified receive more attention later;
  • Train the next base learner on the adjusted sample distribution;
  • Repeat these steps until the number of base learners reaches a preset value T, and finally combine the T base learners with weights.
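
The classic instance of this loop is AdaBoost. A minimal sketch under the usual {-1, +1} label convention, with depth-1 trees ("stumps") as the weak learners (the dataset and T below are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=1)
y = 2 * y - 1                        # AdaBoost convention: labels in {-1, +1}

n, T = len(X), 20
w = np.full(n, 1 / n)                # start with uniform sample weights
stumps, alphas = [], []

for _ in range(T):
    # Weak learner trained on the current weighted data.
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = w[pred != y].sum()                           # weighted error
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))  # this learner's weight
    # Misclassified samples get larger weight in the next round.
    w *= np.exp(-alpha * y * pred)
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

# Final strong learner: weighted combination of the T weak learners.
y_pred = np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))
```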

3.5 Stacking

The meta model is trained on the outputs of the base models as its input features, somewhat like a food chain: the base level feeds the level above it.

The base models usually involve different learning algorithms, so stacking is typically a heterogeneous ensemble.
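
A minimal stacking sketch, assuming scikit-learn models (the choice of base and meta algorithms is illustrative). Out-of-fold predictions are the usual answer to the meta-training-set problem noted in section 2.1: the meta model only ever sees predictions on data the base model was not fitted on, which avoids leakage.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=2)

# Heterogeneous base level: different learning algorithms.
base_models = [DecisionTreeClassifier(max_depth=3), KNeighborsClassifier()]

# Out-of-fold class probabilities become the meta model's features.
meta_X = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

meta_model = LogisticRegression().fit(meta_X, y)
```

scikit-learn's StackingClassifier wraps this same pattern as a single estimator.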

3.6 Comparison of the ensemble methods

  • Bagging: e.g. random forest; mainly reduces variance.

  • Boosting: e.g. AdaBoost; mainly corrects bias.

  • Stacking: e.g. Model/Model/Model/… -> XGBoost/Neural Network/AdaBoost/… -> final model; improves predictions overall.
