Ensemble

Averaging

Linear blending of two models with significantly different behavior.

1) Simple average of multiple models' outputs

2) Weighted average of multiple models; the weights can be determined by linear blending on a validation set

3) Conditional averaging: pick different models under different conditions (all three are sketched below)
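
A minimal numpy sketch of all three (hypothetical arrays: pred1/pred2 are two diverse models' test predictions, pred1_val/pred2_val/y_val are their validation counterparts, and x is a feature used for the conditional case):

import numpy as np

# 1) simple average of two models' outputs
pred_avg = (pred1 + pred2) / 2

# 2) weighted average; the weight is found by linear blending on validation
ws = np.linspace(0, 1, 101)
mse = [np.mean((w * pred1_val + (1 - w) * pred2_val - y_val) ** 2) for w in ws]
w = ws[int(np.argmin(mse))]
pred_blend = w * pred1 + (1 - w) * pred2

# 3) conditional averaging: trust one model below a threshold, the other above
pred_cond = np.where(x < 0.5, pred1, pred2)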

Bagging

Averaging different versions of the same model. Models trained under different conditions have different biases and variances; a simple average of them reduces the final model's variance and thus improves accuracy.
 
Ways to bag (the bagged models are independent of each other, so training can be parallelized):
  • Change the random seed, so each model starts from different initial parameters
  • Row sampling or bootstrapping, i.e. sampling with replacement
  • Shuffling: shuffle the dataset to make training more robust
  • Column sampling, equivalent to sampling the features of the dataset
  • Number of models: the more models, the better the blend tends to be
 
Random forest is itself a good implementation of the bagging idea. A simple example shows what bagging looks like in code:
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# hypothetical train / y / test arrays
model = RandomForestRegressor()
bags = 10
seed = 1
bagged_prediction = np.zeros(test.shape[0])
for n in range(bags):
    # vary the seed so every bag is a genuinely different version of the model
    model.set_params(random_state=seed + n)
    model.fit(train, y)
    bagged_prediction += model.predict(test)
bagged_prediction /= bags  # simple average over all bags

Boosting

Models are built sequentially: each new model's optimization depends on the performance of the previous ones.
1) Weight-based boosting
Boosting parameters:
  • Learning rate (shrinkage, eta): controls how strongly the sample weights are updated
  • Number of models: estimators
  • Input model: the base learner
  • AdaBoost is the classic example (see the sketch below)
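
A minimal sketch of weight-based boosting via sklearn's AdaBoost (assuming scikit-learn >= 1.2, where the base learner is passed as estimator; X_train/y_train/X_test are hypothetical):

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # input model: a weak learner
    n_estimators=100,                               # number of models
    learning_rate=0.5,                              # shrinkage / eta
    random_state=1,
)
ada.fit(X_train, y_train)   # reweights misclassified samples each round
preds = ada.predict(X_test)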
2) Residual-based boosting (a minimal sketch follows this list)
Boosting parameters:
  • Learning rate (shrinkage, eta): scales each new model's contribution
  • Number of models: estimators
  • Row sampling
  • Column sampling
  • Input model: the base learner
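
A hand-rolled sketch of residual-based boosting with squared loss (hypothetical X_train/y_train/X_test arrays): each tree fits the residuals the current ensemble still gets wrong, and the learning rate shrinks its contribution:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

eta = 0.1           # learning rate / shrinkage
n_estimators = 100
train_pred = np.zeros(len(y_train))
test_pred = np.zeros(len(X_test))
for _ in range(n_estimators):
    residual = y_train - train_pred           # what is left to explain
    tree = DecisionTreeRegressor(max_depth=3)
    tree.fit(X_train, residual)
    train_pred += eta * tree.predict(X_train)
    test_pred += eta * tree.predict(X_test)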
 
Typical implementations include:
  • xgboost
  • lightgbm
  • catboost
  • sklearn's GBM
  • H2O's GBM

Stacking

Usually a two-level stack of models.
Handling time-series data: splits for the meta model must respect temporal order (see the tips below).
 
 
Generating model diversity
Stacking will find out when a model is good and when a model is actually bad or fairly weak, so you don't need to worry too much about making every model really strong; stacking can extract the juice from each prediction. What you really need to focus on is: am I making a model that brings in some information, even if it is generally weak? This is true in practice: there have been many situations where I had quite weak models in my ensemble (compared to the top performers), and they nevertheless added a lot of value in stacking, because they brought in new information that the meta model could leverage.
Normally, you introduce diversity in two ways:
1) Choose a different algorithm. This makes sense: different algorithms capitalize on different relationships within the data. For example, a linear model focuses on linear relationships, while a non-linear model can better capture non-linear relationships, so their predictions will come out somewhat different.
2) Run the same model on different transformations of the input data: fewer features, or a completely different transformation. For example, in one dataset you may one-hot encode the categorical features; in another you may use label encoding, and the result will probably be a very different model.
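
Putting the pieces together, here is a minimal sketch of the classic two-level scheme with out-of-fold predictions (hypothetical numpy arrays X, y, X_test):

import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge, LinearRegression

base_models = [RandomForestRegressor(random_state=1), Ridge()]  # diverse level-1 models
# for time-sensitive data, replace the shuffled KFold with time-ordered splits
kf = KFold(n_splits=5, shuffle=True, random_state=1)

meta_train = np.zeros((len(X), len(base_models)))    # out-of-fold predictions
meta_test = np.zeros((len(X_test), len(base_models)))
for j, model in enumerate(base_models):
    for tr_idx, val_idx in kf.split(X):
        model.fit(X[tr_idx], y[tr_idx])
        meta_train[val_idx, j] = model.predict(X[val_idx])
    model.fit(X, y)                                   # refit on all data for the test set
    meta_test[:, j] = model.predict(X_test)

meta_model = LinearRegression()                       # keep the meta model modest
meta_model.fit(meta_train, y)
final_pred = meta_model.predict(meta_test)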
 
Common strategies
  • With time-sensitive data, respect time
  • Diversity is as important as performance
  • Diversity may come from:
    • Different algorithms
    • Different input features
  • Performance plateaus after N models
  • The meta model is normally modest (the top-level stacking model does not need to be complex)

StackNet

Multi-level stacking: 3 or 4 layers of different models stacked on top of each other.
 
In summary: use a large number of different models, build different features, train different models on different feature groups to obtain diverse results, then stack those results layer by layer in multi-level training to produce the final output.
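
A compact sketch of the multi-level idea, where each level's out-of-fold predictions become the next level's features (hypothetical X, y, X_test arrays; cross_val_predict does the out-of-fold step):

import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge, LinearRegression

def stack_level(models, X, y, X_test):
    # out-of-fold predictions on train, full-refit predictions on test
    oof = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in models])
    test = np.column_stack([m.fit(X, y).predict(X_test) for m in models])
    return oof, test

level1 = [RandomForestRegressor(random_state=1), GradientBoostingRegressor(), Ridge()]
level2 = [Ridge(), RandomForestRegressor(max_depth=3, random_state=1)]

X1, T1 = stack_level(level1, X, y, X_test)          # level 1
X2, T2 = stack_level(level2, X1, y, T1)             # level 2 stacks level-1 outputs
final = LinearRegression().fit(X2, y).predict(T2)   # simple model at the top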
 
Example 1

Example 2:

Tips About StackNet
  • Supports stacking many different models
  • You can use classifiers in regression problems and vice versa (very useful)
  • Remember: the deeper the level, the simpler the models should be
 
Typical model choices
First-level models
Second-level models