什么是Boosting算法——Adaptive Boosting (AdaBoost) 与Gradient Boosting详解

最新推荐文章于 2024-03-15 08:29:13 发布

WWtianxiang

最新推荐文章于 2024-03-15 08:29:13 发布

阅读量1.7k

点赞数 1

分类专栏：机器学习文章标签： python 机器学习

本文链接：https://blog.csdn.net/qq_34565684/article/details/105493115

版权

本文详细介绍了Boosting算法，包括Adaboost和Gradient Boosting。Adaboost通过调整数据点权重来提升弱分类器，而Gradient Boosting则通过拟合残差来改进模型。文章还提供了Python实现示例。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

什么是Boosting：

与许多ML模型专注于由单个模型来完成高质量预测不同， Boosting算法试图通过训练一系列弱模型来提高预测能力，每个模型都可以弥补其前辈的弱点。
在这里插入图片描述
图片来自https://towardsdatascience.com/boosting-algorithms-explained-d38f56ef3f30

要了解Boosting，至关重要的是要认识到Boosting 是一种通用算法，而不是特定模型。Boosting需要你指定一个弱模型(e.g. regression, shallow decision trees, etc)，然后提升它。

1.Adaboost

1.1 弱点的定义

AdaBoost是针对分类问题开发的特定Boosting算法（也称为离散AdaBoost）。弱点由弱估计器的错误率确定：
在这里插入图片描述
在每次迭代中，AdaBoost都会识别分类错误的数据点，从而增加其权重（从某种意义上讲减少正确点的权重），以便下一个分类器将更加重视以使其正确无误。下图解释了权重如何影响一个简单决策树（深度为1）的性能：
在这里插入图片描述
现在定义了弱点，下一步是弄清楚如何组合模型序列以使整体得到加强。

1.2 伪代码

这里给出SAMME，是一种处理多分类问题的特定方法。在这里插入图片描述
AdaBoost训练一系列增强采样权重（augmented sample weights）的模型，并基于误差为各个分类器生成“置信”系数Alpha。低错误会有较大的Alpha，这意味着在最终分类结果投票中该分类器的重要性更高。

1.3 python实现

from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
X,Y = make_classification(n_samples=100, n_features=2, n_informative=2,
                          n_redundant=0, n_repeated=0, random_state=102)
clf = AdaBoostClassifier(n_estimators=4, random_state=0, algorithm='SAMME')
clf.fit(X, Y)

parameters:

base_estimator : object, optional (default=None)
The base estimator from which the boosted ensemble is built. If None, then the base estimator is DecisionTreeClassifier(max_depth=1) ——弱分类器模型
n_estimators : integer, optional (default=50)
The maximum number of estimators at which boosting is terminated. In case of perfect fit, the learning procedure is stopped early. —— 弱分类器最大数量，为了防止过拟合，训练过程会采用early stopping
learning_rate : float, optional (default=1.)
Learning rate shrinks the contribution of each classifier by learning_rate. —— 学习率
algorithm : {‘SAMME’, ‘SAMME.R’}, optional (default=’SAMME.R’) ——boosting算法
If ‘SAMME.R’ then use the SAMME.R real boosting algorithm. base_estimator must support calculation of class probabilities. If ‘SAMME’ then use the SAMME discrete boosting algorithm.
random_state : int, RandomState instance or None, optional (default=None)

2.Gradient Boosting

2.1 定义弱点

Gradient Boosting对问题的处理方式有所不同。Gradient Boosting不着重于调整数据点的权重，而是着眼于预测与ground truth之间的差异。
In Gradient Boosting,“shortcomings” are identified by gradients.
In Adaboost,“shortcomings” are identified by high-weight data points
在GB中，弱点由梯度定义，在Adaboost中，弱点由高权重数据定义。
Let’s play a game…
你有(x1,y1),(x2,y2),…(xn,yn),让你拟合一个函数F(x)来最小化均方误差。假设你的朋友想帮助你，他给你一个模型F，你检查这个模型，发现模型很好，但还不够完美，有一些误差： F(x1) = 0.8, while y1 = 0.9, and F(x2) = 1.4 while y2 = 1.3… 如何提升你的模型?
游戏规则：

你不能改变模型F
你可以增加一个额外的模型（回归树）h，这样新的预测会变成F(x)+h(x)

在这里插入图片描述

回归树h也许不能完美拟合这个误差，但它可以减小这个误差。
yi − F(xi) 称为’残差‘.。残差的存在导致了模型F不能work的很好，h的存在就是为了补偿F的弱点，如果新的模型 F+h 仍然不能满足，我们可以再加一个回归树h。

然而这和梯度有什么关系呢？？
什么是梯度下降？
在这里插入图片描述
设损失函数是均方误差函数，则

所以，’残差‘就是负的梯度，用h来拟合残差，就是用h来拟合负梯度

在这里插入图片描述

给定一个初始化的模型F，计算损失函数关于模型预测值F(x)的倒数，得到负梯度，用负梯度作为h与模型F融合，这里的损失函数可以是任意损失函数。

在这里插入图片描述

python实现

from sklearn.ensemble import GradientBoostingRegressor
model = GradientBoostingRegressor(n_estimators=3,learning_rate=1)
model.fit(X,Y)

# for classification
from sklearn.ensemble import GradientBoostingClassifier
model = GradientBoostingClassifier()
model.fit(X,Y)