[scikit-learn] The sklearn.ensemble.AdaBoostRegressor class: an adaptive boosting regressor

Article link: https://blog.csdn.net/u013172930/article/details/146337825

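`sklearn.ensemble.AdaBoostRegressor` (adaptive boosting regressor)

AdaBoostRegressor is the adaptive boosting (AdaBoost) regression model provided by sklearn.ensemble. It combines multiple weak regressors (DecisionTreeRegressor by default), fitting them in sequence and adjusting the sample weights after each round so that later regressors concentrate on the samples that were predicted poorly. It is suited to nonlinear regression tasks.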

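1. What `AdaBoostRegressor` does

Used for regression tasks (for example, house price prediction or energy consumption forecasting).
Builds on multiple weak regressors (DecisionTreeRegressor(max_depth=3) by default) and, round by round, increases the weight of the samples with large prediction errors.
Suited to small datasets, where it improves the predictive power of simple models.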
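The reweighting idea in the list above can be sketched for a single boosting round. This is a simplified illustration in the style of AdaBoost.R2 with a linear loss, not scikit-learn's actual implementation; the variable names (`w`, `err_norm`, `beta`, `w_new`) are chosen here for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Start with uniform sample weights
w = np.full(len(y), 1.0 / len(y))

# Fit one weak regressor using the current weights
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y, sample_weight=w)

# Linear loss: absolute error scaled into [0, 1]
err = np.abs(tree.predict(X) - y)
err_norm = err / err.max()

# Weighted average loss and the resulting factor beta
avg_loss = np.sum(w * err_norm)
beta = avg_loss / (1.0 - avg_loss)

# When beta < 1, well-predicted samples (small error) are scaled down the most,
# so poorly predicted samples gain relative weight after normalization
w_new = w * beta ** (1.0 - err_norm)
w_new /= w_new.sum()

print("average loss:", avg_loss)
print("weight of best-predicted sample:", w_new[np.argmin(err)])
print("weight of worst-predicted sample:", w_new[np.argmax(err)])
```

The next weak regressor would then be trained with `w_new`, which is how later rounds come to focus on the hard samples.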

2. `AdaBoostRegressor` code example

(1) Training an AdaBoost regressor

from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Generate regression data
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the AdaBoost regression model (base learner: decision tree).
# Note: the parameter is `estimator` since scikit-learn 1.2; the old name
# `base_estimator` was removed in 1.4.
model = AdaBoostRegressor(
    estimator=DecisionTreeRegressor(max_depth=3),
    n_estimators=50,
    learning_rate=1.0,
    random_state=42,
)
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Compute R²
r2 = model.score(X_test, y_test)
print("AdaBoost regressor R²:", r2)

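Explanation

estimator=DecisionTreeRegressor(max_depth=3) (named base_estimator in scikit-learn versions before 1.2): the weak regressor is a decision tree of depth 3, which is also what is used when the parameter is left at its default.
n_estimators=50: use at most 50 weak regressors for boosting; training stops early if a perfect fit is reached.
learning_rate=1.0: the learning rate, which controls the contribution of each weak regressor.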
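To see how `n_estimators` affects the fit, the sketch below tracks the test R² as regressors are added, using `staged_predict`, which yields the ensemble prediction after each boosting iteration. The data settings mirror the example above; the names `staged_model` and `staged_r2` are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

staged_model = AdaBoostRegressor(n_estimators=50, learning_rate=1.0, random_state=42)
staged_model.fit(X_train, y_train)

# One R² value per boosting iteration (fewer than 50 if boosting stopped early)
staged_r2 = [r2_score(y_test, y_stage) for y_stage in staged_model.staged_predict(X_test)]

# Report the score after the first, tenth, and last iteration
for n in sorted({1, min(10, len(staged_r2)), len(staged_r2)}):
    print(f"{n} regressors: test R² = {staged_r2[n - 1]:.4f}")
```

If the curve flattens well before the last iteration, a smaller `n_estimators` would likely give a similar result at lower cost.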

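3. Main parameters of `AdaBoostRegressor`

AdaBoostRegressor(estimator=None, *, n_estimators=50, learning_rate=1.0, loss="linear", random_state=None)

Parameter	Description
`estimator`	Weak regressor (default `None`, which means `DecisionTreeRegressor(max_depth=3)`); named `base_estimator` before scikit-learn 1.2
`n_estimators`	Maximum number of weak regressors (default `50`; larger values give a stronger model at higher computational cost)
`learning_rate`	Learning rate (default `1.0`; smaller values can improve generalization but usually need a larger `n_estimators`)
`loss`	Loss function used when updating the sample weights: `"linear"` (default), `"square"`, or `"exponential"`
`random_state`	Random seed, for reproducible results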
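The three `loss` options in the table can be compared directly. The sketch below fits one model per loss on the same split; the scores depend on the data and seed, so it illustrates the API rather than a general ranking. The `scores` dictionary is an illustrative name.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scores = {}
for loss in ("linear", "square", "exponential"):
    # Same settings each time; only the weight-update loss changes
    loss_model = AdaBoostRegressor(n_estimators=50, loss=loss, random_state=42)
    loss_model.fit(X_train, y_train)
    scores[loss] = loss_model.score(X_test, y_test)
    print(f"loss={loss}: test R² = {scores[loss]:.4f}")
```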

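4. Getting feature importances

feature_importances = model.feature_importances_
print("Feature importances:", feature_importances)

Explanation

feature_importances_ returns the importance of each feature (the larger the value, the more important the feature). The data in this article has a single feature, so the array has only one entry; the attribute is more informative on multi-feature data.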
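Feature importances are easier to interpret with several features. The sketch below uses an assumed synthetic setup (5 features, 2 of them informative) and prints each importance next to the true coefficient returned by `make_regression(..., coef=True)`; the names `X_multi`, `multi_model`, and `importances` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

# 5 features, only 2 of which actually influence y (assumed setup for illustration)
X_multi, y_multi, coef = make_regression(
    n_samples=200, n_features=5, n_informative=2,
    noise=10, coef=True, random_state=42,
)

multi_model = AdaBoostRegressor(n_estimators=50, random_state=42)
multi_model.fit(X_multi, y_multi)

# Print features from most to least important, next to the generating coefficient
importances = multi_model.feature_importances_
for i in np.argsort(importances)[::-1]:
    print(f"feature {i}: importance={importances[i]:.3f}, true coef={coef[i]:.1f}")
```

Features with a zero true coefficient should receive importances close to zero, although noise can give them small nonzero values.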

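5. Evaluating model performance

from sklearn.metrics import mean_squared_error, r2_score

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Mean squared error (MSE):", mse)
print("Coefficient of determination (R²):", r2)

Explanation

MSE (mean squared error): the smaller the value, the better the fit.
R² (coefficient of determination): 1 means a perfect fit; 0 means the model explains no more than always predicting the mean of y, and the value can be negative for a model that does worse than that.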
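Both metrics can be reproduced by hand, which makes the definitions concrete. The small arrays below are made-up values for illustration, and the names `mse_manual`, `r2_manual`, and `r2_mean` are not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

# MSE: mean of the squared residuals
mse_manual = np.mean((y_true - y_hat) ** 2)

# R²: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

# A model that always predicts the mean of y has R² = 0
y_mean = np.full_like(y_true, y_true.mean())
r2_mean = r2_score(y_true, y_mean)

print("MSE (manual vs sklearn):", mse_manual, mean_squared_error(y_true, y_hat))
print("R² (manual vs sklearn):", r2_manual, r2_score(y_true, y_hat))
print("R² of the mean predictor:", r2_mean)
```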

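6. `AdaBoostRegressor` vs. `GradientBoostingRegressor`

Model	How it works	Main difference
`AdaBoostRegressor`	Fits weak regressors in sequence, increasing the weight of poorly predicted samples after each round	The base learner can be any regressor (by default a depth-3 decision tree); the final prediction is a weighted median of the regressors' predictions
`GradientBoostingRegressor`	Fits each new tree to the residual errors (the negative gradient of the loss) of the current ensemble	The base learner is always a regression tree (also depth 3 by default); it often reaches higher accuracy, though this depends on the data and tuning

Example

from sklearn.ensemble import GradientBoostingRegressor

gbr = GradientBoostingRegressor(n_estimators=50, learning_rate=0.1, max_depth=3, random_state=42)
gbr.fit(X_train, y_train)

print("AdaBoost regressor R²:", model.score(X_test, y_test))
print("Gradient boosting regressor R²:", gbr.score(X_test, y_test))

Explanation

AdaBoost reweights the training samples and works best with shallow weak regressors. Gradient boosting (GBDT) instead fits each tree to the residuals of the current ensemble and also normally uses shallow trees (max_depth=3 by default), so tree depth is not what separates the two methods.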
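A single 20-sample test split gives a noisy comparison. As a hedged alternative, the sketch below compares the two models with 5-fold cross-validation via `cross_val_score`; the fold count and the `candidates` and `cv_scores` dictionaries are choices made here, not part of the original example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

candidates = {
    "AdaBoost": AdaBoostRegressor(n_estimators=50, random_state=42),
    "GradientBoosting": GradientBoostingRegressor(
        n_estimators=50, learning_rate=0.1, max_depth=3, random_state=42
    ),
}

cv_scores = {}
for name, est in candidates.items():
    # One R² value per fold; the mean and spread are more reliable than one split
    cv_scores[name] = cross_val_score(est, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R² = {cv_scores[name].mean():.4f} "
          f"(std = {cv_scores[name].std():.4f})")
```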

7. Effect of `learning_rate` on the model

# Refit the model with several learning rates and compare the test R²
learning_rates = [0.01, 0.1, 1.0, 2.0]
for lr in learning_rates:
    model = AdaBoostRegressor(n_estimators=50, learning_rate=lr, random_state=42)
    model.fit(X_train, y_train)
    print(f"learning_rate={lr}, test R²={model.score(X_test, y_test)}")
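Because `learning_rate` and `n_estimators` interact, they are usually tuned together, and on the training data rather than against the test set. The sketch below is one possible approach using `GridSearchCV`; the grid values and the 3-fold setting are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Illustrative grid: every combination is evaluated with cross-validation
param_grid = {
    "learning_rate": [0.01, 0.1, 1.0],
    "n_estimators": [50, 100],
}

search = GridSearchCV(
    AdaBoostRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="r2",
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("best cross-validated R²:", search.best_score_)
# The test set is used only once, to score the refitted best model
print("test R² of best model:", search.score(X_test, y_test))
```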