GBDT: Regression

For the classification case, see GBDT: Classification.

The Gradient Boost algorithm

LS_TreeBoost
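
As a reference point for the source code discussed later, here is a sketch of the least-squares case in its standard textbook form (following Friedman's notation). The symbols are assumptions introduced here: $\nu$ is the learning rate, $J$ the number of leaves per tree, $R_{jm}$ the $j$-th leaf region of the $m$-th tree, and $\tilde{y}_i$ the pseudo-residual.

```latex
\begin{aligned}
&F_0(x) = \bar{y} \\
&\text{for } m = 1, \dots, M: \\
&\qquad \tilde{y}_i = y_i - F_{m-1}(x_i), \quad i = 1, \dots, N \\
&\qquad \{R_{jm}\}_{j=1}^{J} = \text{$J$-leaf regression tree fitted to } \{(x_i, \tilde{y}_i)\}_{i=1}^{N} \\
&\qquad \gamma_{jm} = \operatorname{mean}_{x_i \in R_{jm}} \tilde{y}_i \\
&\qquad F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J} \gamma_{jm}\, \mathbf{1}(x \in R_{jm})
\end{aligned}
```

For squared loss the leaf means $\gamma_{jm}$ are exactly what a regression tree fitted to the residuals already predicts, so no separate leaf update is needed.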

LAD_TreeBoost
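
For comparison, a sketch of the least-absolute-deviation case in the same standard form. As before, $\nu$, $J$, $R_{jm}$ and $\tilde{y}_i$ are notation assumed here for illustration.

```latex
\begin{aligned}
&F_0(x) = \operatorname{median}\{y_i\}_{i=1}^{N} \\
&\text{for } m = 1, \dots, M: \\
&\qquad \tilde{y}_i = \operatorname{sign}\bigl(y_i - F_{m-1}(x_i)\bigr), \quad i = 1, \dots, N \\
&\qquad \{R_{jm}\}_{j=1}^{J} = \text{$J$-leaf regression tree fitted to } \{(x_i, \tilde{y}_i)\}_{i=1}^{N} \\
&\qquad \gamma_{jm} = \operatorname{median}_{x_i \in R_{jm}} \bigl\{ y_i - F_{m-1}(x_i) \bigr\} \\
&\qquad F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J} \gamma_{jm}\, \mathbf{1}(x \in R_{jm})
\end{aligned}
```

Unlike the least-squares case, the tree is fitted to the signs of the residuals, so the leaf values have to be recomputed as medians of the actual residuals in each leaf.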

Reading the sklearn source

sklearn.ensemble.GradientBoostingRegressor(loss='ls', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, min_impurity_split=None, init=None, random_state=None, max_features=None, alpha=0.9, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto', validation_fraction=0.1, n_iter_no_change=None, tol=0.0001)
loss          : {'ls', 'lad', 'huber', 'quantile'} | loss function to be optimized;
learning_rate : (default=0.1) | shrinks the contribution of each tree by learning_rate;
n_estimators  : (default=100) | the number of boosting stages to perform;
subsample     : (default=1.0) | the fraction of samples to be used for fitting the individual base learners;
criterion     : {'friedman_mse', 'mse', 'mae'} | the function to measure the quality of a split;
  •  LeastSquaresError
class LeastSquaresError(RegressionLossFunction):
    """
    Loss function for least squares (LS) estimation.
    Terminal regions need not to be updated for least squares.
    """

    def init_estimator(self):
        ''' Initialize F0 '''
        return MeanEstimator()

    def __call__(self, y, pred, sample_weight=None):
        ''' Compute the current loss '''
        if sample_weight is None:
            return np.mean((y - pred.ravel()) ** 2.0)
        else:
            return (1.0 / sample_weight.sum() *
                    np.sum(sample_weight * ((y - pred.ravel()) ** 2.0)))

    def negative_gradient(self, y, pred, **kargs):
        ''' Compute the negative gradient '''
        return y - pred.ravel()

    def update_terminal_regions(self, tree, X, y, residual, y_pred,
                                sample_weight, sample_mask,
                                learning_rate=0.1, k=0):
        ''' Update Fm(x) '''
        y_pred[:, k] += learning_rate * tree.predict(X).ravel()

  The MeanEstimator class used to initialize F0 is as follows:

class MeanEstimator:
    def fit(self, X, y, sample_weight=None):
        ''' For mse, use the mean as the initial value of F0 '''
        if sample_weight is None:
            self.mean = np.mean(y)
        else:
            self.mean = np.average(y, weights=sample_weight)

    def predict(self, X):
        """ Predict labels """
        check_is_fitted(self, 'mean')
        y = np.empty((X.shape[0], 1), dtype=np.float64)
        y.fill(self.mean)
        return y

Worked example

x   1     2     3     4     5     6     7     8     9     10
y   5.56  5.70  5.91  6.40  6.80  7.05  8.90  8.70  9.00  9.05
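
To make the data above concrete, here is a minimal pure-Python sketch of least-squares boosting on these ten points. It mirrors the pieces of the source shown earlier (mean as F0, residual as negative gradient, additive update), but it is not sklearn's implementation: the depth-1 trees, the helper names `fit_stump`, `predict_stump` and `boost`, and the choices `n_estimators=6` and `learning_rate=1.0` are all assumptions made for illustration.

```python
# Illustrative sketch only: least-squares boosting with depth-1 trees (stumps).
X = list(range(1, 11))
Y = [5.56, 5.70, 5.91, 6.40, 6.80, 7.05, 8.90, 8.70, 9.00, 9.05]


def mean(values):
    return sum(values) / len(values)


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    cut_points = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive x values.
    for lo, hi in zip(cut_points, cut_points[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        c_left, c_right = mean(left), mean(right)
        sse = (sum((r - c_left) ** 2 for r in left)
               + sum((r - c_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, c_left, c_right)
    return best[1:]


def predict_stump(stump, x):
    threshold, c_left, c_right = stump
    return c_left if x <= threshold else c_right


def boost(xs, ys, n_estimators=6, learning_rate=1.0):
    """Return (F0, list of stumps, training MSE after each stage)."""
    f0 = mean(ys)                      # F0: mean of y, as in MeanEstimator
    preds = [f0] * len(ys)
    stumps, losses = [], []
    for _ in range(n_estimators):
        # Negative gradient of squared loss is the plain residual y - F(x).
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        # Additive update: Fm = Fm-1 + learning_rate * tree prediction.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
        stumps.append(stump)
        losses.append(mean([(y - p) ** 2 for y, p in zip(ys, preds)]))
    return f0, stumps, losses


if __name__ == "__main__":
    f0, stumps, losses = boost(X, Y)
    print("F0 =", f0)
    for m, (stump, loss) in enumerate(zip(stumps, losses), start=1):
        print("stage", m, "stump", stump, "train MSE", loss)
```

On this data the first stump splits at x = 6.5, separating the first six points from the last four, and the training MSE does not increase from one stage to the next. With a smaller `learning_rate` each stage corrects only part of the residual, so more stages are needed to reach the same training error.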