GBDT (Gradient Boosting Decision Tree)
The core idea of GBDT is to start from a weak model; each subsequent model then fits the gap between the target and the combined predictions of all previous models (continually patching up the errors), so the approximation improves with every iteration. Below we illustrate GBDT for regression and for classification:
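Before the formal algorithm, the "keep fitting the residual" idea can be illustrated with plain numbers (a toy sketch; the halving factor standing in for a weak learner is made up for illustration):

```python
# Toy illustration of the "keep fitting the residual" idea.
# Three targets, learning rate 1 for simplicity.
y = [10.0, 20.0, 30.0]

# Weak model 0: predict the mean of y
F0 = sum(y) / len(y)                            # 20.0
residual1 = [yi - F0 for yi in y]               # [-10.0, 0.0, 10.0]

# Model 1 tries to predict residual1; suppose it gets halfway there
h1 = [r * 0.5 for r in residual1]
F1 = [F0 + h for h in h1]                       # [15.0, 20.0, 25.0]
residual2 = [yi - fi for yi, fi in zip(y, F1)]  # [-5.0, 0.0, 5.0]

# Each round the residuals shrink, so the ensemble approaches y
print(F1)         # [15.0, 20.0, 25.0]
print(residual2)  # [-5.0, 0.0, 5.0]
```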
GBDT for regression
Input: Data $\{(x_i, y_i)\}_{i=1}^{n}$ and a differentiable loss function $L(y_i, F(x))$

Step 1: Initialize the model with a constant value: $F_0(x) = \underset{\gamma}{\operatorname{argmin}} \sum_{i=1}^{n} L(y_i, \gamma)$

Step 2: For $m = 1$ to $M$:

- (A) Compute the pseudo-residuals $r_{im} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x) = F_{m-1}(x)}$ for $i = 1, \ldots, n$
- (B) Fit a regression tree to the $r_{im}$ values and create terminal regions $R_{jm}$, for $j = 1, \ldots, J_m$
- (C) For $j = 1, \ldots, J_m$ compute $\gamma_{jm} = \underset{\gamma}{\operatorname{argmin}} \sum_{x_i \in R_{jm}} L(y_i, F_{m-1}(x_i) + \gamma)$
- (D) Update $F_m(x) = F_{m-1}(x) + \nu \sum_{j=1}^{J_m} \gamma_{jm} I(x \in R_{jm})$
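The steps above can be sketched in code. This is a minimal illustration, not a production implementation: it assumes the squared-error loss $L = \tfrac{1}{2}(y - F)^2$, for which the pseudo-residual in step (A) is simply $y_i - F_{m-1}(x_i)$ and the optimal leaf value $\gamma_{jm}$ in step (C) is the mean residual in each leaf, which a fitted regression tree's prediction already provides. The function name `gbdt_regression` and its parameters are chosen here for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbdt_regression(X, y, M=100, nu=0.1, max_depth=2):
    """Minimal GBDT for regression with squared-error loss."""
    # Step 1: constant model; for squared error the argmin is the mean of y
    F0 = y.mean()
    F = np.full(len(y), F0)
    trees = []
    for m in range(M):
        # (A) pseudo-residuals (negative gradient of the squared-error loss)
        r = y - F
        # (B) + (C): fit a regression tree to the residuals; for squared
        # error, each leaf's mean residual is exactly gamma_jm
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, r)
        trees.append(tree)
        # (D) update the ensemble, shrunk by the learning rate nu
        F = F + nu * tree.predict(X)

    def predict(X_new):
        return F0 + nu * sum(t.predict(X_new) for t in trees)

    return predict
```

With shallow trees and a small learning rate $\nu$, each round removes only a fraction of the remaining residual, which is exactly the slow "patching" behavior described above.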