Introduction
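GBDT stands for Gradient Boosting Decision Tree, a method proposed by Friedman. GBDT belongs to the Boosting family of ensemble learning, but it differs considerably from traditional AdaBoost. AdaBoost uses the error rate of the weak learner from the previous iteration to update the weights of the training set. GBDT is also iterative and uses the forward stagewise algorithm, but its weak learners are restricted to CART regression trees, and its iteration scheme also differs from that of AdaBoost.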
AdaBoost review
$$u_n^{(t+1)} = \begin{cases} u_n^{(t)} \cdot \Theta_t, & \text{if incorrect} \Rightarrow y_n g_t(x_n) = -1 \\ u_n^{(t)} / \Theta_t, & \text{if correct} \Rightarrow y_n g_t(x_n) = 1 \end{cases}$$
Here $u$ represents how many times the same data point is drawn, and $\Theta_t = \sqrt{\frac{1 - \varepsilon_t}{\varepsilon_t}}$, where $\varepsilon_t$ is the error rate.
This can be simplified further. The exponent $-y_n g_t(x_n)$ equals $+1$ for a misclassified point and $-1$ for a correctly classified one, so both cases collapse into $$u_n^{(t+1)} = u_n^{(t)} \cdot \Theta_t^{-y_n g_t(x_n)}$$
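As a rough illustration of this update, the sketch below applies one round of reweighting in both the two-case form and the compact exponent form and checks that they agree. The function names and the toy labels and predictions are hypothetical, and the error rate is assumed here to be the weighted fraction of misclassified points.

```python
import math

def scaling_factor(error_rate):
    """Theta_t = sqrt((1 - eps_t) / eps_t); requires 0 < eps_t < 1."""
    return math.sqrt((1.0 - error_rate) / error_rate)

def update_piecewise(weights, labels, predictions, theta):
    """Multiply by theta on a mistake (y * g == -1), divide by theta otherwise."""
    return [u * theta if y * g == -1 else u / theta
            for u, y, g in zip(weights, labels, predictions)]

def update_compact(weights, labels, predictions, theta):
    """The same update written as u * theta ** (-y * g)."""
    return [u * theta ** (-y * g)
            for u, y, g in zip(weights, labels, predictions)]

# Toy data (hypothetical): labels and weak-learner outputs in {-1, +1}.
labels = [1, 1, -1, -1]
predictions = [1, -1, -1, -1]      # the point at index 1 is misclassified
weights = [1.0 / len(labels)] * len(labels)

# Assumption: eps_t is the weighted error rate under the current weights.
wrong = sum(u for u, y, g in zip(weights, labels, predictions) if y != g)
eps = wrong / sum(weights)
theta = scaling_factor(eps)

new_piecewise = update_piecewise(weights, labels, predictions, theta)
new_compact = update_compact(weights, labels, predictions, theta)
print(new_piecewise)
print(new_compact)
```

With this choice of $\Theta_t$, the misclassified point gains weight and the correctly classified points lose weight, which is what pushes the next weak learner toward the earlier mistakes.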
Because $\alpha_t = \ln \Theta_t = \ln \sqrt{\frac{1 - \varepsilon_t}{\varepsilon_t}}$, we have $\Theta_t = e^{\alpha_t}$.
Therefore, with the initial weights $u_n^{(1)} = \frac{1}{N}$, $$u_n^{(T+1)} = u_n^{(1)} \cdot \prod_{t=1}^{T} e^{-y_n \alpha_t g_t(x_n)} = \frac{1}{N} \cdot e^{-y_n \sum_{t=1}^{T} \alpha_t g_t(x_n)}$$
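The identity can be checked numerically. The sketch below runs the round-by-round update for a few rounds and compares the result with the closed form $\frac{1}{N} e^{-y_n \sum_t \alpha_t g_t(x_n)}$. The weak-learner outputs are hard-coded toy values rather than trained models, the function names are hypothetical, and the error rate is again assumed to be the weighted error under the current weights.

```python
import math

def iterated_weights(labels, round_predictions):
    """Apply u <- u * theta ** (-y * g) once per round, starting from 1/N."""
    n = len(labels)
    weights = [1.0 / n] * n
    alphas = []
    for preds in round_predictions:
        # Assumption: eps_t is the weighted error rate of this round.
        wrong = sum(u for u, y, g in zip(weights, labels, preds) if y != g)
        eps = wrong / sum(weights)
        theta = math.sqrt((1.0 - eps) / eps)
        alphas.append(math.log(theta))          # alpha_t = ln(Theta_t)
        weights = [u * theta ** (-y * g)
                   for u, y, g in zip(weights, labels, preds)]
    return weights, alphas

def closed_form_weights(labels, round_predictions, alphas):
    """Compute (1/N) * exp(-y_n * sum_t alpha_t * g_t(x_n)) directly."""
    n = len(labels)
    result = []
    for i, y in enumerate(labels):
        score = sum(a * preds[i] for a, preds in zip(alphas, round_predictions))
        result.append(math.exp(-y * score) / n)
    return result

# Toy data (hypothetical): each row is one weak learner's output on all points.
labels = [1, 1, -1, -1]
round_predictions = [
    [1, -1, -1, -1],
    [1, 1, 1, -1],
    [-1, 1, -1, -1],
]

final_weights, alphas = iterated_weights(labels, round_predictions)
direct_weights = closed_form_weights(labels, round_predictions, alphas)
print(final_weights)
print(direct_weights)
```

The two lists should agree up to floating-point rounding, since each round multiplies the weight by $e^{-y_n \alpha_t g_t(x_n)}$ and the product of exponentials is the exponential of the sum.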