Algorithm Deep Dive: Boosting Tree Algorithms (Part 3)

Binary classification

For binary classification, the original paper uses the log loss:
$$L(y,F) = \log\left(1+\exp(-2yF)\right), \quad y \in \{-1,1\}$$
where
$$F(x) = \frac{1}{2}\log \left[\frac{\Pr(y=1|x)}{\Pr(y=-1|x)} \right]$$
Following the gradient boosting algorithm step by step, we first compute the negative gradient:
$$\tilde{y}_{i}=-\left[\frac{\partial L\left(y_{i}, F\left(x_{i}\right)\right)}{\partial F\left(x_{i}\right)}\right]_{F(x)=F_{m-1}(x)}=\frac{2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)}$$
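The negative gradient formula can be checked numerically. The following is a minimal sketch, not code from the original paper: the function names `log_loss`, `negative_gradient` and `numeric_negative_gradient` are made up for illustration, and the check uses a central finite difference with an assumed step size.

```python
import math

def log_loss(y, f):
    """L(y, F) = log(1 + exp(-2 y F)) for a label y in {-1, +1}."""
    return math.log1p(math.exp(-2.0 * y * f))

def negative_gradient(y, f):
    """Closed-form pseudo-residual: 2 y / (1 + exp(2 y F))."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f))

def numeric_negative_gradient(y, f, eps=1e-6):
    """Central finite difference of -dL/dF (eps is an assumed step size)."""
    return -(log_loss(y, f + eps) - log_loss(y, f - eps)) / (2.0 * eps)

# Compare the two on a few hypothetical (label, score) pairs.
for y in (-1, 1):
    for f in (-1.5, 0.0, 0.7):
        exact = negative_gradient(y, f)
        approx = numeric_negative_gradient(y, f)
        print(y, f, round(exact, 6), round(approx, 6))
```

Note that the pseudo-residual always has the same sign as the label and its magnitude lies strictly between 0 and 2, which is what makes the factor $2-|\tilde{y}_i|$ in the leaf formula positive.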

Next, estimate the value of each leaf node:
$$\gamma_{jm}=\operatorname{argmin}_{\gamma} \sum_{x_{i} \in R_{jm}} \log \left(1+\exp \left(-2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right)$$
This minimization has no closed-form solution, so the original paper approximates it with the Newton-Raphson method:
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)}$$
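As a rough illustration of the leaf estimate, the sketch below computes the approximate leaf value from the pseudo-residuals of the samples that fall into one leaf. The name `leaf_value` and the zero-denominator guard are assumptions added for this example, not part of the paper.

```python
def leaf_value(residuals):
    """Approximate leaf value: sum(r) / sum(|r| * (2 - |r|)).

    `residuals` holds the pseudo-residuals of the samples in one leaf.
    """
    numerator = sum(residuals)
    denominator = sum(abs(r) * (2.0 - abs(r)) for r in residuals)
    # Assumed guard: in exact arithmetic every |r| is in (0, 2), so the
    # denominator is positive, but floating-point underflow could zero it.
    if denominator == 0.0:
        return 0.0
    return numerator / denominator

# Hypothetical leaf containing three samples.
print(leaf_value([1.2, 0.4, -0.3]))
```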

How to set the initial value

In gradient boosted trees, the initial value is the constant that minimizes the total loss:
$$F_0(x) = \operatorname{argmin}_{F} \sum_{i=1}^N L(y_i,F)$$
We differentiate the total loss with respect to $F$ and set the derivative to zero to find the extremum:
$$\begin{aligned} &\frac{\partial \sum_{i=1}^{N} L\left(y_{i}, F\right)}{\partial F}=0\\ &\sum_{i=1}^{N} \frac{\left(-2 y_{i}\right) e^{-2 y_{i} F}}{e^{-2 y_{i} F}+1}=0 \end{aligned}$$
Since this is binary classification, $y_i$ is either $1$ or $-1$. Splitting the sum by label and multiplying both sides by $-1$ gives
$$\sum_{i:y_i=1} \frac{2e^{-2F}}{e^{-2F}+1} + \sum_{i:y_i=-1} \frac{-2e^{2F}}{e^{2F}+1} = 0$$
Multiplying the numerator and denominator of the first term by $e^{2F}$ makes the denominators match:
$$\sum_{i:y_i=1} \frac{2}{e^{2F}+1} + \sum_{i:y_i=-1} \frac{-2e^{2F}}{e^{2F}+1} = 0$$
Let the number of positive samples be $m$ and the number of negative samples be $n$. Then:
$$m-ne^{2F} = 0$$
$$e^{2F} = \frac{m}{n} = \frac{1+\frac{m-n}{m+n}}{1-\frac{m-n}{m+n}} = \frac{1+\bar{y}}{1-\bar{y}}$$
Here $m+n$ is the total number of samples and $m-n$ is the sum of the $y_i$, so $\bar{y} = \frac{m-n}{m+n}$ is the mean label.
Finally we obtain
$$F_0(x) = \frac{1}{2}\log \frac{1+\bar{y}}{1-\bar{y}}$$
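The closed form for the initial value is easy to sanity-check. The sketch below (the names `initial_score` and `total_negative_gradient` are illustrative assumptions) computes the initial score from the mean label and confirms that the summed negative gradient vanishes there, which is the first-order condition derived above.

```python
import math

def initial_score(labels):
    """F_0 = 0.5 * log((1 + y_bar) / (1 - y_bar)) for labels in {-1, +1}.

    Assumes both classes are present; otherwise y_bar is +1 or -1 and
    the expression is undefined.
    """
    y_bar = sum(labels) / len(labels)
    return 0.5 * math.log((1.0 + y_bar) / (1.0 - y_bar))

def total_negative_gradient(labels, f):
    """Sum over all samples of 2 y / (1 + exp(2 y F)) at a constant score F."""
    return sum(2.0 * y / (1.0 + math.exp(2.0 * y * f)) for y in labels)

# Hypothetical data set: m = 3 positives, n = 1 negative.
labels = [1, 1, 1, -1]
f0 = initial_score(labels)
print(f0)                                   # equals 0.5 * log(m / n)
print(total_negative_gradient(labels, f0))  # approximately zero
```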

Solving with the Newton approximation

How do we get from formula (1) to formula (2)?
$$\gamma_{jm}=\operatorname{argmin}_{\gamma} \sum_{x_{i} \in R_{jm}} \log \left(1+\exp \left(-2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right) \tag{1}$$
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)} \tag{2}$$
Newton's method is an iterative solver, and the paper takes only a single iteration. We first define:
$$g(\gamma) = \sum_{x_i \in R_{jm}} \log \left(1+\exp\left(-2y_i\left(F_{m-1}(x_i)+\gamma\right)\right)\right)$$
Then we apply Newton's method, starting the iteration from $\gamma_0 = 0$:
$$\gamma_{jm}=\gamma_{0}-\frac{g^{\prime}\left(\gamma_{0}\right)}{g^{\prime \prime}\left(\gamma_{0}\right)}=-\frac{g^{\prime}(0)}{g^{\prime \prime}(0)}$$
Next, take the first and second derivatives with respect to $\gamma$:
$$g^{\prime}(\gamma)=\sum_{x_{i} \in R_{jm}} \frac{-2 y_{i}}{1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)}$$
$$g^{\prime \prime}(\gamma)=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2} \exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)}{\left[1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right]^{2}}=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2}\left(\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)+1\right)-4 y_{i}^{2}}{\left[1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right]^{2}}$$
Recall that
$$\tilde{y}_{i}=-\left[\frac{\partial L\left(y_{i}, F\left(x_{i}\right)\right)}{\partial F\left(x_{i}\right)}\right]_{F(x)=F_{m-1}(x)}=\frac{2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)}$$
so evaluating both derivatives at $\gamma_0 = 0$ gives
$$g^{\prime}(0)=\sum_{x_{i} \in R_{jm}} \frac{-2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} = -\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}$$
$$g^{\prime \prime}(0)=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2} \left(\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)+1\right)-4y_i^2}{\left[1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)\right]^{2}}$$
$$=\sum_{x_{i} \in R_{jm}}\left[ \frac{2 \cdot 2y_i^2}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} -\tilde{y}_i^{2}\right]$$
Since $y_i$ is either $+1$ or $-1$, we have $y_i^2 = |y_i| = 1$, hence $\frac{2y_i^2}{1+\exp\left(2y_iF_{m-1}(x_i)\right)} = |\tilde{y}_i|$ and $\tilde{y}_i^2 = |\tilde{y}_i|^2$. Therefore:
$$g^{\prime \prime}(0) = \sum_{x_{i} \in R_{jm}} |\tilde{y}_i|\left(2-|\tilde{y}_i|\right)$$
Substituting $g^{\prime}(0)$ and $g^{\prime\prime}(0)$ into the Newton step yields formula (2).
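To make the equivalence concrete, the following sketch computes the Newton step directly from the first and second derivatives at $\gamma = 0$ and compares it with the ratio of pseudo-residual sums. The function names and the sample data are hypothetical; this is an illustration under those assumptions rather than a reference implementation.

```python
import math

def newton_leaf_step(labels, scores):
    """One Newton-Raphson step from gamma = 0: -g'(0) / g''(0)."""
    g1 = 0.0  # first derivative g'(0)
    g2 = 0.0  # second derivative g''(0)
    for y, f in zip(labels, scores):
        e = math.exp(2.0 * y * f)
        g1 += -2.0 * y / (1.0 + e)
        g2 += 4.0 * y * y * e / (1.0 + e) ** 2
    return -g1 / g2

def residual_leaf_value(labels, scores):
    """Same step written with pseudo-residuals: sum(r) / sum(|r| (2 - |r|))."""
    residuals = [2.0 * y / (1.0 + math.exp(2.0 * y * f))
                 for y, f in zip(labels, scores)]
    return sum(residuals) / sum(abs(r) * (2.0 - abs(r)) for r in residuals)

# Hypothetical leaf: labels y_i and previous-round scores F_{m-1}(x_i).
labels = [1, -1, 1, 1]
scores = [0.2, -0.4, -0.1, 0.9]
print(newton_leaf_step(labels, scores))
print(residual_leaf_value(labels, scores))
```

The two printed values should agree up to floating-point rounding, because $|\tilde{y}_i|(2-|\tilde{y}_i|)$ is algebraically identical to the per-sample second derivative when $y_i^2 = 1$.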

Binary classification

Once we have the final $F(x)$, how do we use it to classify? Writing $p = \Pr(y=1|x)$, the definition of $F$ is:
$$F(x) = \frac{1}{2}\log \left(\frac{p}{1-p} \right)$$
Rearranging slightly gives
$$e^{2F(x)} = \frac{p}{1-p}$$
and solving for $p$ gives
$$P_{+}(x) = p = \frac{e^{2F(x)}}{1+e^{2F(x)}} = \frac{1}{1+e^{-2F(x)}}$$
$$P_{-}(x) = 1-p = \frac{1}{1+e^{2F(x)}}$$
This completes the binary classifier: predict the class with the larger probability, which is the positive class when $F(x) > 0$.
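The conversion from score to probability can be sketched as follows. The function names are illustrative, and assigning the tie case $F(x) = 0$ to the positive class is an assumption of this example, not something the derivation fixes.

```python
import math

def positive_probability(f):
    """P_+(x) = 1 / (1 + exp(-2 F(x)))."""
    return 1.0 / (1.0 + math.exp(-2.0 * f))

def negative_probability(f):
    """P_-(x) = 1 / (1 + exp(2 F(x)))."""
    return 1.0 / (1.0 + math.exp(2.0 * f))

def predict_label(f):
    """Return +1 or -1; the tie at probability 0.5 goes to +1 (assumed)."""
    return 1 if positive_probability(f) >= 0.5 else -1

# Hypothetical final scores F(x) for three samples.
for f in (-1.0, 0.0, 1.0):
    print(f, positive_probability(f), negative_probability(f), predict_label(f))
```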
