MLlib - Optimization Module - Gradient

@(Hadoop & Spark)[machine learning|algorithm|statistics|Spark]

Topic: Gradient - LogisticGradient

Derivation


  • probability
    $$P(y=0|x,w)=\frac{1}{1+\sum_{i=1}^{K-1}\exp(x w_i)}$$

    $$P(y=1|x,w)=\frac{\exp(x w_1)}{1+\sum_{i=1}^{K-1}\exp(x w_i)}$$

    $$\cdots$$

    $$P(y=K-1|x,w)=\frac{\exp(x w_{K-1})}{1+\sum_{i=1}^{K-1}\exp(x w_i)}$$
  • loss function
    $$\begin{aligned}
    l(w,x) &= -\log P(y|x,w) = -\alpha(y)\log P(y=0|x,w) - (1-\alpha(y))\log P(y|x,w)\\
           &= \log\Big(1+\sum_{i=1}^{K-1}\exp(x w_i)\Big) - (1-\alpha(y))\, x w_{y-1}\\
           &= \log\Big(1+\sum_{i=1}^{K-1}\exp(\mathrm{margins}_i)\Big) - (1-\alpha(y))\,\mathrm{margins}_{y-1}
    \end{aligned}$$

    where $\alpha(i)=1$ if $i=0$, $\alpha(i)=0$ if $i\ne 0$, and $\mathrm{margins}_i = x w_i$. Class $0$ is the pivot, so the $(1-\alpha(y))$ term is active exactly when $y\ne 0$.
  • first derivative
    $$\frac{\partial l(w,x)}{\partial w_{ij}} = \left(\frac{\exp(x w_i)}{1+\sum_{k=1}^{K-1}\exp(x w_k)} - (1-\alpha(y))\,\delta_{y,i+1}\right)x_j = \mathrm{multiplier}_i\, x_j$$

    where $\delta_{i,j}=1$ if $i=j$, $\delta_{i,j}=0$ if $i\ne j$, and
    $$\mathrm{multiplier}_i = \frac{\exp(\mathrm{margins}_i)}{1+\sum_{k=1}^{K-1}\exp(\mathrm{margins}_k)} - (1-\alpha(y))\,\delta_{y,i+1}$$
    (a runnable sketch of these formulas follows this list)
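
To make the derivation concrete, here is a minimal, self-contained Scala sketch of the loss and multiplier computation for a single labeled example. It is an illustration written for this post, not MLlib's actual LogisticGradient code; the names `features`, `weights`, `numClasses`, and `label` are assumptions of this sketch, with `weights` holding the (K-1) x N matrix w flattened row by row and `margins` stored 0-based, so `margins(i) = x w_{i+1}`.

```scala
object MultinomialLogisticExample {

  /** Naive loss and per-class multipliers for one labeled example,
    * following the formulas above literally. An overflow-safe variant
    * appears in the next section.
    */
  def lossAndMultipliers(
      features: Array[Double], // x, length N
      weights: Array[Double],  // w: (K-1) rows of length N, flattened row by row
      numClasses: Int,         // K
      label: Int               // y in {0, ..., K-1}
  ): (Double, Array[Double]) = {
    val n = features.length

    // margins(i) = x w_{i+1}, the margin of class i+1 (class 0 is the pivot)
    val margins = Array.tabulate(numClasses - 1) { i =>
      (0 until n).map(j => features(j) * weights(i * n + j)).sum
    }

    // l(w, x) = log(1 + \sum_i exp(margins_i)) - (1 - alpha(y)) margins_{y-1};
    // alpha(y) = 0 for y != 0, so the margin term is present exactly when y != 0
    val sumExp = margins.map(math.exp).sum
    val loss =
      if (label > 0) math.log1p(sumExp) - margins(label - 1)
      else math.log1p(sumExp)

    // multiplier_i = exp(margins_i) / (1 + \sum_k exp(margins_k))
    //               - (1 - alpha(y)) * delta_{y, i+1}
    val multipliers = Array.tabulate(numClasses - 1) { i =>
      val p = math.exp(margins(i)) / (1.0 + sumExp)
      if (label != 0 && label == i + 1) p - 1.0 else p
    }

    (loss, multipliers)
  }
}
```

The full gradient for the example is then the outer product of the multiplier vector with x: the (i, j) entry is multipliers(i) * features(j), matching the first-derivative formula above.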

Arithmetic overflow
When max(margins) > 0, computing exp(margins_i) directly can overflow in double precision, so the loss function and the multiplier are rewritten as below, with maxMargin = max(margins):
$$\begin{aligned}
l(w,x) &= \log\Big(1+\sum_{i=1}^{K-1}\exp(\mathrm{margins}_i)\Big) - (1-\alpha(y))\,\mathrm{margins}_{y-1}\\
       &= \log\Big(\exp(-\mathrm{maxMargin})+\sum_{i=1}^{K-1}\exp(\mathrm{margins}_i-\mathrm{maxMargin})\Big) + \mathrm{maxMargin} - (1-\alpha(y))\,\mathrm{margins}_{y-1}\\
       &= \log(1+\mathrm{sum}) + \mathrm{maxMargin} - (1-\alpha(y))\,\mathrm{margins}_{y-1}
\end{aligned}$$

$$\mathrm{multiplier}_i = \frac{\exp(\mathrm{margins}_i)}{1+\sum_{k=1}^{K-1}\exp(\mathrm{margins}_k)} - (1-\alpha(y))\,\delta_{y,i+1} = \frac{\exp(\mathrm{margins}_i-\mathrm{maxMargin})}{1+\mathrm{sum}} - (1-\alpha(y))\,\delta_{y,i+1}$$

where
$$\mathrm{sum} = \exp(-\mathrm{maxMargin}) + \sum_{i=1}^{K-1}\exp(\mathrm{margins}_i-\mathrm{maxMargin}) - 1$$
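
Below is the same computation with the maxMargin shift applied, again a sketch under the same assumed 0-based `margins` layout as the previous example. The shift is exact because $1+\mathrm{sum} = \exp(-\mathrm{maxMargin})\big(1+\sum_k \exp(\mathrm{margins}_k)\big)$, so dividing $\exp(\mathrm{margins}_i-\mathrm{maxMargin})$ by $(1+\mathrm{sum})$ reproduces the original multiplier, while every argument passed to exp stays at or below 0 once maxMargin > 0.

```scala
object StableMultinomialLogistic {

  /** Overflow-safe loss and multipliers: when max(margins) > 0, every
    * exponent is shifted by maxMargin so exp never sees a large positive
    * argument; when max(margins) <= 0 it reduces to the naive formulas.
    */
  def lossAndMultipliers(
      margins: Array[Double], // margins(i) = x w_{i+1}, length K-1
      label: Int              // y in {0, ..., K-1}
  ): (Double, Array[Double]) = {
    val maxMargin = math.max(margins.max, 0.0)

    // sum = exp(-maxMargin) + \sum_i exp(margins_i - maxMargin) - 1
    // (equal to \sum_i exp(margins_i) when maxMargin == 0)
    val sum = margins.map(m => math.exp(m - maxMargin)).sum +
      math.exp(-maxMargin) - 1.0

    // l(w, x) = log(1 + sum) + maxMargin - (1 - alpha(y)) margins_{y-1}
    val loss =
      if (label > 0) math.log1p(sum) + maxMargin - margins(label - 1)
      else math.log1p(sum) + maxMargin

    // multiplier_i = exp(margins_i - maxMargin) / (1 + sum)
    //               - (1 - alpha(y)) * delta_{y, i+1}
    val multipliers = Array.tabulate(margins.length) { i =>
      val p = math.exp(margins(i) - maxMargin) / (1.0 + sum)
      if (label != 0 && label == i + 1) p - 1.0 else p
    }

    (loss, multipliers)
  }
}
```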
Reference

Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition (available at http://statweb.stanford.edu/~tibs/ElemStatLearn/). Eq. (4.17) on page 119 gives the formula of the multinomial logistic regression model.
