Logistic Regression: Notes

Exponential family distributions

$$
P(y;\eta)=b(y)\exp\left[\eta^T T(y)-a(\eta)\right]
$$
where $\eta$ is the natural parameter, $T(y)$ is the sufficient statistic, and $a(\eta)$ is the log-partition function, so that $e^{-a(\eta)}$ acts as the normalizing factor.

Modeling the binomial distribution

Logistic regression assumes the target variable $y\in\{0,1\}$ follows a binomial distribution with a single trial (a Bernoulli distribution) whose success probability is $\phi$:
$$
\begin{aligned}
P(y;\phi)&=\phi^y(1-\phi)^{1-y}\\
&=\exp\left[y\log\phi+(1-y)\log(1-\phi)\right]\\
&=\exp\left[\left(\log\frac{\phi}{1-\phi}\right)y+\log(1-\phi)\right]
\end{aligned}
$$
Matching this against the exponential family form gives $b(y)=1$, $T(y)=y$, $a(\eta)=-\log(1-\phi)$, and $\log\frac{\phi}{1-\phi}=\eta$. Solving the last equation for $\phi$ yields $\phi=\frac{1}{1+e^{-\eta}}$. This is where the sigmoid function comes from: we transform $\eta$ with the sigmoid and use the result as the probability parameter of the binomial distribution. We write
$$
\mathrm{sigmoid}(x)=\sigma(x)=\frac{1}{1+e^{-x}}
$$
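The relationship between the log-odds and the sigmoid can be checked numerically. The sketch below is illustrative only: the function names (`logit`, `bernoulli_pmf`, `bernoulli_expfam`) are made up for this note, and it assumes $a(\eta)=\log(1+e^{\eta})$, which is $-\log(1-\phi)$ rewritten in terms of $\eta$.

```python
import math

def sigmoid(x: float) -> float:
    """sigma(x) = 1 / (1 + exp(-x)), written to avoid overflow for very negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def logit(phi: float) -> float:
    """Natural parameter eta = log(phi / (1 - phi))."""
    return math.log(phi / (1.0 - phi))

def bernoulli_pmf(y: int, phi: float) -> float:
    """Direct form: phi^y * (1 - phi)^(1 - y)."""
    return phi ** y * (1.0 - phi) ** (1 - y)

def bernoulli_expfam(y: int, eta: float) -> float:
    """Exponential family form with b(y) = 1, T(y) = y, a(eta) = log(1 + e^eta)."""
    a = math.log1p(math.exp(eta))
    return math.exp(eta * y - a)

for phi in (0.1, 0.5, 0.9):
    eta = logit(phi)
    # sigmoid undoes the log-odds transform
    assert math.isclose(sigmoid(eta), phi)
    # both parameterizations give the same probabilities
    for y in (0, 1):
        assert math.isclose(bernoulli_pmf(y, phi), bernoulli_expfam(y, eta))
```

The branch on the sign of `x` in `sigmoid` is a common stability trick and is not part of the derivation itself.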

sigmoid

The derivative of the sigmoid function is
$$
\begin{aligned}
\frac{d\sigma(x)}{dx}&=\frac{e^{-x}}{(1+e^{-x})^2}\\
&=\frac{1}{1+e^{-x}}\cdot\frac{e^{-x}}{1+e^{-x}}\\
&=\frac{1}{1+e^{-x}}\cdot\frac{(1+e^{-x})-1}{1+e^{-x}}\\
&=\frac{1}{1+e^{-x}}\left(1-\frac{1}{1+e^{-x}}\right)\\
&=\sigma(x)(1-\sigma(x))
\end{aligned}
$$
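As a sanity check, the closed form $\sigma(x)(1-\sigma(x))$ can be compared with a central finite difference. This is only a sketch: the step size `h` and the test points are arbitrary choices, and `central_difference` is a helper introduced here for illustration.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Closed-form derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical derivative (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

points = (-3.0, -1.0, 0.0, 0.5, 2.0)
max_error = max(abs(sigmoid_grad(x) - central_difference(sigmoid, x)) for x in points)
print(f"largest gap between closed form and finite difference: {max_error:.2e}")
```

The derivative peaks at $x=0$, where it equals $0.25$, and shrinks toward zero as $|x|$ grows.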

Parameter estimation

Here we make one more assumption, $\eta=\theta^Tx$, and estimate the parameter $\theta$ by maximum likelihood. Writing $\sigma(x)$ as shorthand for $\sigma(\theta^Tx)$, the likelihood is
$$
L(\theta)=\prod_{i=1}^{n}p(y^{(i)};\theta)=\prod_{i=1}^{n}\sigma(x^{(i)})^{y^{(i)}}\left(1-\sigma(x^{(i)})\right)^{1-y^{(i)}}
$$
Taking the logarithm gives the log-likelihood function
$$
\begin{aligned}
\ell(\theta)&=\sum_{i=1}^{n}\log p(y^{(i)};\theta)\\
&=\sum_{i=1}^{n}\log\left[\sigma(x^{(i)})^{y^{(i)}}\left(1-\sigma(x^{(i)})\right)^{1-y^{(i)}}\right]\\
&=\sum_{i=1}^{n}\left[y^{(i)}\log\sigma(x^{(i)})+(1-y^{(i)})\log\left(1-\sigma(x^{(i)})\right)\right]
\end{aligned}
$$
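To make the likelihood and log-likelihood concrete, here is a small sketch that evaluates both on a made-up dataset. The data, the value of `theta`, and the convention that the first feature is a constant 1 (a bias term) are all assumptions for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v) -> float:
    return sum(a * b for a, b in zip(u, v))

def likelihood(theta, xs, ys) -> float:
    """Product over samples of sigma^y * (1 - sigma)^(1 - y)."""
    prod = 1.0
    for x, y in zip(xs, ys):
        p = sigmoid(dot(theta, x))
        prod *= p ** y * (1.0 - p) ** (1 - y)
    return prod

def log_likelihood(theta, xs, ys) -> float:
    """Sum over samples of y*log(sigma) + (1 - y)*log(1 - sigma)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(dot(theta, x))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical toy data: each x is [1.0, feature], labels are 0 or 1.
xs = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, -0.3]]
ys = [1, 0, 1, 0]
theta = [0.1, 0.8]

# The log of the product equals the sum of the logs.
assert math.isclose(math.log(likelihood(theta, xs, ys)), log_likelihood(theta, xs, ys))
```

In practice the sum form is preferred because the product of many probabilities underflows quickly.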
Next, take the partial derivative of $\ell(\theta)$ with respect to $\theta$. For a single sample $(x,y)$, dropping the superscripts, the chain rule gives
$$
\begin{aligned}
\frac{\partial\ell(\theta)}{\partial\theta}&=\frac{\partial\ell(\theta)}{\partial\sigma(x)}\frac{\partial\sigma(x)}{\partial\theta^Tx}\frac{\partial\theta^Tx}{\partial\theta}\\
&=\left(\frac{y}{\sigma(x)}-\frac{1-y}{1-\sigma(x)}\right)\left[\sigma(x)\left(1-\sigma(x)\right)\right]x\\
&=\left[y\left(1-\sigma(x)\right)-(1-y)\sigma(x)\right]x\\
&=\left[y-y\sigma(x)-\sigma(x)+y\sigma(x)\right]x\\
&=\left(y-\sigma(x)\right)x
\end{aligned}
$$
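The per-sample gradient $(y-\sigma(\theta^Tx))x$ can be verified against finite differences of the per-sample log-likelihood. The sketch below uses an arbitrary `theta`, `x`, and `y` chosen for illustration; the helper names are not from any library.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v) -> float:
    return sum(a * b for a, b in zip(u, v))

def sample_log_likelihood(theta, x, y) -> float:
    p = sigmoid(dot(theta, x))
    return y * math.log(p) + (1 - y) * math.log(1.0 - p)

def analytic_gradient(theta, x, y):
    """Closed form from the derivation: (y - sigma(theta^T x)) * x."""
    residual = y - sigmoid(dot(theta, x))
    return [residual * xj for xj in x]

def numeric_gradient(theta, x, y, h: float = 1e-6):
    """Central difference in each coordinate of theta."""
    grad = []
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += h
        minus[j] -= h
        diff = sample_log_likelihood(plus, x, y) - sample_log_likelihood(minus, x, y)
        grad.append(diff / (2.0 * h))
    return grad

theta = [0.2, -0.5, 1.0]   # arbitrary parameters
x = [1.0, 2.0, -1.5]       # arbitrary sample, first entry acts as a bias feature
y = 1

analytic = analytic_gradient(theta, x, y)
numeric = numeric_gradient(theta, x, y)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
print(f"largest gap between analytic and numeric gradient: {max_gap:.2e}")
```

Note that the gradient is just the prediction error $y-\sigma(\theta^Tx)$ scaled by the input, so confidently correct predictions contribute almost nothing.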
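We use stochastic gradient descent, that is, each step computes the partial derivative from a single sample $\left(x^{(i)},y^{(i)}\right)$. Because the goal is to maximize $\ell(\theta)$ (equivalently, to minimize $-\ell(\theta)$), the step moves along the gradient of $\ell$, so with learning rate $\alpha$ the update rule for $\theta$ is
$$
\theta:=\theta+\alpha\left(y^{(i)}-\sigma(x^{(i)})\right)x^{(i)}
$$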
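A minimal sketch of this update rule as a training loop follows. The dataset, learning rate, epoch count, and the name `sgd_fit` are assumptions for illustration. Samples are visited in a fixed order to keep the run deterministic, whereas practical implementations usually shuffle them each epoch.

```python
import math

def sigmoid(z: float) -> float:
    # Split on the sign of z so exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(u, v) -> float:
    return sum(a * b for a, b in zip(u, v))

def sgd_fit(xs, ys, alpha: float = 0.1, epochs: int = 100):
    """Apply theta := theta + alpha * (y - sigma(theta^T x)) * x, one sample at a time."""
    theta = [0.0] * len(xs[0])
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            residual = y - sigmoid(dot(theta, x))
            theta = [t + alpha * residual * xj for t, xj in zip(theta, x)]
    return theta

# Hypothetical toy data: x = [1.0 (bias), feature]; negative features are class 0.
xs = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
ys = [0, 0, 1, 1]

theta = sgd_fit(xs, ys)
predictions = [1 if sigmoid(dot(theta, x)) > 0.5 else 0 for x in xs]
print("theta:", theta)
print("predictions:", predictions)
```

This toy dataset is linearly separable, so the weights keep growing with more epochs. On real data one would typically add regularization or a stopping criterion.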
