Logistic Regression

Purpose: classification or regression? Logistic regression is a classic binary classification algorithm!

Choosing a machine learning algorithm: try logistic regression first, then move on to more complex models; keep things as simple as possible.

Decision boundary of logistic regression: it can be nonlinear (for example, when polynomial features are added).

Sigmoid function
Formula:
$$h_{\theta}(x) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}}$$

The input can be any real number; the output lies in the interval [0, 1].

Explanation: the sigmoid maps any input into [0, 1]. Linear regression gives us a raw predicted value; feeding that value into the sigmoid converts it into a probability, which is exactly what a classification task needs.
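A minimal sketch of that mapping (function name and test values are illustrative, not from the original post):

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array of them) into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Large negative inputs map near 0, large positive inputs near 1, and 0 maps to 0.5.
print(sigmoid(np.array([-10.0, 0.0, 10.0])))   # ≈ [4.5e-05, 0.5, 0.99995]
```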

Prediction function:
$$h_{\theta}(x) = g(\theta^{T}x) = \frac{1}{1 + e^{-\theta^{T}x}}, \qquad \text{where } \theta_{0} + \theta_{1}x_{1} + \cdots + \theta_{n}x_{n} = \sum_{i=0}^{n}\theta_{i}x_{i} = \theta^{T}x \quad (x_{0} = 1)$$
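A small numeric illustration (assuming NumPy; the values are made up, and `x` already carries the leading 1 for the intercept $\theta_0$):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([0.5, 1.2, -0.7])   # theta_0 (intercept), theta_1, theta_2
x = np.array([1.0, 2.0, 3.0])        # x_0 = 1 so theta_0 acts as the bias term

score = theta @ x                    # theta^T x = 0.5 + 1.2*2.0 - 0.7*3.0 = 0.8
print(sigmoid(score))                # h_theta(x) ≈ 0.69, read as P(y=1 | x)
```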
Classification task:
$$\begin{cases} P(y=1\mid x;\theta) = h_{\theta}(x)\\ P(y=0\mid x;\theta) = 1 - h_{\theta}(x) \end{cases}$$
Combined into a single expression:
$$P(y\mid x;\theta) = (h_{\theta}(x))^{y}\,(1-h_{\theta}(x))^{1-y}$$
Explanation: for the two labels 0 and 1, setting y = 0 leaves only the factor $(1-h_{\theta}(x))^{1-y}$, and setting y = 1 leaves only $(h_{\theta}(x))^{y}$, so one formula covers both cases.

Likelihood function:
$$L(\theta) = \prod_{i=1}^{m} P(y_{i}\mid x_{i};\theta) = \prod_{i=1}^{m} (h_{\theta}(x_{i}))^{y_{i}}\,(1-h_{\theta}(x_{i}))^{1-y_{i}}$$
Log-likelihood:
$$l(\theta) = \log L(\theta) = \sum_{i=1}^{m}\Bigl(y_{i}\log h_{\theta}(x_{i}) + (1-y_{i})\log\bigl(1-h_{\theta}(x_{i})\bigr)\Bigr)$$
Maximizing this directly would call for gradient ascent. Introducing $J(\theta) = -\frac{1}{m}\,l(\theta)$ turns it into a minimization task that can be solved with (mini-batch) gradient descent.
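A minimal sketch of this cost (assuming NumPy, a bias column already inside `X`, and 0/1 labels in `y`; the small epsilon clip is an added numerical-stability detail, not part of the derivation above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, eps=1e-12):
    """J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )."""
    m = len(y)
    h = sigmoid(X @ theta)
    h = np.clip(h, eps, 1.0 - eps)   # keep log() away from 0
    return -np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h)) / m
```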

Derivation of the gradient:
$$\begin{aligned}
\frac{\partial}{\partial\theta_{j}}J(\theta)
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\frac{1}{h_{\theta}(x_{i})}\frac{\partial}{\partial\theta_{j}}h_{\theta}(x_{i}) - (1-y_{i})\frac{1}{1-h_{\theta}(x_{i})}\frac{\partial}{\partial\theta_{j}}h_{\theta}(x_{i})\right)\\
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\frac{1}{g(\theta^{T}x_{i})} - (1-y_{i})\frac{1}{1-g(\theta^{T}x_{i})}\right)\frac{\partial}{\partial\theta_{j}}g(\theta^{T}x_{i})\\
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y_{i}\frac{1}{g(\theta^{T}x_{i})} - (1-y_{i})\frac{1}{1-g(\theta^{T}x_{i})}\right)g(\theta^{T}x_{i})\bigl(1-g(\theta^{T}x_{i})\bigr)\frac{\partial}{\partial\theta_{j}}\theta^{T}x_{i}\\
&= -\frac{1}{m}\sum_{i=1}^{m}\Bigl(y_{i}\bigl(1-g(\theta^{T}x_{i})\bigr) - (1-y_{i})\,g(\theta^{T}x_{i})\Bigr)x_{i}^{j}\\
&= -\frac{1}{m}\sum_{i=1}^{m}\bigl(y_{i} - g(\theta^{T}x_{i})\bigr)x_{i}^{j}\\
&= \frac{1}{m}\sum_{i=1}^{m}\bigl(h_{\theta}(x_{i}) - y_{i}\bigr)x_{i}^{j}
\end{aligned}$$
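The step from the second to the third line uses the standard sigmoid derivative identity, which is worth writing out once:
$$g'(z) = \frac{d}{dz}\,\frac{1}{1+e^{-z}} = \frac{e^{-z}}{(1+e^{-z})^{2}} = g(z)\bigl(1-g(z)\bigr)$$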
Parameter update (with learning rate $\alpha$):
$$\theta_{j} := \theta_{j} - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\bigl(h_{\theta}(x_{i}) - y_{i}\bigr)\,x_{i}^{j}$$
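Putting the gradient and the update rule together, a bare-bones full-batch version of the loop might look like this (mini-batching would simply sample rows of `X` each step; names and data are made up for illustration, not the post's own code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, alpha=0.1, n_iters=1000):
    """Gradient descent on the logistic regression cost.

    X: (m, n) matrix whose first column is all ones (the bias term).
    y: (m,) vector of 0/1 labels.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        h = sigmoid(X @ theta)        # predictions h_theta(x_i)
        grad = X.T @ (h - y) / m      # (1/m) * sum_i (h_theta(x_i) - y_i) * x_i^j
        theta -= alpha * grad         # theta_j := theta_j - alpha * grad_j
    return theta

# Tiny usage example with two linearly separable points per class.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0, 0, 1, 1])
theta = fit_logistic_regression(X, y)
print(sigmoid(X @ theta))   # low probabilities for the class-0 rows, high for class 1
```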
Softmax for multi-class classification:
$$h_{\theta}(x^{(i)}) = \begin{bmatrix} p(y^{(i)}=1\mid x^{(i)};\theta)\\ p(y^{(i)}=2\mid x^{(i)};\theta)\\ \vdots\\ p(y^{(i)}=k\mid x^{(i)};\theta) \end{bmatrix} = \frac{1}{\sum_{j=1}^{k} e^{\theta_{j}^{T}x^{(i)}}} \begin{bmatrix} e^{\theta_{1}^{T}x^{(i)}}\\ e^{\theta_{2}^{T}x^{(i)}}\\ \vdots\\ e^{\theta_{k}^{T}x^{(i)}} \end{bmatrix}$$
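A minimal sketch of that softmax mapping (the max-subtraction is an assumed numerical-stability trick, not part of the formula above; the scores are made up):

```python
import numpy as np

def softmax(scores):
    """Turn k class scores theta_j^T x into k probabilities that sum to 1."""
    scores = scores - np.max(scores)   # stabilize exp() without changing the result
    exp_scores = np.exp(scores)
    return exp_scores / np.sum(exp_scores)

# Example: scores for 3 classes; the largest score gets the largest probability.
print(softmax(np.array([2.0, 1.0, 0.1])))   # ≈ [0.659, 0.242, 0.099]
```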
Summary: logistic regression really is a very good and very practical algorithm.
