Machine Learning - Basic Algorithms II: Regression Practice

https://blog.csdn.net/fan2312/article/details/100854485

Regression

  • Mean squared error: $MSE = \frac{1}{m}\sum_{i=1}^{m}(y_i-\hat{y}_i)^2$

  • Root mean squared error (standard error): $RMSE = \sqrt{MSE}$

  • Total sum of squares: $TSS = \sum_{i=1}^{m}(y_i-\overline{y})^2$

  • Pseudo-variance: $Var(Y) = TSS/m$

  • Residual sum of squares: $RSS = \sum_{i=1}^{m}(\hat{y}_i - y_i)^2$

    • Also known as the sum of squared errors (SSE)
  • $R^2 = \frac{TSS-RSS}{TSS} = 1-\frac{RSS}{TSS}$

    • The larger $R^2$ is, the better the fit
    • The optimal value of $R^2$ is 1; if the model predicts random values, $R^2$ can be negative
    • If the prediction is always the sample mean, $R^2$ is 0
  • Explained (regression) sum of squares: $ESS = \sum_{i=1}^{m}(\hat{y}_i-\overline{y})^2$

    • $TSS \geq ESS + RSS$
    • $TSS = ESS + RSS$ holds only for an unbiased estimate (a numeric sketch of these metrics follows below)
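
A minimal sketch of these metrics in plain NumPy, assuming `y_true` and `y_pred` are 1-D arrays (the function name and signature are illustrative, not from the original post):

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, TSS, RSS, ESS and R^2 for one prediction vector."""
    y_bar = y_true.mean()
    mse = np.mean((y_true - y_pred) ** 2)
    rmse = np.sqrt(mse)
    tss = np.sum((y_true - y_bar) ** 2)   # total sum of squares
    rss = np.sum((y_pred - y_true) ** 2)  # residual sum of squares (SSE)
    ess = np.sum((y_pred - y_bar) ** 2)   # explained sum of squares
    r2 = 1.0 - rss / tss
    return {"MSE": mse, "RMSE": rmse, "TSS": tss, "RSS": rss, "ESS": ess, "R2": r2}

# Predicting the sample mean everywhere gives R^2 = 0,
# and a sufficiently bad predictor drives R^2 negative.
y = np.array([1.0, 2.0, 3.0, 4.0])
print(regression_metrics(y, np.full_like(y, y.mean()))["R2"])  # 0.0
```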
  • Locally weighted linear regression (LWR); see the sketch after this list

    • Objective function: $J(\theta)=\sum_{i=1}^{m}w^{(i)}(y^{(i)}-\theta^{T}x^{(i)})^{2}$
    • Setting the weights:
      • Gaussian kernel
        • $w^{(i)}=\exp\left(-\frac{(x^{(i)}-x)^{2}}{2\tau^{2}}\right)$
        • $\tau$ is called the bandwidth; it controls how fast a training sample's weight decays as its distance from the query point $x$ grows
      • Polynomial kernel:
        • $\kappa(x_1,x_2)=(\langle x_1,x_2\rangle+R)^{d}$
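
A minimal sketch of LWR under these definitions: for each query point we solve a weighted least-squares problem with Gaussian weights. The closed-form weighted normal equation used here is an assumption of this sketch; the outline above does not fix a solver.

```python
import numpy as np

def lwr_predict(X, y, x_query, tau=0.5):
    """Locally weighted linear regression at a single query point.

    X: (m, n) design matrix (add a bias column yourself if desired),
    y: (m,) targets, x_query: (n,) query point, tau: kernel bandwidth.
    """
    # Gaussian kernel weights: w_i = exp(-||x_i - x||^2 / (2 tau^2))
    d2 = np.sum((X - x_query) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * tau ** 2))
    W = np.diag(w)
    # Weighted normal equation: theta = (X^T W X)^{-1} X^T W y
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return x_query @ theta
```

Note how the bandwidth behaves: a smaller `tau` makes the weights decay faster, so the fit becomes more local (and more prone to noise); a large `tau` approaches plain linear regression.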
  • Logistic regression (sigmoid)

    • $h_\theta(x)=g(\theta^{T}x)=\frac{1}{1+e^{-\theta^{T}x}}$
    • $g'(x)=\left(\frac{1}{1+e^{-x}}\right)'=g(x)\cdot(1-g(x))$
    • Parameter estimation
      • Assume
        • $P(y=1|x;\theta)=h_\theta(x)$
        • $P(y=0|x;\theta)=1-h_\theta(x)$
      • $p(y|x;\theta)=(h_\theta(x))^{y}(1-h_\theta(x))^{1-y}$
      • Likelihood: $L(\theta)=p(\vec{y}\,|X;\theta)=\prod_{i=1}^{m}p(y^{(i)}|x^{(i)};\theta)=\prod_{i=1}^{m}(h_\theta(x^{(i)}))^{y^{(i)}}(1-h_\theta(x^{(i)}))^{1-y^{(i)}}$
      • $\ell(\theta)=\log L(\theta)=\sum_{i=1}^{m}\left[y^{(i)}\log h(x^{(i)})+(1-y^{(i)})\log(1-h(x^{(i)}))\right]$
      • $\frac{\partial \ell(\theta)}{\partial\theta_j}=\sum_{i=1}^{m}(y^{(i)}-g(\theta^{T}x^{(i)}))\cdot x_j^{(i)}$
      • Parameter update (stochastic, per sample; a sketch follows below)
        • $\theta_j:=\theta_j+\alpha(y^{(i)}-h_\theta(x^{(i)}))x_j^{(i)}$
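
A minimal sketch of the stochastic update above, assuming labels in {0, 1} and a fixed learning rate `alpha` (both the function name and hyperparameters are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_sgd(X, y, alpha=0.1, epochs=100):
    """Stochastic gradient ascent on the log-likelihood l(theta).

    X: (m, n) features, y: (m,) labels in {0, 1}.
    Per-sample update: theta_j += alpha * (y_i - h_theta(x_i)) * x_ij
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        for i in range(m):
            h = sigmoid(X[i] @ theta)      # h_theta(x_i)
            theta += alpha * (y[i] - h) * X[i]
    return theta
```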
    • Used for classification
    • Loss (with $\pm 1$ labels)
      • $y_i\in\{-1,1\}$
      • $\hat{y}_i=\begin{cases}p_i & y_i=1\\ 1-p_i & y_i=-1\end{cases}$
      • Likelihood: $L(\theta)=\prod_{i=1}^{m}p_i^{(y_i+1)/2}(1-p_i)^{-(y_i-1)/2}$
      • $\ln L(\theta)\Rightarrow \ell(\theta)=\sum_{i=1}^{m}\ln\left[p_i^{(y_i+1)/2}(1-p_i)^{-(y_i-1)/2}\right]$; substituting $p_i=\frac{1}{1+e^{-f_i}}$ (and hence $1-p_i=\frac{1}{1+e^{f_i}}$) gives $\ell(\theta)=\sum_{i=1}^{m}\ln\left[\left(\frac{1}{1+e^{-f_i}}\right)^{(y_i+1)/2}\left(\frac{1}{1+e^{f_i}}\right)^{-(y_i-1)/2}\right]$
      • $loss(y_i,\hat{y}_i)=-\ell(\theta)=\sum_{i=1}^{m}\left[\frac{1}{2}(y_i+1)\ln(1+e^{-f_i})-\frac{1}{2}(y_i-1)\ln(1+e^{f_i})\right]=\begin{cases}\sum_{i=1}^{m}\ln(1+e^{-f_i}) & y_i=1\\ \sum_{i=1}^{m}\ln(1+e^{f_i}) & y_i=-1\end{cases}\Rightarrow loss(y_i,\hat{y}_i)=\sum_{i=1}^{m}\ln(1+e^{-y_i\cdot f_i})$
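
A quick numeric check (a sketch, not part of the original derivation) that the piecewise loss above collapses to the unified margin form $\sum_i \ln(1+e^{-y_i f_i})$:

```python
import numpy as np

def loss_piecewise(y, f):
    # ln(1 + e^{-f}) when y = 1, ln(1 + e^{f}) when y = -1
    return np.where(y == 1, np.log1p(np.exp(-f)), np.log1p(np.exp(f))).sum()

def loss_margin(y, f):
    # the unified form: ln(1 + e^{-y * f})
    return np.log1p(np.exp(-y * f)).sum()

y = np.array([1, -1, 1, -1])
f = np.array([2.0, -0.5, 0.1, 3.0])   # linear scores f_i
assert np.isclose(loss_piecewise(y, f), loss_margin(y, f))
```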
  • Log-linear model

    • The odds of an event is the ratio of the probability that the event occurs to the probability that it does not
    • Log-odds: the logit function
      • $P(y=1|x;\theta)=h_\theta(x)$
      • $P(y=0|x;\theta)=1-h_\theta(x)$
      • $\mathrm{logit}(p)=\log\frac{p}{1-p}=\log\frac{h_\theta(x)}{1-h_\theta(x)}=\theta^{T}x$
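
As a one-line sanity check (a sketch): the logit is the inverse of the sigmoid, so applying it to $h_\theta(x)$ recovers the linear score $\theta^T x$:

```python
import numpy as np

theta_x = 1.7                          # some linear score theta^T x
p = 1.0 / (1.0 + np.exp(-theta_x))    # sigmoid: p = h_theta(x)
logit = np.log(p / (1.0 - p))         # log-odds of p
assert np.isclose(logit, theta_x)     # logit(p) == theta^T x
```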
  • Softmax regression

    • K-class classification
      • Class $k$ has parameter vector $\vec{\theta}_k$; the $K$ vectors are stacked into a matrix $\theta_{K\times n}$
    • Probability
      • $p(c=k|x;\theta)=\frac{\exp(\theta_k^{T}x)}{\sum_{l=1}^{K}\exp(\theta_l^{T}x)},\quad k=1,2,\dots,K$
    • Likelihood
      • $L(\theta)=\prod_{i=1}^{m}\prod_{k=1}^{K}p(c=k|x^{(i)};\theta)^{y_k^{(i)}}=\prod_{i=1}^{m}\prod_{k=1}^{K}\left(\frac{\exp(\theta_k^{T}x^{(i)})}{\sum_{l=1}^{K}\exp(\theta_l^{T}x^{(i)})}\right)^{y_k^{(i)}}$
    • Log-likelihood (for a single sample $x$)
      • $J(\theta)=\sum_{k=1}^{K}y_k\cdot\left(\theta_k^{T}x-\ln\sum_{l=1}^{K}\exp(\theta_l^{T}x)\right)$
    • Stochastic gradient (a sketch follows below)
      • $\frac{\partial J(\theta)}{\partial\theta_k}=(y_k-p(y_k|x;\theta))\cdot x$
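
A minimal sketch of softmax regression trained with the stochastic gradient above, assuming one-hot labels `Y` (the one-hot encoding and hyperparameters are assumptions of this sketch):

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def softmax_sgd(X, Y, K, alpha=0.1, epochs=100):
    """X: (m, n) features, Y: (m, K) one-hot labels, K classes.

    Ascends the log-likelihood J(theta) using the per-sample gradient
    dJ/d theta_k = (y_k - p(c=k|x)) * x for every class k at once.
    """
    m, n = X.shape
    Theta = np.zeros((K, n))   # row k holds theta_k
    for _ in range(epochs):
        for i in range(m):
            p = softmax(Theta @ X[i])            # class probabilities
            Theta += alpha * np.outer(Y[i] - p, X[i])
    return Theta
```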