Notes on Andrew Ng's *Machine Learning*: 2 Linear Regression with One Variable

1 Model Representation

| Symbol | Definition |
| --- | --- |
| $m$ | number of training examples |
| $x$ | "input" variable / feature |
| $y$ | "output" variable / "target" variable |
| $(x, y)$ | one training example |
| $(x^{(i)}, y^{(i)})$ | the $i$-th training example |
| $h$ | hypothesis function |

Model: $h_\theta(x) = \theta_0 + \theta_1 x$
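As a quick illustration (my own sketch, not from the course), the hypothesis is just a line in code:

```python
def h(theta0: float, theta1: float, x: float) -> float:
    """Hypothesis for univariate linear regression: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# With theta0 = 1 and theta1 = 2, the prediction at x = 3 is 1 + 2 * 3 = 7
print(h(1.0, 2.0, 3.0))  # 7.0
```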

2 Cost Function

| Expression | Meaning |
| --- | --- |
| $h_\theta(x) = \theta_0 + \theta_1 x$ (abbreviated $h(x)$) | hypothesis function |
| $\theta_i$ | model parameters |

Derivation:

Goal: $\min\limits_{\theta_0,\theta_1}\;\dfrac{1}{2m}\sum\limits_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$

Cost function (squared error cost function):

$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$

(The extra factor of $\frac{1}{2}$ is a convenience: it cancels against the exponent when differentiating.)

Goal: $\min\limits_{\theta_0,\theta_1}\;J(\theta_0,\theta_1)$

| Name | Expression |
| --- | --- |
| Hypothesis | $h_\theta(x)=\theta_0+\theta_1 x$ |
| Parameters | $\theta_0,\ \theta_1$ |
| Cost function | $J(\theta_0,\theta_1)=\frac{1}{2m}\sum\limits_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$ |
| Goal | $\min\limits_{\theta_0,\theta_1}\;J(\theta_0,\theta_1)$ |
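A minimal sketch of the squared error cost function in Python (NumPy and the toy data are my own assumptions, not from the course):

```python
import numpy as np

def cost(theta0: float, theta1: float, x: np.ndarray, y: np.ndarray) -> float:
    """Squared error cost J(theta0, theta1) = (1/2m) * sum((h(x_i) - y_i)^2)."""
    m = len(y)
    predictions = theta0 + theta1 * x          # h_theta(x^(i)) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# A perfect fit gives zero cost: the data below lies exactly on y = 1 + 2x
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(cost(1.0, 2.0, x, y))                    # 0.0
```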

3 Gradient Descent Algorithm

Given a function $J(\theta_0,\theta_1)$, the goal is $\min\limits_{\theta_0,\theta_1}\;J(\theta_0,\theta_1)$.

Idea:
1. Initialize $\theta_0$ and $\theta_1$.
2. Keep changing $\theta_0$ and $\theta_1$ to reduce $J(\theta_0,\theta_1)$ until we settle at a minimum of $J$.

Gradient descent algorithm:

$$
\begin{aligned}
&\text{repeat until convergence } \{\\
&\qquad \theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1) \qquad (\text{for } j=0 \text{ and } j=1)\\
&\}
\end{aligned}
$$

Update $\theta_0$ and $\theta_1$ simultaneously: both partial derivatives are evaluated at the old parameter values before either parameter is overwritten (see the sketch after the table below):

$$
\begin{aligned}
temp0 &:= \theta_0 - \alpha\frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)\\
temp1 &:= \theta_1 - \alpha\frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)\\
\theta_0 &:= temp0\\
\theta_1 &:= temp1
\end{aligned}
$$

| Symbol | Description |
| --- | --- |
| $:=$ | assignment |
| $\alpha$ | learning rate (controls the step size) |
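A sketch of one simultaneous-update step, with the two partial derivatives passed in as functions (the names `grad0` and `grad1` are illustrative, not from the course):

```python
from typing import Callable, Tuple

def gd_step(theta0: float, theta1: float, alpha: float,
            grad0: Callable[[float, float], float],
            grad1: Callable[[float, float], float]) -> Tuple[float, float]:
    """One gradient descent step with a simultaneous update: both partials
    are evaluated at the old (theta0, theta1) before either is overwritten."""
    temp0 = theta0 - alpha * grad0(theta0, theta1)  # temp0 := theta0 - alpha * dJ/dtheta0
    temp1 = theta1 - alpha * grad1(theta0, theta1)  # temp1 := theta1 - alpha * dJ/dtheta1
    return temp0, temp1                             # theta0 := temp0, theta1 := temp1

# Example on J(t0, t1) = t0^2 + t1^2, whose partials are 2*t0 and 2*t1:
t0, t1 = gd_step(1.0, 1.0, 0.1, lambda a, b: 2 * a, lambda a, b: 2 * b)
print(t0, t1)  # 0.8 0.8
```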

4 Gradient Descent for Linear Regression

Combine the gradient descent algorithm with the linear regression model.

Gradient descent algorithm:

$$
\begin{aligned}
&\text{repeat until convergence } \{\\
&\qquad \theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1) \qquad (\text{for } j=0 \text{ and } j=1)\\
&\}
\end{aligned}
$$

Linear regression model:

$$h_\theta(x)=\theta_0+\theta_1 x$$

$$J(\theta_0,\theta_1)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$

Derivation of the partial derivatives:

$$
\frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1)
=\frac{\partial}{\partial\theta_j}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2
=\frac{\partial}{\partial\theta_j}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0+\theta_1 x^{(i)}-y^{(i)}\right)^2
$$

$$
j=0:\quad \frac{\partial}{\partial\theta_0}J(\theta_0,\theta_1)=\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)
$$

$$
j=1:\quad \frac{\partial}{\partial\theta_1}J(\theta_0,\theta_1)=\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)\cdot x^{(i)}
$$

Gradient descent (simultaneous update):

$$
\begin{aligned}
&\text{repeat until convergence } \{\\
&\qquad \theta_0 := \theta_0 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)\\
&\qquad \theta_1 := \theta_1 - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)\cdot x^{(i)}\\
&\}
\end{aligned}
$$
"Batch" gradient descent: every step of gradient descent sums over the entire training set. A minimal implementation is sketched below.
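Putting the pieces together, a sketch of batch gradient descent for univariate linear regression (NumPy, the toy data, and the hyperparameters are my own illustrative choices):

```python
import numpy as np

def batch_gradient_descent(x, y, alpha=0.1, iters=1000):
    """Batch gradient descent for h_theta(x) = theta0 + theta1 * x.
    Every iteration sums the error over the *entire* training set."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0                  # initialize parameters
    for _ in range(iters):
        error = (theta0 + theta1 * x) - y      # h_theta(x^(i)) - y^(i) for all i
        temp0 = theta0 - alpha * error.sum() / m
        temp1 = theta1 - alpha * (error * x).sum() / m
        theta0, theta1 = temp0, temp1          # simultaneous update
    return theta0, theta1

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])             # noisy samples of y = 1 + 2x
print(batch_gradient_descent(x, y))            # roughly (1.15, 1.94)
```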