Machine Learning Notes (2): Gradient Descent for Univariate Polynomial Regression

Data Generation

The training set is generated from
$y = 4x^4 + 3x^3 + 2x^2 + x + 1 + \mathrm{rand}$
where $\mathrm{rand}$ is a random number in $[0, 1)$, added as noise so the data better resembles real measurements.
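As a sketch, the same training set can be generated in Python/NumPy (the article's own code below uses MATLAB; this port is only illustrative, and the seed is an arbitrary choice):

```python
import numpy as np

# Sample inputs on [0, 1] with step 0.01 and evaluate the target polynomial,
# adding uniform noise in [0, 1) to mimic real measurements.
rng = np.random.default_rng(0)
x = np.arange(0, 1.01, 0.01)
y = 4*x**4 + 3*x**3 + 2*x**2 + x + 1 + rng.random(x.size)
```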

Hypothesis

We posit the following hypothesis to fit the data:
$h(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4$

$\Theta = [\theta_0, \theta_1, \theta_2, \theta_3, \theta_4]$

Cost Function

$J(\Theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)^2$
where $m$ is the number of training samples.
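A minimal NumPy sketch of this cost, again only for illustration (the function name and the use of a design matrix are my choices, not the article's):

```python
import numpy as np

def cost(theta, x, y):
    """J(theta) = 1/(2m) * sum((h(x_i) - y_i)^2) for the degree-4 hypothesis."""
    # Design matrix with columns [1, x, x^2, x^3, x^4]
    X = np.vander(x, 5, increasing=True)
    residual = X @ theta - y
    return residual @ residual / (2 * len(x))
```

With `theta` equal to the true coefficients and noiseless data, the cost is zero.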

Gradient Descent

$\theta_i^{k} = \theta_i^{k-1} - \alpha \frac{\partial J(\Theta)}{\partial \theta_i}$

$\alpha$ is the learning rate, which controls how fast gradient descent proceeds. This update says the new $\theta_i$ equals the previous $\theta_i$ minus $\alpha \frac{\partial J(\Theta)}{\partial \theta_i}$, so $\theta_i$ always moves in the direction that decreases the cost $J(\Theta)$.

Run the gradient descent loop for 100 iterations, or until the value of $J(\Theta)$ drops within an acceptable tolerance.

The partial derivatives $\frac{\partial J(\Theta)}{\partial \theta_i}$ are:
$\frac{\partial J(\Theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot 1$
$\frac{\partial J(\Theta)}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot x^{(i)}$
$\frac{\partial J(\Theta)}{\partial \theta_2} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot \left(x^{(i)}\right)^2$
$\frac{\partial J(\Theta)}{\partial \theta_3} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot \left(x^{(i)}\right)^3$
$\frac{\partial J(\Theta)}{\partial \theta_4} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)\cdot \left(x^{(i)}\right)^4$
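The five per-coefficient updates above collapse into a single vectorized step, $\mathrm{grad} = \frac{1}{m}X^{\top}(X\theta - y)$, where $X$ is the design matrix with columns $[1, x, x^2, x^3, x^4]$. A NumPy sketch (the learning rate and iteration count mirror the article's MATLAB code; the zero initialization is my choice):

```python
import numpy as np

def gradient_descent(x, y, alpha=1.2, iters=100):
    """Fit theta for h(x) = theta0 + theta1*x + ... + theta4*x^4 by gradient descent."""
    X = np.vander(x, 5, increasing=True)  # columns: 1, x, x^2, x^3, x^4
    m = len(x)
    theta = np.zeros(5)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m  # all five partials in one product
        theta -= alpha * grad
    return theta
```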

Code

% Generate the data and add noise
close all;
x = 0:0.01:1;
y = 4*x.^4 + 3*x.^3 + 2*x.^2 + x + 1 + rand(1, length(x));
figure(1);
hold on;
plot(x, y, 'r');

% Hypothesis: h(x) = theta0 + theta1*x + theta2*x^2 + theta3*x^3 + theta4*x^4
theta = 10*rand(1, 5);   % random initial coefficients
alpha = 1.2;             % learning rate
m = length(x);           % number of training samples

for i = 1:100            % iterate 100 times
    % Evaluate the hypothesis with the current coefficients
    h_theta = theta(1)*ones(1, m) + theta(2)*x + theta(3)*x.^2 + theta(4)*x.^3 + theta(5)*x.^4;
    plot(x, h_theta, 'y');
    J(i) = 1/(2*m) * sum((h_theta - y).^2);   % cost J(Theta) = 1/(2m)*sum((h(x)-y)^2)
    % Gradient descent update: theta_j := theta_j - alpha * dJ/dtheta_j
    theta(1) = theta(1) - alpha * 1/m * sum(h_theta - y);
    theta(2) = theta(2) - alpha * 1/m * sum((h_theta - y).*x);
    theta(3) = theta(3) - alpha * 1/m * sum((h_theta - y).*x.^2);
    theta(4) = theta(4) - alpha * 1/m * sum((h_theta - y).*x.^3);
    theta(5) = theta(5) - alpha * 1/m * sum((h_theta - y).*x.^4);
end

% Final fitted curve
h_theta = theta(1)*ones(1, m) + theta(2)*x + theta(3)*x.^2 + theta(4)*x.^3 + theta(5)*x.^4;
plot(x, h_theta, 'b');
title('Data and fitted curves');
hold off;

% Curve of the cost J(Theta) over the iterations
figure(2);
plot(1:100, J);
title('Cost function over iterations');

In the first figure, the red curve is the original training data, the yellow curves show the hypothesis evolving during fitting, and the blue curve is the final fitted polynomial.
The second figure shows how the value of the cost function changes during fitting.
