After watching Andrew Ng's Machine Learning lessons on Coursera, I wanted to practise gradient descent by myself.
So I generated some data first.
m = 10; % number of training examples
n = 1; % number of features
alpha = 0.01; % learning rate
X = [ones(m,1),(1:m)']; % input, with a leading column of ones
theta = zeros(n+1,1); % parameters, initialised to zero
correctTheta = [2;1]; % the true parameters to recover
y = (correctTheta'*X')'; % target values
function J = costFunction(m,X,y,theta)
J = (1/(2*m)) * sum(((theta'*X')'-y) .^ 2);
end
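As a quick sanity check on the cost function (a sketch reusing the setup above; the names J_best and J_zero are my own), the cost should be zero at correctTheta and positive at the all-zeros starting point:

```octave
% Sanity check for costFunction, using the same data as above.
m = 10;
X = [ones(m,1), (1:m)'];
correctTheta = [2; 1];
y = X * correctTheta;          % same values as (correctTheta'*X')'

function J = costFunction(m, X, y, theta)
  J = (1/(2*m)) * sum((X*theta - y) .^ 2);
end

J_best = costFunction(m, X, y, correctTheta); % 0 at the true parameters
J_zero = costFunction(m, X, y, zeros(2,1));   % positive anywhere else
printf("J_best = %g, J_zero = %g\n", J_best, J_zero);
```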
% And h(x) turns out to be theta1*x1 + theta2*x2 (with x1 = 1).
There are two ways to implement gradient descent for this h(x) in Octave. The first loops over the training examples:
delta = zeros(n+1,1); % accumulated gradient
for i = 1:m
delta += ((theta'*X(i,:)')'-y(i,:))'*(X(i,:))';
end
theta = theta - alpha/m .* delta;
And the second, fully vectorized:
theta = theta - (alpha / m) .* (((theta'*X')'-y)'*X)'; % it took me some time to work out this expression.
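A quick way to convince yourself the two forms compute the same gradient (a sketch reusing the data above; delta_vec is my own name) is to run both on the same theta and compare:

```octave
m = 10; n = 1;
X = [ones(m,1), (1:m)'];
correctTheta = [2; 1];
y = X * correctTheta;
theta = zeros(n+1, 1);

% Loop version: accumulate the gradient one example at a time.
delta = zeros(n+1, 1);
for i = 1:m
  delta += (X(i,:)*theta - y(i)) * X(i,:)';
end

% Vectorized version: the whole gradient in one expression.
% (((theta'*X')'-y)'*X)' simplifies to X' * (X*theta - y).
delta_vec = X' * (X*theta - y);

printf("max difference: %g\n", max(abs(delta - delta_vec)));
```

All the intermediate values here are integers, so the two results agree exactly, not just up to rounding.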
It's clear that the second is more convenient. But when I practised, I nearly went mad:
theta kept getting larger and eventually became Inf.
While puzzling over it, I recalled Andrew Ng saying that this is exactly what happens when alpha is too large.
So I changed the learning rate from 0.1 to 0.01, and to my relief, it finally worked!
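Here is a small reproduction of that behaviour (a sketch with the same data; the helper names cost and descend are my own): with alpha = 0.01 the cost shrinks toward zero, while with alpha = 0.1 it blows up, ending as Inf or NaN.

```octave
m = 10;
X = [ones(m,1), (1:m)'];
y = X * [2; 1];

function J = cost(m, X, y, theta)
  J = (1/(2*m)) * sum((X*theta - y) .^ 2);
end

% Run vectorized gradient descent from theta = 0 for a fixed number of steps.
function theta = descend(X, y, alpha, iters)
  m = rows(X);
  theta = zeros(columns(X), 1);
  for j = 1:iters
    theta = theta - (alpha/m) * (X' * (X*theta - y));
  end
end

J_small = cost(m, X, y, descend(X, y, 0.01, 1000)); % converges
J_large = cost(m, X, y, descend(X, y, 0.1, 1000));  % diverges: Inf or NaN
printf("alpha=0.01: J=%g   alpha=0.1: J=%g\n", J_small, J_large);
```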
And here is the full version, with n features and m training examples. (Be careful: as n increases, alpha must decrease. For example, with n = 10 and m = 100, alpha needs to be 0.000001; even 0.0001 is too large.)
m = (the number of training examples);
n = (the number of features);
alpha = (learning rate);
X = [ones(m,1),ceil(rand(m,n)*100)]; % random integer features in 1..100
theta = zeros(n+1,1);
correctTheta = ceil(rand(n+1,1)*10); % random true parameters in 1..10
y = (correctTheta'*X')';
function J = costFunction(m,X,y,theta)
J = (1/(2*m)) * sum(((theta'*X')'-y) .^ 2);
end
for j = 1:(times)
theta = theta - (alpha / m) .* (((theta'*X')'-y)'*X)';
end
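To make that concrete, here is the full version with the placeholders filled in as one example run (n = 10, m = 100, 1000 iterations; the tiny alpha matches the warning above, and J_before/J_after are my own names):

```octave
m = 100;          % number of training examples
n = 10;           % number of features
alpha = 0.000001; % learning rate; with these features, 0.0001 is too large
iters = 1000;     % number of iterations

X = [ones(m,1), ceil(rand(m,n)*100)];
correctTheta = ceil(rand(n+1,1)*10);
y = (correctTheta'*X')';
theta = zeros(n+1,1);

function J = costFunction(m,X,y,theta)
  J = (1/(2*m)) * sum(((theta'*X')'-y) .^ 2);
end

J_before = costFunction(m,X,y,theta);
for j = 1:iters
  theta = theta - (alpha/m) .* (((theta'*X')'-y)'*X)';
end
J_after = costFunction(m,X,y,theta);
printf("cost: %g -> %g\n", J_before, J_after);
```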
Expand: if we want to accelerate convergence, we can use feature scaling and mean normalization (I may have the names slightly wrong).
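That idea can be sketched like this (assuming the n = 10, m = 100 setup above; mu, sigma and X_norm are my own names). Each feature gets its mean subtracted and is divided by its standard deviation, skipping the all-ones intercept column:

```octave
% Mean normalization + feature scaling: (x - mean) / std, per feature.
m = 100; n = 10;
X = [ones(m,1), ceil(rand(m,n)*100)];

mu = mean(X(:, 2:end));    % 1 x n row of feature means
sigma = std(X(:, 2:end));  % 1 x n row of feature standard deviations
X_norm = X;
X_norm(:, 2:end) = (X(:, 2:end) - mu) ./ sigma; % broadcast over rows
```

After this every feature has mean about 0 and standard deviation 1, so the features are on comparable scales and a much larger alpha converges quickly.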