Some Octave commands from Andrew Ng's Machine Learning Exercise 1

In lecture 35 of Andrew Ng's video series, the instructor explains how to submit answers. The code below has never been submitted, so its correctness is still unverified.

This post mainly records the process of building the gradient descent function. Here is a comparison of the gradient descent fit I generated in Octave against the scatter plot of the original data:

They look like a pretty good match.

(My English is beginner-level.)

The first part

Now we are given some data to visualize, saved in ex1data1.txt. The basic steps have already been given, so it's easy to get the plot.

The exercise asks us to complete plotData.m to draw the plot. Here is the code in plotData.m:

function plotData(x, y)
% plot the data as red crosses ('rx')
plot(x, y, 'rx', 'MarkerSize', 10);
ylabel('Profit in $10,000');
xlabel('Population in 10,000');
end

In this function, both x and y are vectors, holding the population and profit data respectively. The plot function takes the two vectors x and y plus formatting options, so the result is a scatter plot of red crosses ('rx'). The remaining two commands just set the axis labels.

Now the function is finished. By typing commands into the Octave command window and calling this function, the plot can be displayed.

Here are the commands:

>>cd E:\Octave\exercise;
>>data = load('ex1data1.txt');

The first command changes to the directory where the dataset is saved, and the second loads it. So now 'data' is a matrix with two columns.
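You can quickly confirm the shape (a small check, not part of the exercise):

>>size(data) % rows = number of training examples, columns = 2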

Define two vectors X and y: X is the first column of 'data', and y is the second. Here are the commands:

>>X=data(:,1);
>>y=data(:,2);

We can inspect the X vector:

>>X
X =

    6.1101
    5.5277
    8.5186
    7.0032
    5.8598
    8.3829
    7.4764
    8.5781
    6.4862
    5.0546
    5.7107
   14.1640
    5.7340
    8.4084
    5.6407
    5.3794
    6.3654
    5.1301
    6.4296
    7.0708
    6.1891
   20.2700
    5.4901
    6.3261
    5.5649
   18.9450
   12.8280

Well, it's clear what the vector X holds.

Now call the function plotData, and then we can see the plot:

>>cd 'E:\Octave\exercise';
>>plotData(X,y);

And here is the plot:

(I'm writing this a bit verbosely on purpose, so it's easier to review later if I forget.)

The second part

The video has taught a lot about the cost function. The cost function is:

J(θ) = (1/(2m)) * sum{ (h(x) - y)^2 }

You know that h(x) is a linear hypothesis with two unknown parameters θ1 and θ2:

h(x) = θ1 + θ2 * x

Let's just suppose that we know theta (θ) completely.

Now define a vector theta = [0; 0], so we have all the pieces needed to compute the cost function: x, θ, y, and m (m is a real number meaning the number of training examples, i.e. the length of y).

Well, before computing the cost function, it's necessary to notice how h(x) is computed: the matrix x times the vector 'θ', which is also how Octave computes h(x). We know that theta is a two-row, one-column vector, while x is an m-row, one-column vector. This means x needs a change: adding a column of ones to x makes the dimensions work, and all the results will be right. The new x looks like this:

    1.0000    6.1101
    1.0000    5.5277
    1.0000    8.5186
    1.0000    7.0032
    1.0000    5.8598
    1.0000    8.3829    ...

So now x can be multiplied by theta correctly.
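Here is a minimal sketch of that dimension check in Octave (X_new is a hypothetical name; it assumes 'data' was loaded as above):

>>theta = [0; 0];                    % 2 x 1
>>m = length(data(:, 1));            % number of training examples
>>X_new = [ones(m, 1), data(:, 1)];  % m x 2: first column all ones
>>h = X_new * theta;                 % (m x 2) * (2 x 1) = m x 1 predictions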

Following the exercise, we now complete computeCost.m. Here is the code:

function J = computeCost(X, y, theta)
m = length(y);                     % number of training examples
predictions = X * theta;           % m x 1 vector of predictions
sqrErrors = (predictions - y).^2;  % squared errors
J = sum(sqrErrors) / (2 * m);
end

The variable predictions holds the predicted values of y, using the theta we guessed, (0; 0).

So now we can compute the value of the J function.
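To see the arithmetic concretely, here is a tiny sanity check with made-up numbers (two hypothetical training examples, not from ex1data1.txt; it assumes computeCost.m is already on the path):

>>Xs = [1 2; 1 3];            % two made-up examples, ones column included
>>ys = [3; 5];
>>computeCost(Xs, ys, [0; 0]) % ((0-3)^2 + (0-5)^2) / (2*2) = 8.5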

Well, the last step is to type the commands and call the computeCost function in the Octave command window. Here are the commands:

>>m=length(y);
>>x=[ones(m,1),data(:,1)];
>>theta=zeros(2,1);
>>cd 'E:\Octave\exercise'
>>J=computeCost(x,y,theta)
J =  32.073

The second command rebuilds x with the extra column of ones.

The resulting J is quite large; it can be reduced by changing theta.

The third part

Now it's time to work out gradient descent. Here is the formula:

θj = θj - α * (1/m) * sum{ (h(x) - y) * xj }

There are two θ values that need to be updated repeatedly until they reach the most suitable numbers.

(In the video, the teacher described a concise way to update all the θ at once: regard θ as a vector and use matrix multiplication to get the answer. However, I didn't use that vectorized way here because there are only two θ. But my way is not common in practice @_@)
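For reference, the vectorized update the teacher describes would look roughly like this in Octave (a sketch only; it assumes x with the ones column, y, alpha, and m are already defined, and I did not use it below):

>>theta = theta - (alpha / m) * x' * (x * theta - y); % updates both θ in one line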

As the formula shows, θj is a real number, and the result of the whole α * (1/m) * sum{ (h(x) - y) * xj } term is a real number too, so it's best to calculate the sum of (h(x) - y) * xj first.

The h(x) term: as mentioned before, h(x) = x (the m-by-2 matrix) times theta (the vector). The answer is an m-row, one-column vector, and we then subtract the y vector from it.

At this moment, x is still an m-row, two-column matrix (the first column is all ones). To illustrate, x is:

     x1:          x2:

    1.0000    6.1101
    1.0000    5.5277
    1.0000    8.5186
    1.0000    7.0032
    1.0000    5.8598
    1.0000    8.3829    ...

So when we update θ1 we use x1, and when we update θ2 we use x2:

θ1 = θ1 - α * (1/m) * sum{ (x*theta - y) .* x1 }

θ2 = θ2 - α * (1/m) * sum{ (x*theta - y) .* x2 }

α is 0.01, as given by the exercise.

When I tried to complete gradientDescentMulti.m, I found that I didn't know how to split X into columns with an Octave command inside the function; I could only get x1 and x2 in the Octave command window. That's a pity; maybe I'll acquire this skill later (there is a sketch of the idea below, after the function explanation).

So I had to pass x1 and x2 as extra parameters to the function...:

function [theta, J_history] = gradientDescentMulti(X, x1, x2, y, theta, alpha, num_iters)
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % predictions is computed once, so both theta updates use the old theta
    predictions = X * theta;
    vector_1 = (predictions - y) .* x1;
    theta(1) = theta(1) - alpha / m * sum(vector_1);
    vector_2 = (predictions - y) .* x2;
    theta(2) = theta(2) - alpha / m * sum(vector_2);

    % compute the cost function and save it in J_history
    predictions = X * theta;
    sqrErrors_2 = (predictions - y).^2;
    J_history(iter) = sum(sqrErrors_2) / (2 * m);
end
end

num_iters is the number of times theta is recalculated, again and again, until we get the most suitable results.

J_history is a vector with num_iters rows, saving every result of the cost function (the J function).
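By the way, column indexing like X(:, 1) does work inside a function too, so the extra x1/x2 parameters could be avoided. A sketch of that variant (gradientDescentSketch is a hypothetical name, not the version I actually ran):

function [theta, J_history] = gradientDescentSketch(X, y, theta, alpha, num_iters)
% hypothetical variant: split X into its columns inside the function
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    predictions = X * theta;  % old predictions for both updates
    theta(1) = theta(1) - alpha / m * sum((predictions - y) .* X(:, 1));
    theta(2) = theta(2) - alpha / m * sum((predictions - y) .* X(:, 2));
    J_history(iter) = sum((X * theta - y).^2) / (2 * m);  % cost after update
end
end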

Now you can initialize theta yourself; I initialized it to [1; 5].

Here are the commands in the Octave command window:

>>x1=x(:,1);
>>x2=x(:,2);
>>alpha=0.01;
>>theta=[1;5];
>>cd 'E:\Octave\exercise';
>>[theta,J]=gradientDescentMulti(x,x1,x2,y,theta,alpha,1)
theta =

  -2.4985
   1.5015

J =  12.843

The first two commands split the x matrix into its two columns (note that we index the rebuilt lowercase x, which has the ones column); alpha is the learning rate given by the exercise.

I ran one iteration and got J = 12.843.

Then I ran a hundred iterations, and the J values converged to 4.5661:

>>[theta,J]=gradientDescentMulti(x,x1,x2,y,theta,alpha,100)
theta =

  -2.9062
   1.0938

J =

   4.5661
   4.5661
   4.5661
   4.5661
   4.5661

Now I suppose theta has reached a good answer.

In the end, I make the final plot. The commands:

>>a=[5:4:25]
a =

    5    9   13   17   21   25

>>b=theta(1)+theta(2)*a % use the fitted theta
b =

    2.5630    6.9384   11.3137   15.6890   20.0644   24.4397

>>plot(a,b)
>>hold on; % keep both plots on one figure
>>plotData(X,y);
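Optionally, a legend makes the two layers easier to tell apart (a small extra, not required by the exercise; the fit line was plotted first, then the data):

>>legend('Linear fit', 'Training data');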

The plot:

 

(Corrections are welcome!)

 
