Exercise: Linear Regression

Reposted 2016-05-30 18:03:24

This course consists of videos and programming exercises to teach you about machine learning. The exercises are designed to give you hands-on, practical experience for getting these algorithms to work. To get the most out of this course, you should watch the videos and complete the exercises in the order in which they are listed.

This first exercise will give you practice with linear regression. These exercises have been extensively tested with Matlab, but they should also work in Octave, which has been called a "free version of Matlab." If you are using Octave, be sure to install the Image package as well (available for Windows as an option in the installer, and available for Linux from Octave-Forge).


Download ex2Data.zip, and extract the files from the zip file.

The files contain some example measurements of heights for various boys between the ages of two and eight. The y-values are the heights measured in meters, and the x-values are the ages of the boys corresponding to the heights.

Each age and height pair constitutes one training example $(x^{(i)}, y^{(i)})$ in our dataset. There are $m = 50$ training examples, and you will use them to develop a linear regression model.

Supervised learning problem

In this problem, you'll implement linear regression using gradient descent. In Matlab/Octave, you can load the training set using the commands

x = load('ex2x.dat');
y = load('ex2y.dat');

This will be our training set for a supervised learning problem with $n=1$ feature (in addition to the usual $x_0 = 1$, so $x \in {\mathbb R}^2$). If you're using Matlab/Octave, run the following commands to plot your training set (and label the axes):

figure % open a new figure window
plot(x, y, 'o');
ylabel('Height in meters')
xlabel('Age in years')

You should see a series of data points similar to the figure below.


Before starting gradient descent, we need to add the $x_0 = 1$ intercept term to every example. To do this in Matlab/Octave, the command is

m = length(y); % store the number of training examples
x = [ones(m, 1), x]; % Add a column of ones to x

From this point on, you will need to remember that the age values from your training data are actually in the second column of x. This will be important when plotting your results later.

Linear regression

Now, we will implement linear regression for this problem. Recall that the linear regression model is

\begin{displaymath} h_{\theta}(x) = \theta^T x = \sum_{i=0}^n \theta_i x_i \end{displaymath}

and the batch gradient descent update rule is

\begin{displaymath} \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^m \left(h_{\theta}(x^{(i)}) - y^{(i)}\right) x_j^{(i)} \;\;\;\;\;\mbox{(for all $j$)} \end{displaymath}
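The update rule above can be written as a single matrix expression that updates every $\theta_j$ at once. The following is a minimal sketch, not a reference solution: it assumes a tiny made-up dataset (`ages`, with heights that lie exactly on the line $y = 1 + 2x$) instead of the exercise files, and the names `grad` and `theta_one_step` are chosen here for illustration.

```octave
% Toy data (hypothetical, NOT the ex2 dataset): y = 1 + 2*age exactly
ages = [0; 1; 2; 3];
y = 1 + 2 * ages;
m = length(y);
x = [ones(m, 1), ages];        % prepend the x_0 = 1 intercept column

alpha = 0.07;                  % learning rate used in this exercise
theta = zeros(2, 1);           % theta_0 = theta_1 = 0

% One batch update: x * theta - y holds all m residuals, and x' * residuals
% computes the sum over i for both j = 0 and j = 1 simultaneously
grad = (1 / m) * x' * (x * theta - y);
theta_one_step = theta - alpha * grad;

% Repeating the same update drives theta toward the least-squares fit
theta = theta_one_step;
for iter = 2:2000
    theta = theta - alpha * (1 / m) * x' * (x * theta - y);
end

disp(theta_one_step');
disp(theta');
```

Because the toy data is noise-free, the loop should recover the generating line, which makes it easy to tell whether the update is implemented correctly before moving to real data.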

1. Implement gradient descent using a learning rate of $\alpha = 0.07$. Since Matlab/Octave index vectors starting from 1 rather than 0, you'll probably use theta(1) and theta(2) in Matlab/Octave to represent $\theta_0$ and $\theta_1$. Initialize the parameters to $\theta = \vec{0}$ (i.e., $\theta_0=\theta_1=0$), and run one iteration of gradient descent from this initial starting point. Record the values of $\theta_0$ and $\theta_1$ that you get after this first iteration. (To verify that your implementation is correct, later we'll ask you to check your values of $\theta_0$ and $\theta_1$ against ours.)

2. Continue running gradient descent for more iterations until $\theta$ converges (this will take a total of about 1500 iterations). After convergence, record the final values of $\theta_0$ and $\theta_1$ that you get.

When you have found $\theta$, plot the straight line fit from your algorithm on the same graph as your training data. The plotting commands will look something like this:

hold on % Plot new data without clearing old plot
plot(x(:,2), x*theta, '-') % remember that x is now a matrix with 2 columns
                           % and the second column contains the age values
legend('Training data', 'Linear regression')

Note that for most machine learning problems, $x$ is very high dimensional, so we won't be able to plot $h_\theta(x)$. But since in this example we have only one feature, being able to plot this gives a nice sanity check on our result.

3. Finally, we'd like to make some predictions using the learned hypothesis. Use your model to predict the heights of two boys, one aged 3.5 and one aged 7.
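A prediction is just $h_\theta(x) = \theta^T x$ evaluated on a new input that has the same $[1, \text{age}]$ layout as the training matrix. A minimal sketch follows; the `theta` values in it are placeholders invented for illustration, not the parameters this exercise converges to, and `x_new` and `predicted_heights` are names chosen here.

```octave
theta = [0.5; 0.1];            % HYPOTHETICAL parameters, not the exercise solution

ages = [3.5; 7];
x_new = [ones(length(ages), 1), ages];   % add the x_0 = 1 intercept column
predicted_heights = x_new * theta;       % one predicted height (meters) per row

disp(predicted_heights');
```

Substituting the `theta` found by your own gradient descent run gives the predictions the exercise asks for.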

Debugging

If you are using Matlab/Octave and seeing many errors at runtime, try inspecting your matrix operations to check that you are multiplying and adding matrices in ways that their dimensions allow. Remember that Matlab/Octave by default interprets an operation as a matrix operation. In cases where you don't intend to use the matrix definition of an operator but your expression is ambiguous to Matlab/Octave, you will have to use the element-wise 'dot' form of the operator (for example .* or .^) to specify your command. Additionally, you can try printing x, y, and theta, or their sizes, to make sure their dimensions are correct.
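The difference between the matrix and element-wise forms of an operator is easiest to see on two small column vectors. This is an illustrative sketch only; the vectors `a` and `b` are made up and unrelated to the exercise data.

```octave
a = [1; 2; 3];
b = [4; 5; 6];

elementwise = a .* b;      % 3x1 result: each entry multiplied separately
inner = a' * b;            % 1x1 result: matrix product of a 1x3 and a 3x1
squares = a .^ 2;          % 3x1 result: each entry squared

% a * b is a 3x1 times a 3x1 matrix product, so the inner dimensions
% (1 and 3) do not agree and an error is raised
try
    bad = a * b;
catch err
    disp(err.message);
end

disp(size(elementwise));   % checking sizes is a quick way to spot mismatches
```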

Understanding $J(\theta)$

We'd like to understand better what gradient descent has done, and visualize the relationship between the parameters $\theta \in {\mathbb R}^2$ and $J(\theta)$. In this problem, we'll plot $J(\theta)$ as a 3D surface plot. (When applying learning algorithms, we don't usually try to plot $J(\theta)$ since usually $\theta \in {\mathbb R}^n$ is very high-dimensional so that we don't have any simple way to plot or visualize $J(\theta)$. But because the example here uses a very low dimensional $\theta \in {\mathbb R}^2$, we'll plot $J(\theta)$ to gain more intuition about linear regression.) Recall that the formula for $J(\theta)$ is

\begin{displaymath} J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^{2} \end{displaymath}
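In vectorized form, the sum in this formula is the squared length of the residual vector $x\theta - y$. The sketch below shows one way to express that; `compute_cost` is a hypothetical helper name introduced here, and the data is a made-up noise-free line rather than the exercise dataset.

```octave
% J(theta) = 1/(2m) * sum of squared residuals, written without a loop
compute_cost = @(x, y, theta) sum((x * theta - y) .^ 2) / (2 * length(y));

% Toy data (hypothetical, NOT the ex2 dataset): y = 1 + 2*age exactly
ages = [0; 1; 2; 3];
y = 1 + 2 * ages;
x = [ones(length(y), 1), ages];

J_zero = compute_cost(x, y, [0; 0]);   % cost at theta = 0
J_best = compute_cost(x, y, [1; 2]);   % cost at the line that generated the data

disp([J_zero, J_best]);
```

The cost is zero only at the parameters that reproduce the data exactly and grows as $\theta$ moves away from them, which is the bowl shape a surface plot of $J(\theta)$ makes visible.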

To get the best viewing results on your surface plot, use the range of theta values that we suggest in the code skeleton below.

J_vals = zeros(100, 100);   % initialize J_vals to a 100x100 matrix of 0's
theta0_vals = linspace(-3, 3, 100);
theta1_vals = linspace(-1, 1, 100);
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = %% YOUR CODE HERE %%
    end
end

% Plot the surface plot
% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';   % the semicolon keeps the 100x100 matrix from being printed
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1')

You should get a figure similar to the following. If you are using Matlab/Octave, you can use the orbit tool to view this plot from different viewpoints.

Surface plot

What is the relationship between this 3D surface and the value of $\theta_0$ and $\theta_1$ that your implementation of gradient descent had found?
