Linear regression with multiple variables

Original post, 2016-06-02 12:57:06

1. Problem:

Suppose you are selling your house and you want to know what a good market price would be. One way to do this is to first collect information on recent houses sold and make a model of housing prices.

The file ex1data2.txt contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house.

 

This exercise is solved in two ways: one uses gradient descent, the other uses the normal equation.

 

2. Solving with gradient descent:

    1) Step 1: Feature Normalization

Looking at the values in ex1data2.txt, note that house sizes are about 1000 times the number of bedrooms. When features differ by orders of magnitude, performing feature scaling first can make gradient descent converge much more quickly.

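The feature standardization formula used:

Z-score standardization suits cases where the maximum and minimum of an attribute are unknown, or where outliers fall outside the expected range.

    new value = (original value - mean) / standard deviation

The MATLAB code is as follows: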

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X 
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));      % mean of each feature; size(X, 2) is the number of columns
sigma = zeros(1, size(X, 2));   % standard deviation of each feature

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the 
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma. 
%
%               Note that X is a matrix where each column is a 
%               feature and each row is an example. You need 
%               to perform the normalization separately for 
%               each feature. 
%
% Hint: You might find the 'mean' and 'std' functions useful.
%       
  mu = mean(X);       % column-wise mean
  sigma = std(X);     % column-wise standard deviation
  % new value = (original value - mean) / standard deviation
  X_norm  = (X - repmat(mu, size(X,1), 1)) ./ repmat(sigma, size(X,1), 1);

end
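A point that is easy to miss: any new example must be scaled with the mu and sigma computed from the training set, not with statistics of its own. The sketch below inlines the same z-score computation as featureNormalize so it runs on its own. The four training rows and the new house (x_new) are made-up values for illustration, not taken from ex1data2.txt.

```matlab
% Made-up training features: [size in square feet, number of bedrooms]
X = [2000 3; 1600 3; 2400 4; 1400 2];

% Same z-score computation as featureNormalize
mu = mean(X);
sigma = std(X);
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% After normalization each column has mean 0 and standard deviation 1
col_means = mean(X_norm);
col_stds = std(X_norm);

% A new (hypothetical) house is scaled with the TRAINING mu and sigma
x_new = [1650 3];
x_new_norm = (x_new - mu) ./ sigma;

% Prepend the intercept term before multiplying by theta
x_new_row = [1, x_new_norm];
```

The row x_new_row can then be multiplied by a theta learned on the normalized features to get a prediction.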


    2) Step 2: Gradient Descent


The gradient descent update rule:

 theta = theta - alpha / m * X' * (X * theta - y);
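For readers wondering where this vectorized form comes from, here is a short derivation. It assumes the usual squared-error cost for linear regression, which the post does not state explicitly; X is the m-by-(n+1) design matrix with a leading column of ones, and y is the m-by-1 vector of prices.

$$
J(\theta) = \frac{1}{2m}\,(X\theta - y)^T (X\theta - y)
$$

$$
\nabla_\theta J(\theta) = \frac{1}{m}\, X^T (X\theta - y)
$$

$$
\theta := \theta - \alpha\,\nabla_\theta J(\theta) = \theta - \frac{\alpha}{m}\, X^T (X\theta - y)
$$

The last line matches the MATLAB expression above term by term: `X'` is $X^T$ and `alpha / m` is $\alpha / m$.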


The MATLAB code for this part is as follows:

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha
 
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
 
for iter = 1:num_iters
    
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    theta = theta - alpha / m * X' * (X * theta - y);
    
    % ============================================================
    
    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);
    
end
 
end
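gradientDescentMulti calls computeCostMulti, which is not shown in this post. A minimal sketch of what it might look like, assuming the usual squared-error cost J = 1/(2m) * sum of squared residuals; the function body is an assumption, not the author's code.

```matlab
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Squared-error cost for linear regression (sketch)
%   Assumes X already contains the intercept column of ones.

m = length(y);                 % number of training examples
err = X * theta - y;           % residual for every example
J = (err' * err) / (2 * m);    % sum of squared residuals, scaled by 1/(2m)

end
```

With this definition the cost is zero exactly when X * theta reproduces y, which makes it a convenient sanity check while debugging.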


3. Solving with the normal equation

Normal Equations:

The closed-form solution to linear regression is:

$$
\theta = (X^T X)^{-1} X^T y
$$

Using this formula does not require any feature scaling, and you get an exact solution in one calculation: there is no "loop until convergence" as in gradient descent.


The MATLAB code is as follows:

function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression 
%   NORMALEQN(X,y) computes the closed-form solution to linear 
%   regression using the normal equations.
 
theta = zeros(size(X, 2), 1);
 
% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%
 
% ---------------------- Sample Solution ----------------------
 
theta = pinv( X' * X ) * X' * y;
 
end
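One way to check the closed-form solution is to generate noise-free targets from known parameters and confirm they are recovered. The sketch below inlines the same expression as normalEqn so it runs on its own; theta_true and the feature values are made up for illustration. The backslash solve is included only as an alternative least-squares route for comparison and is not part of the original exercise.

```matlab
% Made-up parameters and unscaled features (no normalization needed here)
theta_true = [10; 2; -3];
features = [1 2; 2 1; 3 5; 4 3; 5 8];
m = size(features, 1);

X = [ones(m, 1), features];    % add the intercept column
y = X * theta_true;            % noise-free targets

% Same expression as normalEqn
theta = pinv(X' * X) * X' * y;

% Alternative: least-squares solve with the backslash operator
theta_ls = X \ y;
```

Both theta and theta_ls should agree with theta_true up to floating-point rounding.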


Final results:


It can be seen that after roughly 500 iterations the gradient descent cost function has converged.
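Convergence can also be checked numerically rather than by eye. The sketch below runs the same update rule on a small made-up data set and finds the first iteration at which the cost changes by less than a tolerance. The data, alpha, and tol are illustrative assumptions, so the iteration count it finds says nothing about the run reported above.

```matlab
% Made-up, roughly centered features with an intercept column
X = [1 -1.0 0.5; 1 0.0 -1.0; 1 1.0 1.5; 1 2.0 -0.5; 1 -2.0 -0.5];
y = [3; 1; 7; 6; -2];
m = length(y);

alpha = 0.1;                   % assumed learning rate
num_iters = 500;
theta = zeros(3, 1);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    theta = theta - alpha / m * X' * (X * theta - y);
    err = X * theta - y;
    J_history(iter) = (err' * err) / (2 * m);   % squared-error cost
end

% First step at which the cost changes by less than tol
tol = 1e-9;
converged_at = find(abs(diff(J_history)) < tol, 1);
```

Calling plot(1:num_iters, J_history) on the result draws the cost curve against the iteration number.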


For the complete code for this exercise, see: [link]




Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
