Linear regression with multiple variables

罗古洞的女婿

Article link: https://blog.csdn.net/chk_plusplus/article/details/84950624

I. Loading the data

clear ; close all; clc
fprintf('Loading data ...\n');

%% Load Data 
data = load('ex1data2.txt'); % read the text file into a matrix
X = data(:, 1:2); % columns 1 and 2 are the two features: house area and number of bedrooms
y = data(:, 3);   % column 3 is the corresponding house price
m = length(y);    % number of training examples


% Print out some data points
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]'); % %.0f prints a float with no decimal places;
                                                    % prints rows 1 to 10 of X (2 columns) and y (1 column)


fprintf('Program paused. Press enter to continue.\n');
pause;            % pause(5) would resume after 5 seconds; with no argument, execution resumes on a key press

II. Mean normalization

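1. House area and number of bedrooms differ greatly in scale, so the features are mean-normalized to bring them into similar ranges. Otherwise gradient descent converges very slowly and needs many iterations.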

fprintf('Normalizing Features ...\n');

[X, mu, sigma] = featureNormalize(X); % normalize X; the function returns three outputs:
% the normalized data, the column means of the original data, and the column standard deviations

2. The mean normalization function featureNormalize()

function [X_norm, mu, sigma] = featureNormalize(X)

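X_norm = X; % X is the data passed in; copy it to X_norm
mu = zeros(1, size(X, 2));  % size(X, 2) is the number of columns of X (2 here), so mu is a 1x2 row vector
sigma = zeros(1, size(X, 2)); % initializing mu and sigma to zero vectors is not actually necessary

mu = mean(X); % mean() returns the mean of each column, so mu is a 1x2 row vector
sigma = std(X); % std() returns the standard deviation of each column, so sigma is also 1x2
X_norm = (X - mu) ./ sigma; % apply mean normalization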

end

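In X_norm = (X - mu) ./ sigma; the matrix X is 47×2 for this dataset, while mu and sigma are 1×2 row vectors. Although the sizes do not match, MATLAB (R2016b and later) and Octave support this operation through implicit expansion (broadcasting): mu is subtracted element by element from every row of X, and each row of the result is then divided element by element by sigma.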

Example:

>> X = [2 4;2 4;2 4]

X =

     2     4
     2     4
     2     4

>> mu = [1 2]

mu =

     1     2

>> sigma = [1 2]

sigma =

     1     2

>> X - mu

ans =

     1     2
     1     2
     1     2

>> ans ./ sigma

ans =

     1     1
     1     1
     1     1

Note that the division must be the element-wise operator ./ rather than /.
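For MATLAB versions older than R2016b, where implicit expansion is not available, the same computation can be written with bsxfun. The sketch below uses a small made-up dataset (not the values from ex1data2.txt) and also checks the expected effect of normalization: each column ends up with mean close to 0 and standard deviation close to 1.

```matlab
% Made-up data: 4 houses, columns are [area, bedrooms] (illustrative values only)
X = [2100 3; 1600 3; 2400 4; 1400 2];

mu = mean(X);      % 1x2 row vector of column means
sigma = std(X);    % 1x2 row vector of column standard deviations

% Explicit form of (X - mu) ./ sigma that does not rely on implicit expansion
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% After normalization each column has mean ~0 and standard deviation ~1
disp(mean(X_norm));
disp(std(X_norm));
```

On recent MATLAB releases and on Octave the bsxfun form and the shorter (X - mu) ./ sigma form give the same result, so the choice is mostly about which versions need to be supported.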

III. Gradient descent

X = [ones(m, 1) X];  % the intercept term (the all-ones column that multiplies theta0) is added only now,
                     % after normalization, so the all-ones column itself is never normalized

fprintf('Running gradient descent ...\n');

% Choose some alpha value
alpha = 0.01;
num_iters = 400;
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

% Plot the convergence graph and overlay the curves for different alpha values
figure;
plot(1:50, J_history(1:50), '-b', 'LineWidth', 2); % plot only the first 50 iterations
xlabel('Number of iterations');
ylabel('Cost J');
hold on;


alpha = 0.03;
num_iters = 400;
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);
plot(1:50, J_history(1:50), '-r', 'LineWidth', 2);


alpha = 0.1;
num_iters = 400;
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);
plot(1:50, J_history(1:50), '-k', 'LineWidth', 2);
legend('alpha = 0.01', 'alpha = 0.03', 'alpha = 0.1');

% Display gradient descent's result: print the model parameters theta found by gradient descent
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');

The gradient descent function gradientDescentMulti():

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)

m = length(y); % number of training examples
J_history = zeros(num_iters, 1);  % cost function value after each iteration
n = size(X,2);  % number of features, including the all-ones column added for the intercept

for iter = 1:num_iters
    theta = theta - X'*(X*theta-y)/m*alpha;
    J_history(iter) = computeCostMulti(X, y, theta); % record the cost after each iteration
end

end
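As a rough, self-contained illustration of the update rule used in gradientDescentMulti, the sketch below runs the same vectorized step on a tiny made-up dataset and checks that the cost does not increase from one iteration to the next. The data, the learning rate of 0.1, and the 50 iterations are assumptions chosen for this toy problem, and the cost is computed inline instead of calling computeCostMulti.

```matlab
% Made-up, already-normalized data with an intercept column (illustrative only)
X = [1 -1.2 0.3; 1 -0.4 -1.1; 1 0.5 1.4; 1 1.1 -0.6];
y = [2; 3; 7; 6];
m = length(y);
alpha = 0.1;          % assumed learning rate for this toy problem
num_iters = 50;
theta = zeros(3, 1);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    grad = X' * (X*theta - y) / m;      % gradient of J with respect to theta
    theta = theta - alpha * grad;       % simultaneous update of all parameters
    J_history(iter) = sum((X*theta - y).^2) / (2*m);
end

% With a small enough alpha the cost should never increase
disp(all(diff(J_history) <= 0));
```

If the cost ever rises between iterations, the usual first thing to try is a smaller alpha, which is what the comparison of alpha = 0.01, 0.03 and 0.1 in the plot is meant to explore.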

The cost function computeCostMulti(X, y, theta):

function J = computeCostMulti(X, y, theta)


m = length(y);
J = 0;


for i = 1:m
    J = J + 1/(2*m)*(X(i,:)*theta - y(i)).^2;
end

end
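The loop in computeCostMulti can also be written without a loop. The following sketch, using made-up values for X, y and theta, compares the loop form with a vectorized form based on the residual vector; both are intended to compute the same quantity, the sum of squared errors divided by 2m.

```matlab
% Made-up design matrix (first column is the intercept term) and targets
X = [1 -1.0 0.5; 1 0.0 -1.5; 1 1.0 1.0];
y = [3; 1; 4];
theta = [0.5; -0.2; 0.1];
m = length(y);

% Loop form, as in computeCostMulti
J_loop = 0;
for i = 1:m
    J_loop = J_loop + 1/(2*m) * (X(i,:)*theta - y(i))^2;
end

% Vectorized form: residual vector dotted with itself
r = X*theta - y;
J_vec = (r' * r) / (2*m);

fprintf('loop: %f, vectorized: %f\n', J_loop, J_vec);
```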

Next, the theta found by gradient descent is used to predict the price of a house with an area of 1650 and 3 bedrooms.

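price = 0;
X = [1,([1650 3] - mu)./sigma]; % The parameters were fitted on normalized features, so the
price = X*theta;                % query point [1650 3] must be normalized in the same way,
                                % using the mu and sigma of the training data X. This is why
                                % the normalization function returns three outputs:
                                % [X_norm, mu, sigma] = featureNormalize(X)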

The theta values found by gradient descent and the resulting price prediction are:

Theta computed from gradient descent:
340412.659574
110631.048958
-6649.472950

Predicted price of a 1650 sq-ft, 3 br house (using gradient descent):
$293081.464622

IV. Normal equations

θ = (X^T * X)^(−1) * X^T * y

fprintf('Program paused. Press enter to continue.\n');
pause;
fprintf('Solving with normal equations...\n');

data = csvread('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);
X = [ones(m, 1) X]; % the normal equations do not require feature normalization
theta = normalEqn(X, y);

fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');
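The post does not show the body of normalEqn. A minimal sketch, assuming it simply implements θ = (X^T X)^(−1) X^T y, could look like the following. Here it is written as an anonymous function so the snippet runs on its own, pinv is used in place of a plain inverse as a defensive choice, and the data is made up from known parameters so the result can be checked.

```matlab
% Hypothetical stand-in for normalEqn, written as an anonymous function
normalEqn = @(X, y) pinv(X' * X) * X' * y;

% Made-up data generated from known parameters so the result can be checked
X = [1 2 1; 1 3 5; 1 5 2; 1 7 8];
theta_true = [1; 2; 3];
y = X * theta_true;

theta = normalEqn(X, y);
disp(theta);   % should be close to theta_true
```

Using pinv rather than inv keeps the computation well defined even when X' * X is singular, for example when two features are linearly dependent.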

Predicting the house price:

price = 0;
X = [1, 1650, 3];

price = X*theta;

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using normal equations):\n $%f\n'], price);

Because the features were not normalized here, the query point [1650, 3] does not need to be normalized either.

The result is:

Solving with normal equations...
Theta computed from the normal equations:
89597.909544
139.210674
-8738.019113

Predicted price of a 1650 sq-ft, 3 br house (using normal equations):
$293081.464335

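The results show that gradient descent and the normal equations produce different theta values, because gradient descent was run on normalized features while the normal equations used the raw features. The predicted prices nevertheless agree: 293081.4646 versus 293081.4643.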
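The relationship between the two parameter vectors can be made explicit. Since a prediction on normalized features is theta0 + sum_j theta_j * (x_j - mu_j) / sigma_j, dividing each slope by sigma_j and folding the mu terms into the intercept gives the equivalent raw-feature parameters. The sketch below checks this on made-up data. As an assumption for brevity, it uses the backslash least-squares solve as a stand-in for a fully converged gradient descent, and all variable names are illustrative.

```matlab
% Made-up data: columns are [area, bedrooms]; prices are illustrative only
X_raw = [2100 3; 1600 3; 2400 4; 1400 2; 3000 4];
y = [400; 330; 370; 232; 540];
m = length(y);

mu = mean(X_raw);
sigma = std(X_raw);
X_norm = bsxfun(@rdivide, bsxfun(@minus, X_raw, mu), sigma);

% Exact least-squares fits on normalized and on raw features
theta_norm = [ones(m,1) X_norm] \ y;
theta_raw  = [ones(m,1) X_raw] \ y;

% Map the normalized-feature parameters back to the raw-feature scale
theta_back = zeros(3, 1);
theta_back(2:3) = theta_norm(2:3) ./ sigma';
theta_back(1) = theta_norm(1) - mu * theta_back(2:3);

disp([theta_raw theta_back]);   % the two columns should agree up to rounding
```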