This post solves a linear regression problem with multiple variables (three, counting the intercept term), following the exercise at http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=MachineLearning&doc=exercises/ex3/ex3.html. The features are a house's living area and number of bedrooms; the target is its price. The sample size is 47. Using the fitted model, we predict the price of a house with an area of 1650 and 3 bedrooms.
The experiment varies the learning rate α and plots the cost function against the iteration count to decide which α is best, and hence which parameters to keep.
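For reference, the two quantities the script below computes on each iteration are the least-squares cost and the batch gradient descent update, written here in matrix form (X is the m×n design matrix with a leading column of ones):

$$J(\theta) = \frac{1}{2m}\,(X\theta - y)^\top (X\theta - y)$$

$$\theta := \theta - \frac{\alpha}{m}\, X^\top (X\theta - y)$$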
% This file performs linear regression with multiple variables
% Gradient descent
format long
clc,clear,close all;
x = load('ex3x.dat');
y = load('ex3y.dat');
x = [ones(size(x(:, 1))), x]; % add x0 = 1 intercept term for the constant parameter theta(1)
x1 = x;
y1 = y;
[m, n] = size(x);
sigma = std(x); % standard deviations
avg = mean(x); % means
x(:, 2) = (x(:, 2) - avg(2))./ sigma(2);
x(:, 3) = (x(:, 3) - avg(3))./ sigma(3);
iter = 100; % iteration number
alpha = [0.01, 0.03, 0.1, 0.3, 1, 1.2, 1.3]; % descend rate
theta = zeros(n, 1); % theta kept for the chosen learning rate
RMS = zeros(1, length(alpha)); % root-mean-square error for each rate
color = {'r','g','b','k','--r','--g','--b'};
figure;
for alphaTemp = 1 : length(alpha)
    thetaTemp = zeros(n, 1);
    J_val = zeros(iter, 1);
    for temp = 1 : iter
        J_val(temp) = 1 / (2 * m) * (x * thetaTemp - y)' * (x * thetaTemp - y);
        thetaTemp = thetaTemp - alpha(alphaTemp) / m * x' * (x * thetaTemp - y);
    end
    hold on, plot(0 : 50 - 1, J_val(1 : 50), color{alphaTemp}, 'LineWidth', 2);
    RMS(alphaTemp) = sqrt((x * thetaTemp - y)' * (x * thetaTemp - y) / m);
    if (1 == alpha(alphaTemp)) % keep the theta obtained with the chosen rate
        theta = thetaTemp
    end
end
RMS
legend('0.01', '0.03', '0.1', '0.3', '1', '1.2', '1.3');
xlabel('Number of iterations');
ylabel('Cost function');
price = theta'*[1 (1650-avg(2))/sigma(2) (3-avg(3))/sigma(3)]'
% Normal equations
theta1 = (x1' * x1) \ (x1' * y1) % backslash is more accurate than inv()
price1 = theta1'*[1 1650 3]'
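As a cross-check, the same pipeline can be sketched in numpy (Python shown purely for illustration). Since ex3x.dat/ex3y.dat are not bundled here, the house data below are synthetic stand-ins and the printed prices are not the exercise's answers; the point of the sketch is that gradient descent on z-scored features and the normal equations on the raw features predict the same price.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for ex3x.dat / ex3y.dat: 47 houses (area, bedrooms) -> price.
m = 47
area = rng.uniform(800, 4500, m)
rooms = rng.integers(1, 6, m).astype(float)
price = 50000 + 130 * area + 8000 * rooms + rng.normal(0, 5000, m)

X = np.column_stack([np.ones(m), area, rooms])  # add x0 = 1 intercept column
y = price

# z-score the non-intercept features, as in the MATLAB script (std with ddof=1,
# matching MATLAB's default std).
mu, sigma = X.mean(axis=0), X.std(axis=0, ddof=1)
Xn = X.copy()
Xn[:, 1:] = (X[:, 1:] - mu[1:]) / sigma[1:]

# Batch gradient descent with alpha = 1, the rate chosen in the post.
alpha, iters = 1.0, 100
theta = np.zeros(3)
for _ in range(iters):
    theta -= alpha / m * Xn.T @ (Xn @ theta - y)

# Normal equations on the raw (unnormalized) features.
theta1 = np.linalg.solve(X.T @ X, X.T @ y)

# Predict the price of a 1650 sq ft, 3 bedroom house with both methods.
q = np.array([1.0, (1650 - mu[1]) / sigma[1], (3 - mu[2]) / sigma[2]])
price_gd = theta @ q
price_ne = theta1 @ np.array([1.0, 1650.0, 3.0])
print(price_gd, price_ne)
```

Because z-scoring is just an invertible reparameterization, both routes minimize the same cost, so the two predictions agree to machine precision once gradient descent has converged.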
The results obtained with gradient descent are as follows:
Note that in RMS, the root-mean-square errors for α = 1 and α = 1.2 are identical, so why choose α = 1? Look at the cost-versus-iterations plot:
Clearly, the cost drops faster with α = 1 than with α = 1.2, so α = 1 is chosen.
The parameters obtained from the normal equations give a predicted price consistent with the gradient descent result.
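The effect of the learning rate can be demonstrated on the same kind of quadratic cost: with roughly z-scored features, a larger (still stable) rate reaches a low cost in fewer iterations, while a rate beyond the stability limit (about 2 divided by the largest eigenvalue of X'X/m) makes the cost blow up. A minimal numpy sketch, using synthetic data rather than the exercise's files:

```python
import numpy as np

rng = np.random.default_rng(1)

def cost_after(alpha, iters, X, y):
    """Run batch gradient descent and return the final least-squares cost."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        theta -= alpha / m * X.T @ (X @ theta - y)
    r = X @ theta - y
    return r @ r / (2 * m)

m = 47
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])  # features already ~z-scored
y = X @ np.array([3.0, 1.5, -2.0]) + rng.normal(0, 0.1, m)

slow = cost_after(0.1, 50, X, y)      # small rate: still short of the minimum
fast = cost_after(1.0, 50, X, y)      # larger stable rate: essentially converged
diverged = cost_after(3.0, 30, X, y)  # beyond the stability limit: cost explodes
print(slow, fast, diverged)
```

This is the same comparison the post makes with its cost-versus-iterations figure: among the rates that do not diverge, pick the one whose cost curve falls fastest.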
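For completeness, the closed-form step used in the script follows from setting the gradient of the cost to zero:

$$\nabla_\theta J = \frac{1}{m}\, X^\top (X\theta - y) = 0 \;\Rightarrow\; \theta = (X^\top X)^{-1} X^\top y$$

This is why the normal equations need no feature scaling and no learning rate, and why their prediction matches the converged gradient descent result.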