机器学习课后作业笔记(一)

最新推荐文章于 2021-12-16 16:33:02 发布

John4July

最新推荐文章于 2021-12-16 16:33:02 发布

阅读量508

点赞数 1

分类专栏：机器学习笔记文章标签：机器学习吴恩达笔记

本文链接：https://blog.csdn.net/greenlight_74110/article/details/78553017

版权

机器学习笔记专栏收录该内容

8 篇文章 0 订阅

订阅专栏

转载请注明来源：http://blog.csdn.net/greenlight_74110/article/details/78553017

我可能会同时用到matlab和octave

基本练习

简单的octave函数

这是一个类似helloworld的热身运动，实现一个5阶的单位矩阵：

A = eye(5);

输出：

A =

     1     0     0     0     0
     0     1     0     0     0
     0     0     1     0     0
     0     0     0     1     0
     0     0     0     0     1

一个值的线性回归

目标：实现一个单值的线性回归来预测一条食品链的利润。

ex1data1.txt 包含了本次实验所用的数据集，第一列是城市人口，第二列是利润（负数表示亏损）。

画出数据

开始任务之前，先对数据进行可视化是十分必要的。

鉴于数据只有两维，我们先用一个散点图来观察。

加载数据：

data = load('ex1data1.txt'); % read comma separated data
X = data(:, 1); y = data(:, 2);
m = length(y);

使用下列语句来画图：

plot(x, y, 'rx', 'MarkerSize', 10); % Plot the data
ylabel('Profit in $10,000s'); % Set the y−axis label
xlabel('Population of City in 10,000s'); % Set the x−axis label

梯度下降

使用梯度下降来调整参数。

更新等式

注意，初学者容易把i和j搞混淆。

i 表示样本个数编号，j 表示特征个数编号

样本特征：X = [1, x1, x2, …, xj, …, xN]

样本特征权重：θ = [1, θ1, θ2, …, θj, …, θN]

代价函数：

J(θ) = sum[(hθi-yi)^2]/2m
     = sum{[sum(xj*θj)i-yi]^2}/2m

以及，θj的每次迭代函数：

θj = θj - alpha*[dJ(θ)/dθj]
   = θj - alpha*sum[(hθi-yi)*xj]/m

实现

X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters
iterations = 1500;
alpha = 0.01;

计算代价函数J(θ)

在computeCost.m中添加以下代码：

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
J = sum((X * theta - y).^2) / (2*m);     % X(79,2)  theta(2,1)

运行如下语句：

computeCost(X, y, theta)

可以看到有输出：

32.07

梯度下降

记住，J(θ)的参数是θ而不是X或y！

在 gradientDescent.m实现代码：

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESENT(X, y, theta, alpha, num_iters) updates theta by 
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
theta_s=theta;

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta. 
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    theta(1) = theta(1) - alpha / m * sum(X * theta_s - y);       
    theta(2) = theta(2) - alpha / m * sum((X * theta_s - y) .* X(:,2));    
    theta_s=theta; 


    % ============================================================

    % Save the cost J in every iteration    
    J_history(iter) = computeCost(X, y, theta);

end
J_history
end

J(θ)应该会随着迭代的进行逐渐收敛到一个最小值（相对）。

最终得到的θ ：

-3.630291 1.166362

根据这个θ画出线性拟合图，进行观察：

plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')

也可以用这个最终的θ来做预测。

predict1 = [1, 3.5] *theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

可以得到：

For population = 35,000, we predict a profit of 4519.767868
For population = 70,000, we predict a profit of 45342.450129

可视化J(θ)

为了对代价函数J(θ)有一个更深刻的认识，接下来将画出θ的二维网状图上的J(θ)分布（一个三维图）。

初始化：

theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);
J_vals = zeros(length(theta0_vals), length(theta1_vals));

计算在每个点θ处的J(θ)值：

for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
      t = [theta0_vals(i); theta1_vals(j)];    
      J_vals(i,j) = computeCost(X, y, t);
    end
end

画出表面图和等高线图：

% Because of the way meshgrids work in the surf command, we need to 
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
%figure;
hold on;
% Plot J_vals as 15 contours spaced logarithmically between 0.01 and 100
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 10);

额外练习

实现一个多值线性回归来预测房子价格。

文件ex1data2.txt包含了本次练习所用的数据集。

可以先预览一下数据集：

%% Clear and Close Figures
clear ; close all; clc

fprintf('Loading data ...\n');

%% Load Data
data = load('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Print out some data points
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');

特征归一化

标准差是一种用来衡量某个特征的值的离散分布的方式。

注意要将所得的均值和标准差保存起来，以便在给出一个测试数据时，将其标准化。

在 featureNormalize.m中补充以下代码，以计算归一化值和保存均值、标准差：

  mu = mean(X);       %  mean value 
  sigma = std(X);     %  standard deviation
  X_norm  = (X - repmat(mu,size(X,1),1)) ./  repmat(sigma,size(X,1),1);

调用该函数对X进行标准化，并且添加X0=1的列：

[X mu sigma] = featureNormalize(X);      % 均值0，标准差1

% Add intercept term to X
X = [ones(m, 1) X];

梯度下降

与之前唯一的区别就是，本次有多个特征。假设函数和批梯度下降的更新规则都是一样的。

由于是多值线性回归，对θ的计算从之前的向量运算转换成矩阵运算

theta = theta - alpha / m * X' * (X * theta - y);

初始化参数：

% Choose some alpha value
alpha = 0.01;
num_iters = 8500;

% Init Theta and Run Gradient Descent 
theta = zeros(3, 1);

调用gradientDescentMulti函数进行计算：

[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

画出收敛图像：

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

可以看到随着迭代次数的增加，误差函数最终趋向于0。

查看梯度下降的最终结果θ：

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');

矩阵方程法（Normal Equations）

当样本量较小时（<10000）,使用Normal Equations方法来计算线性回归的近似解：

θ = (X'*X)^(-1)*X'*y

matlab实现：

theta = pinv( X' * X ) * X' * y;

现在可以尝试给定条件预测一下价格：

price = [1 1650 3] * theta ;

John4July

关注

1
点赞
踩
2

收藏

觉得还不错? 一键收藏
0
评论
机器学习课后作业笔记(一)

转载请注明来源：我可能会同时用到matlab和octave 简单的octave函数这是一个类似helloworld的热身运动，实现一个5阶的单位矩阵：A = eye(5);输出：A = 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
复制链接

扫一扫

专栏目录