Coursera Machine Learning Assignment Analysis, Part 3 (ex 1-3)

2.2.4 Gradient Descent

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
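%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha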

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
%               theta.
%
% Hint: While debugging, it can be useful to print out the values
%       of the cost function (computeCost) and gradient here.
%

H_theta=X*theta;
temp1=theta(1)-alpha*(1/m)*sum((H_theta-y).*X(:,1));
temp2=theta(2)-alpha*(1/m)*sum((H_theta-y).*X(:,2));
theta(1)=temp1;
theta(2)=temp2;

% ============================================================

% Save the cost J in every iteration
J_history(iter) = computeCost(X, y, theta);

end

end
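
For reference, the update that the loop body implements can be written out as follows. This is a sketch of the standard batch gradient descent rule for linear regression with hypothesis h_theta(x) = theta' * x; the 0-based index j is notation assumed here, and it corresponds to theta(1) and theta(2) in the code.

```latex
% Element-wise form: both parameters are updated simultaneously,
% which is why the code stores temp1 and temp2 before assigning.
\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m}
    \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)},
    \qquad j = 0, 1

% Vectorized form: X is the m-by-2 design matrix, y the m-by-1 target vector.
\theta := \theta - \frac{\alpha}{m} X^{T} \left( X\theta - y \right)
```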

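The temp1/temp2 updates inside the loop can also be written as a single vectorized line (H_theta=X*theta; must still be recomputed on every iteration):

theta = theta - (alpha/m) * X' * (H_theta - y);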
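
A quick way to convince yourself that the element-wise and vectorized updates agree is to run one step of each on a small data set. The sketch below uses made-up data (the X and y values are hypothetical, not the assignment's ex1data1.txt), and the names theta_loop and theta_vec are introduced only for this comparison.

```octave
% Hypothetical toy data, for illustration only
X = [ones(5, 1), (1:5)'];      % design matrix with an intercept column
y = [2; 4; 6; 8; 10];          % targets
theta = zeros(2, 1);           % starting parameters
alpha = 0.01;                  % learning rate
m = length(y);                 % number of training examples

H_theta = X * theta;           % predictions for the current theta

% Element-wise update, as in the loop body above
temp1 = theta(1) - alpha * (1/m) * sum((H_theta - y) .* X(:, 1));
temp2 = theta(2) - alpha * (1/m) * sum((H_theta - y) .* X(:, 2));
theta_loop = [temp1; temp2];

% Vectorized update
theta_vec = theta - (alpha / m) * X' * (H_theta - y);

% Largest absolute difference between the two results
max_diff = max(abs(theta_loop - theta_vec));
disp(max_diff);
```

The vectorized form also works unchanged when X has more than two columns, which the element-wise version does not.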

fprintf('\nRunning Gradient Descent ...\n')
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');
fprintf(' -3.6303\n  1.1664\n\n');

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

Running Gradient Descent ...
-3.630291
1.166362
Expected theta values (approx)
-3.6303
1.1664

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] *theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;
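
The two predictions can also be computed in one matrix product. This is a minimal sketch that assumes the theta values printed in the run above; populations, X_new and profits are illustrative names, not part of the assignment code.

```octave
% theta as printed by the gradient descent run above
theta = [-3.630291; 1.166362];

% Populations in units of 10,000 people (35,000 and 70,000)
populations = [3.5; 7];

% One row per query, with the leading 1 for the intercept term
X_new = [ones(size(populations)), populations];

% Model output is in units of $10,000, so scale by 10000 as in the script
profits = X_new * theta * 10000;

fprintf('%.2f\n', profits);
```

With these rounded theta values the results come out near 4,520 and 45,342, close to what the script prints from the unrounded theta.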

2.4 Visualizing the Cost Function

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100); % range of theta0 values covered by the plot
theta1_vals = linspace(-1, 4, 100);   % range of theta1 values covered by the plot

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals: compute the cost at every (theta0, theta1) grid point
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end

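% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped.
% Why transpose? surf expects the columns of Z to follow the X values and the
% rows to follow the Y values, but J_vals(i,j) was filled with theta0 along
% the rows and theta1 along the columns.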
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');



>> help surf
'surf' is a function from the file C:\Octave\OCTAVE~1.2\share\octave\4.2.2\m\plot\draw\surf.m

-- surf (X, Y, Z)
-- surf (Z)
-- surf (..., C)
-- surf (..., PROP, VAL, ...)
-- surf (HAX, ...)
-- H = surf (...)
Plot a 3-D surface mesh.

The surface mesh is plotted using shaded rectangles.  The vertices
of the rectangles [X, Y] are typically the output of 'meshgrid'.
over a 2-D rectangular region in the x-y plane.  Z determines the
height above the plane of each vertex.  If only a single Z matrix
is given, then it is plotted over the meshgrid 'X = 1:columns (Z),
Y = 1:rows (Z)'.  Thus, columns of Z correspond to different X
values and rows of Z correspond to different Y values.

The color of the surface is computed by linearly scaling the Z
values to fit the range of the current colormap.  Use 'caxis'
and/or change the colormap to control the appearance.

Optionally, the color of the surface can be specified independently
of Z by supplying a color matrix, C.

Any property/value pairs are passed directly to the underlying
surface object.

If the first argument HAX is an axes handle, then plot into this
axes, rather than the current axes returned by 'gca'.

The optional return value H is a graphics handle to the created
surface object.

Note: The exact appearance of the surface can be controlled with
the 'shading' command or by using 'set' to control surface object
properties.

See also: surface, meshgrid, hidden, shading, colormap, caxis.
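
The help text states that columns of Z correspond to X values and rows to Y values. The sketch below illustrates what that means for a matrix filled as Z(i,j), the way J_vals is filled. The function f and the small grid sizes are made up for illustration; only meshgrid and basic matrix operations are used.

```octave
x_vals = linspace(-10, 10, 5);   % 5 points along the x axis
y_vals = linspace(-1, 4, 3);     % 3 points along the y axis
f = @(x, y) x.^2 + y;            % hypothetical stand-in for the cost function

% Fill Z the same way J_vals is filled: first index follows x
Z = zeros(length(x_vals), length(y_vals));
for i = 1:length(x_vals)
    for j = 1:length(y_vals)
        Z(i, j) = f(x_vals(i), y_vals(j));
    end
end

% meshgrid puts x along the columns and y along the rows
[XX, YY] = meshgrid(x_vals, y_vals);

disp(size(Z));    % rows follow x, columns follow y
disp(size(XX));   % rows follow y, columns follow x

% After transposing, Z lines up with the meshgrid layout that surf expects
Zt = Z';
max_diff = max(max(abs(Zt - f(XX, YY))));
disp(max_diff);
```

Without the transpose, Z and the meshgrid matrices do not even have the same shape unless the grid is square, as it is in the assignment (100 by 100), which is why the mistake there shows up as flipped axes rather than an error.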

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
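
The contour levels come from logspace(-2, 3, 20). A short check of how many levels that call produces and what range they span (the variable name levels is introduced here for illustration):

```octave
% Same arguments as in the contour call above
levels = logspace(-2, 3, 20);

% Number of contour levels
disp(numel(levels));

% Smallest and largest level: 10^-2 and 10^3
disp([levels(1), levels(end)]);

% Consecutive levels differ by a constant factor, not a constant step
ratios = levels(2:end) ./ levels(1:end-1);
disp(max(ratios) - min(ratios));
```

Logarithmic spacing places many contours close to the minimum, where the cost changes slowly, and few far away from it, where the cost grows quickly.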

