ex2 1.2.3 Learning parameters using fminunc
Using fminunc finds the optimized θ fairly quickly. Here, however, I wanted to use gradient descent instead: with a learning rate of 0.00104 and 200,000 iterations it still converges slowly, and other learning rates make the cost oscillate.
The cost function J(θ) versus the number of iterations, and the final classification result:
Final classification result:
The result is θ = [-7.617430 0.066803 0.060309], still some way off from the reference answer θ = [-25.161 0.206 0.201].
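The gap is most likely due to the unscaled features: the two exam scores run from roughly 0 to 100, which forces a tiny learning rate and very slow progress along the intercept direction. A common remedy is feature normalization. Below is a minimal sketch of that idea (mu, sigma, X_norm, theta_norm, and J_hist are my own names, and it assumes X already carries a leading column of ones); note that the learned θ then lives in the normalized feature space, so it will not match the reference θ numerically without converting back.

mu = mean(X(:, 2:end));
sigma = std(X(:, 2:end));
X_norm = X;
X_norm(:, 2:end) = (X(:, 2:end) - mu) ./ sigma;   % mean-normalize each feature
% With normalized features a much larger alpha (e.g. 1) should converge in a
% few hundred iterations instead of 200,000.
[theta_norm, J_hist] = gradientDescentMulti(X_norm, y, initial_theta, 1, 400);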
The code is as follows:
%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.
% % Set options for fminunc
% options = optimset('GradObj', 'on', 'MaxIter', 400);
%
% % Run fminunc to obtain the optimal theta
% % This function will return theta and the cost
% [theta, cost] = ...
% fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
% Gradient descent in place of fminunc (alpha = 0.00104, 200,000 iterations)
[theta, J_history] = gradientDescentMulti(X, y, initial_theta, 0.00104, 200000);
% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');
% Print theta to screen
fprintf('Cost at theta found by gradient descent: %f\n', J_history(end));
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');
% Plot Boundary
plotDecisionBoundary(theta, X, y);
% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
% theta = GRADIENTDESCENTMULTI(X, y, theta, alpha, num_iters) updates theta by
% taking num_iters gradient steps with learning rate alpha
% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
% ====================== YOUR CODE HERE ======================
% Instructions: Perform a single gradient step on the parameter vector
% theta.
%
% Hint: While debugging, it can be useful to print out the values
% of the cost function (costFunction) and gradient here.
%
sum_1 = 0;
sum_2 = 0;
sum_3 = 0;
% Accumulate the gradient for each of the three parameters
for j = 1:m
    h_j = sigmoid( X(j,:) * theta );   % hypothesis for example j, computed once
    sum_1 = sum_1 + ( h_j - y(j) ) * X(j,1);
    sum_2 = sum_2 + ( h_j - y(j) ) * X(j,2);
    sum_3 = sum_3 + ( h_j - y(j) ) * X(j,3);
end
% Simultaneous update of all three parameters
theta(1) = theta(1) - (alpha/m)*sum_1;
theta(2) = theta(2) - (alpha/m)*sum_2;
theta(3) = theta(3) - (alpha/m)*sum_3;
% ============================================================
% Save the cost J in every iteration
[J_history(iter),~] = costFunction(theta, X, y);
end
end
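For what it's worth, the same simultaneous update can be written without the inner loop. Here is a vectorized sketch (the function name gradientDescentVec is my own; it assumes the same sigmoid and costFunction helpers as above):

function [theta, J_history] = gradientDescentVec(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTVEC Vectorized gradient descent for logistic regression
m = length(y);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    h = sigmoid(X * theta);                      % m x 1 vector of predictions
    theta = theta - (alpha/m) * (X' * (h - y));  % one simultaneous update
    J_history(iter) = costFunction(theta, X, y); % record the cost each step
end
end

This produces the same sequence of θ values as the loop version, just faster, so the slow convergence itself is unchanged.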
If anyone else has tackled this exercise with gradient descent, I'd be glad to compare notes.