Slow convergence of gradient descent in the logistic regression assignment (ex2) of Andrew Ng's Machine Learning course

In ex2, section 1.2.3 "Learning parameters using fminunc", the built-in fminunc function finds the optimized θ quickly. Here I wanted to learn the parameters with gradient descent instead, using a learning rate of 0.00104 and 200,000 iterations, but convergence is still quite slow, and larger learning rates cause the cost to oscillate.
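For reference, the quantities being minimized are the standard logistic regression cost and gradient from the exercise, and each gradient descent step updates every parameter simultaneously:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}, \qquad J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)} \log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big]$$

$$\theta_j := \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\, x_j^{(i)}$$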

[Figure: cost J(θ) versus the number of iterations]

[Figure: final classification result (decision boundary)]

The result θ = [-7.617430, 0.066803, 0.060309] is still far from the expected answer θ = [-25.161, 0.206, 0.201].

The code is as follows:

%% ============= Part 3: Optimizing (gradient descent instead of fminunc) =============
%  The original exercise uses the built-in function fminunc to find the
%  optimal parameters theta; the fminunc call is kept below, commented out,
%  and replaced with plain gradient descent.

% %  Set options for fminunc
% options = optimset('GradObj', 'on', 'MaxIter', 400);
% 
% %  Run fminunc to obtain the optimal theta
% %  This function will return theta and the cost 
% [theta, cost] = ...
% 	fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Run gradient descent instead: learning rate 0.00104, 200,000 iterations
[theta, J_history] = gradientDescentMulti(X, y, initial_theta, 0.00104, 200000);

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

% Print theta to screen
fprintf('Cost at theta found by gradient descent: %f\n', J_history(end));
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');

% Plot Boundary
plotDecisionBoundary(theta, X, y);

% Put some labels 
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

    % Initialize some useful values
    m = length(y); % number of training examples
    J_history = zeros(num_iters, 1);

    for iter = 1:num_iters

        % ====================== YOUR CODE HERE ======================
        % Instructions: Perform a single gradient step on the parameter vector
        %               theta. 
        %
        % Hint: While debugging, it can be useful to print out the values
        %       of the cost function (costFunction) and gradient here.
        %

        % Accumulate the three partial derivatives of J(theta) over all m examples
        sum_1 = 0;
        sum_2 = 0;
        sum_3 = 0;
        for j = 1:m
            h = sigmoid(theta' * X(j,:)');           % prediction for example j
            sum_1 = sum_1 + (h - y(j)) * X(j,1);     % dJ/dtheta(1): intercept term
            sum_2 = sum_2 + (h - y(j)) * X(j,2);     % dJ/dtheta(2): exam 1 score
            sum_3 = sum_3 + (h - y(j)) * X(j,3);     % dJ/dtheta(3): exam 2 score
        end

        % Simultaneous update of all three parameters with learning rate alpha
        theta(1) = theta(1) - (alpha/m)*sum_1;
        theta(2) = theta(2) - (alpha/m)*sum_2;
        theta(3) = theta(3) - (alpha/m)*sum_3;

        % ============================================================

        % Save the cost J in every iteration    
        [J_history(iter),~] = costFunction(theta, X, y);

    end

end
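For comparison, here is a minimal vectorized sketch of the same update. The function name gradientDescentVectorized is made up for this post; it assumes the same sigmoid and costFunction helpers from the exercise and replaces the inner loop over examples with matrix operations.

function [theta, J_history] = gradientDescentVectorized(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTVECTORIZED Vectorized gradient descent for logistic regression (sketch).
%   Performs the same per-iteration update as gradientDescentMulti above,
%   but with matrix operations instead of an explicit loop over examples.

    m = length(y);                      % number of training examples
    J_history = zeros(num_iters, 1);

    for iter = 1:num_iters
        h = sigmoid(X * theta);         % m x 1 vector of predictions
        grad = (X' * (h - y)) / m;      % gradient of J(theta), one entry per parameter
        theta = theta - alpha * grad;   % simultaneous update of all parameters

        J_history(iter) = costFunction(theta, X, y);   % record the cost
    end

end

It is called exactly like the loop version, e.g. [theta, J_history] = gradientDescentVectorized(X, y, initial_theta, 0.00104, 200000); it computes the same iterates, only faster per iteration, so on its own it does not change the convergence behaviour.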

If anyone else has worked through this problem with gradient descent, I would like to compare notes.
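A plausible culprit for both the oscillation at larger learning rates and the remaining gap to the expected θ is the scale of the raw exam scores (roughly 30 to 100), which makes the intercept and the score weights need very different step sizes and forces a tiny α. Below is a minimal sketch of running the same gradientDescentMulti on normalized features and then mapping θ back to the original scale; the variable names mu, sigma, X_norm and the learning rate / iteration count are illustrative, not tuned.

% Normalize the two exam-score columns (column 1 of X is the intercept)
mu = mean(X(:, 2:3));                           % 1 x 2 vector of column means
sigma = std(X(:, 2:3));                         % 1 x 2 vector of column standard deviations
X_norm = X;
X_norm(:, 2:3) = (X(:, 2:3) - mu) ./ sigma;     % implicit expansion (R2016b+ / Octave)

% Gradient descent now tolerates a much larger learning rate (illustrative values)
[theta_norm, J_history] = gradientDescentMulti(X_norm, y, initial_theta, 1, 10000);

% Map theta back to the original feature scale so it is comparable to fminunc's answer
theta = zeros(3, 1);
theta(2:3) = theta_norm(2:3) ./ sigma';
theta(1) = theta_norm(1) - theta(2:3)' * mu';

After the mapping, the convergence plot and the plotDecisionBoundary call above can be reused unchanged.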
