Logistic Regression and Newton's Method Exercise

Try_You_Can

Article link: https://blog.csdn.net/u010457543/article/details/48054517

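This post explains the principle of logistic regression and how to implement it. The theory follows this reference: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html. The training set contains 80 samples, each giving a student's scores on two exams and whether that student was admitted to college. Given the two scores of a new student, the task is to predict whether the student will be admitted. This is a classification problem, and it is solved here with logistic regression.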

Logistic Regression

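Anyone familiar with linear regression knows the cost function: the parameters we are looking for are generally the ones that minimize it. Logistic regression is no exception. Because this is a classification problem, the target value can only be 0 or 1, that is,

$$y \in \{0, 1\}$$

The prediction (hypothesis) function is therefore

$$h_\theta(x) = g(\theta^T x)$$

where $g$ takes the form of the logistic function:

$$g(z) = \frac{1}{1 + e^{-z}}$$

For a given feature vector $x$, the probability that $y = 1$, and correspondingly that $y = 0$, is:

$$P(y = 1 \mid x; \theta) = h_\theta(x), \qquad P(y = 0 \mid x; \theta) = 1 - h_\theta(x)$$

These two expressions can be written as a single one:

$$P(y \mid x; \theta) = h_\theta(x)^{y} \, \bigl(1 - h_\theta(x)\bigr)^{1 - y}$$

The cost function is then the negative average log-likelihood over the $m$ training samples:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \Bigl[ -y^{(i)} \log h_\theta(x^{(i)}) - \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]$$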
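As a minimal sketch of how the cost function can be evaluated, the following MATLAB/Octave snippet computes $J(\theta)$ on a tiny hand-made data set. The matrix `X`, the labels `y`, and the starting value of `theta` are illustrative assumptions and are not the ex4 data.

```matlab
% Sketch: evaluate the logistic regression cost on toy data (all values are made up).
g = @(z) 1.0 ./ (1.0 + exp(-z));       % logistic (sigmoid) function

X = [1 1 2; 1 2 1; 1 3 3];             % hypothetical design matrix, first column is the intercept
y = [0; 0; 1];                         % hypothetical labels
theta = zeros(3, 1);                   % all-zero starting point

m = size(X, 1);
h = g(X * theta);                      % predicted P(y = 1 | x; theta), m-by-1
J = (1 / m) * sum(-y .* log(h) - (1 - y) .* log(1 - h));
```

With $\theta = 0$ every prediction equals 0.5, so $J = \log 2$ whatever the labels are. This is a convenient sanity check for the first iteration.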

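Our goal is to find the parameters $\theta$ that minimize the cost function. When Newton's method is used, the update rule is:

$$\theta^{(t+1)} = \theta^{(t)} - H^{-1} \nabla_\theta J$$

where $\nabla_\theta J$ is the gradient of $J$ and $H$ is its Hessian, both evaluated at $\theta^{(t)}$.

For the first-order partial derivatives we have:

$$\frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr) \, x_j^{(i)}$$

so that:

$$\nabla_\theta J = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr) \, x^{(i)}$$

The second-order partial derivatives are:

$$\frac{\partial^2 J}{\partial \theta_j \, \partial \theta_k} = \frac{1}{m} \sum_{i=1}^{m} h_\theta(x^{(i)}) \bigl(1 - h_\theta(x^{(i)})\bigr) \, x_j^{(i)} x_k^{(i)}$$

so that:

$$H = \frac{1}{m} \sum_{i=1}^{m} h_\theta(x^{(i)}) \bigl(1 - h_\theta(x^{(i)})\bigr) \, x^{(i)} \bigl(x^{(i)}\bigr)^T$$

With the cost function and the update rule above, the optimal $\theta$ can be computed.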
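The gradient and Hessian above can also be written in matrix form, which avoids a loop over the samples. The snippet below is a sketch of a single Newton step under that formulation. The toy `X` and `y` are made up for illustration, and solving with `H \ grad` instead of forming an explicit inverse is a design choice of this sketch, not something prescribed by the exercise.

```matlab
% Sketch: one vectorized Newton step for logistic regression (toy data, made up).
g = @(z) 1.0 ./ (1.0 + exp(-z));       % logistic (sigmoid) function

X = [1 1 2; 1 2 1; 1 3 3; 1 4 2];      % hypothetical design matrix with intercept column
y = [0; 1; 0; 1];                      % hypothetical labels
theta = zeros(3, 1);                   % current estimate

m = size(X, 1);
h = g(X * theta);                      % m-by-1 predictions
grad = (1 / m) * X' * (h - y);         % gradient of J
H = (1 / m) * X' * diag(h .* (1 - h)) * X;   % Hessian of J
theta_new = theta - H \ grad;          % Newton update
```

`diag(h .* (1 - h))` builds an m-by-m matrix, which is fine for a data set of this size. For large m, scaling the rows of `X` directly would avoid that allocation.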

% Logistic regression fitted with Newton's method

clc, clear, close all;
x = load('ex4x.dat');                       % exam scores, one row per student
y = load('ex4y.dat');                       % labels: 1 = admitted, 0 = not admitted

x = [ones(size(x, 1), 1), x];               % prepend the intercept column
pos = find(y == 1);
neg = find(y == 0);
[m, n] = size(x);

% Newton's Method
g = @(z) 1.0 ./ (1.0 + exp(-z));            % sigmoid as an anonymous function (inline is deprecated)

theta = zeros(n, 1);                        % parameters
thetaTemp = theta;
iter = 1;                                   % iteration number
J_theta = 0;                                % cost at the previous iteration
figure,
while(1)
    H = zeros(n, n);
    h_theta = g((thetaTemp' * x'));         % 1 * m
    % cost function J (negative average log-likelihood)
    J_thetaTemp = -(1 / m) .* sum(y .* log(h_theta') + (1 - y) .* log(1 - h_theta'));
    hold on; plot(iter, J_thetaTemp, '--o', 'LineWidth',2, 'MarkerFaceColor','r')
    % stop once the cost changes by less than the tolerance
    if (abs(J_theta - J_thetaTemp) < 0.00000001)
        theta = thetaTemp
        iter
        break;
    end
    
    % Hessian of J: (1/m) * sum_i h_i * (1 - h_i) * x_i * x_i'
    temp = 1 / m * (h_theta .* (1 - h_theta));
    for i = 1 : m
        H = H + temp(i) .* x(i, :)' * x(i, :);
    end
    % gradient of J: (1/m) * x' * (h - y)
    delta_J = (1 / m) .* x' * (h_theta' - y);
    
    J_theta = J_thetaTemp;
    thetaTemp = thetaTemp - pinv(H) * delta_J;   % Newton update
    iter = iter + 1;
end
xlabel('Iteration');
ylabel('J_value');

% the probability that a student with
% a score of 20 on Exam 1 and a score
% of 80 on Exam 2 will not be admitted
prob = 1 - g([1, 20, 80] * theta)

% plot decision boundary
% theta3 * x2 + theta2 * x1 + theta1 = 0;
score1 = x(:, 2);
score2 = -(score1 * theta(2) + theta(1)) ./ theta(3);
figure;
plot(x(pos, 2), x(pos, 3), '+');
hold on; 
plot(x(neg, 2), x(neg, 3), 'o');
hold on; 
plot(score1, score2);
xlabel('Exam 1 score');
ylabel('Exam 2 score');
legend('Admitted', 'Not admitted', 'Decision boundary');
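
Once the parameters are known, classifying a new student comes down to comparing the predicted probability with 0.5, which is the same as checking the sign of $\theta^T x$ (the decision boundary plotted by the script). The snippet below is a sketch of that step. The `theta` values are placeholders chosen for illustration, not the fitted result, and the variable names `student`, `p_admit`, and `admitted` are hypothetical.

```matlab
% Sketch: classify a new student with a fitted model (theta values are placeholders).
g = @(z) 1.0 ./ (1.0 + exp(-z));       % logistic (sigmoid) function

theta = [-16; 0.15; 0.16];             % hypothetical parameters, NOT the fitted result
student = [1, 20, 80];                 % intercept term, Exam 1 score, Exam 2 score

p_admit = g(student * theta);          % P(y = 1 | x; theta)
admitted = p_admit >= 0.5;             % equivalent to student * theta >= 0
```

The probability of not being admitted, which the script prints as `prob`, is simply `1 - p_admit`.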

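[Output: the fitted θ parameters and the number of iterations]

[Figure: cost function value versus iteration number]

[Figure: decision boundary]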
