References:
http://www.holehouse.org/mlclass/06_Logistic_Regression.html
http://blog.csdn.net/abcjennifer/article/details/7716281
When minimizing its cost function, binary logistic regression uses a gradient descent update of exactly the same form as linear regression:

theta_j := theta_j - alpha * (1/m) * sum_{i=1..m} (h(x^(i)) - y^(i)) * x_j^(i)

The only difference is the hypothesis h inside: for linear regression h(x) = theta' * x, while for logistic regression h(x) = sigmoid(theta' * x) = 1 / (1 + e^(-theta' * x)).
Since the form is identical, the gradient descent loop is also the same as for linear regression, so I verified this by reusing the code from exercise 2 directly.
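For completeness, the cost being minimized for logistic regression (from the referenced notes) is J(theta) = -(1/m) * sum(y .* log(h) + (1-y) .* log(1-h)), and its gradient has exactly the linear regression form. A minimal Octave sketch of both, my own addition rather than part of the exercise 2 code:

% Logistic regression cost and gradient (save as cost_and_grad.m).
% x is m-by-n with a leading column of ones, y is m-by-1 in {0,1}.
function [J, grad] = cost_and_grad(theta, x, y)
  m = length(y);
  h = 1 ./ (1 + exp(-x * theta));   % hypothesis: sigmoid(x * theta)
  J = -(1/m) * sum(y .* log(h) + (1 - y) .* log(1 - h));
  grad = (1/m) * x' * (h - y);      % identical in form to linear regression
endfunction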
Suppose the line x2 = 3 + 1.5*x1 is used to generate data: a point above the line is labeled 1, a point below it is labeled 0 (the labeling rule is also sketched in code after the data). This gives the following samples:
x data (two columns: x1, x2):
1 4
1 4.4
1 4.6
1 5
2 5.5
2 5.8
2 6.2
2 6.3
3 7.3
3 7.7
4 8.7
4 9.3
y data:
0
0
1
1
0
0
1
1
0
1
0
1
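As a sketch of that labeling rule (the points themselves were picked by hand, so this only reproduces y from x.dat, not x itself):

% Derive the labels from the generating line x2 = 3 + 1.5*x1
x = load('x.dat');                     % columns: x1, x2
y = double(x(:,2) > 3 + 1.5*x(:,1));   % 1 above the line, 0 below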
Code:
% The line used to generate the data: x2 = 3 + 1.5*x1
clear all; close all; clc

function z = sigmoid(v)
  z = 1 ./ (1 + exp(-v));
endfunction

x = load('x.dat');
y = load('y.dat');
m = length(y);                 % number of training examples
x = [ones(m, 1) x];            % add a column of ones to x (intercept term)
theta = zeros(size(x(1,:)))';  % initialize fitting parameters

MAX_ITR = 1500;
alpha = 0.06;

x       % echo the inputs and the initial theta
y
theta

for num_iterations = 1:MAX_ITR
    % Vectorized gradient descent update; the summation form from the
    % videos would work too. The only change from the linear regression
    % code of exercise 2 is the sigmoid() around x * theta.
    grad = (1/m) .* x' * (sigmoid(x * theta) - y);
    theta = theta - alpha .* grad;
    % Sequential update: the wrong way to do gradient descent, because
    % theta(1) is changed before theta(2)'s gradient is computed
    % (theta(3) would need the same treatment):
    % grad1 = (1/m) .* x(:,1)' * (sigmoid(x * theta) - y);
    % theta(1) = theta(1) - alpha * grad1;
    % grad2 = (1/m) .* x(:,2)' * (sigmoid(x * theta) - y);
    % theta(2) = theta(2) - alpha * grad2;
end
theta

% Raw scores theta' * x; the sign gives the class (sigmoid(z) > 0.5 iff z > 0)
predict1 = [1, 1, 3.5] * theta
predict2 = [1, 1, 4.7] * theta
predict3 = [1, 2, 6.1] * theta
As you can see, the only change on top of the exercise 2 code is that, inside the loop, the product of x and theta is wrapped in sigmoid.
Output:
theta =
-2.6180
-1.7516
1.0481
predict1 = -0.70114
predict2 = 0.55663
predict3 = 0.27238
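Note that these predict values are raw scores theta' * x, not probabilities: sigmoid(z) > 0.5 exactly when z > 0, so the sign alone gives the class. Running them through the sigmoid defined above gives the estimated probability of y = 1 (approximate values, my own calculation):

sigmoid(-0.70114)   % ~0.33 -> class 0; (1, 3.5) lies below the line
sigmoid( 0.55663)   % ~0.64 -> class 1; (1, 4.7) lies above the line
sigmoid( 0.27238)   % ~0.57 -> class 1; (2, 6.1) lies above the line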
Increasing the number of iterations makes the predictions more confident.
Output with MAX_ITR increased to 15000:
theta =
-16.1457
-8.3833
5.5103
predict1 = -5.2431
predict2 = 1.3692
predict3 = 0.70026
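Two quick sanity checks of my own. First, running the new scores through sigmoid shows the probabilities moving much closer to 0 and 1, which is what makes the larger run more convincing; second, the learned theta can be solved back into a boundary line and compared with the true one:

% Probabilities at 15000 iterations (vs. 1500):
sigmoid(-5.2431)    % ~0.005 (was ~0.33)
sigmoid( 1.3692)    % ~0.80  (was ~0.64)
sigmoid( 0.70026)   % ~0.67  (was ~0.57)

% Decision boundary theta(1) + theta(2)*x1 + theta(3)*x2 = 0, solved for x2:
intercept = -theta(1) / theta(3)   % ~2.93 (true value: 3)
slope     = -theta(2) / theta(3)   % ~1.52 (true value: 1.5)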