Classification
y ∈ {0, 1}    0: "Negative Class"    1: "Positive Class"
Logistic Regression: 0 <= h(x) <= 1
Hypothesis (hypothesis function)
- Sigmoid function (logistic function): g(z) = 1 / (1 + e^(-z))
- Hypothesis: h(x) = g(θ'x) = 1 / (1 + e^(-θ'x))
h(x) = estimated probability that y = 1 on input x, i.e. h(x) = P(y = 1 | x; θ)
P(y = 0 | x; θ) + P(y = 1 | x; θ) = 1
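As a quick sketch (in Python/NumPy rather than the course's Octave, with made-up θ and x values), the hypothesis is just the sigmoid applied to θ'x:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and one input (with intercept term x0 = 1).
theta = np.array([-1.0, 2.0])
x = np.array([1.0, 0.5])
h = sigmoid(theta @ x)  # h(x) = P(y = 1 | x; theta), always in (0, 1)
# P(y = 0 | x; theta) is simply 1 - h, so the two probabilities sum to 1.
```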
Cost function
If the linear regression cost function were reused here, the resulting J(θ) would be non-convex, so gradient descent is not guaranteed to converge to the global minimum.
- Cost function (per example):
y = 1: Cost(h(x), y) = -log(h(x))
y = 0: Cost(h(x), y) = -log(1 - h(x))
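A small numeric check of this cost (a Python sketch; the probability 0.9 is made up): a confident, correct prediction is penalized lightly, a confident, wrong one heavily.

```python
import numpy as np

def cost(h, y):
    """Per-example logistic cost: -log(h) if y == 1, else -log(1 - h)."""
    return -np.log(h) if y == 1 else -np.log(1 - h)

# If the model says P(y = 1) = 0.9:
low = cost(0.9, 1)   # correct and confident -> small penalty
high = cost(0.9, 0)  # wrong and confident   -> large penalty
```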
Method 1: Gradient Descent
Taking the partial derivative of J(θ) with respect to θ yields an update rule of the same form as linear regression's; only h(x) differs.
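The update above can be sketched as follows (Python/NumPy rather than Octave; the toy data and learning rate are invented for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha, iters):
    """Batch gradient descent for logistic regression.
    The update theta -= alpha * (1/m) * X' * (h - y) has the same form
    as in linear regression; only h = sigmoid(X * theta) differs."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = sigmoid(X @ theta)
        theta -= (alpha / m) * (X.T @ (h - y))
    return theta

# Invented 1-D toy data with an intercept column of ones.
X = np.c_[np.ones(4), [0.0, 0.25, 0.75, 1.0]]
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y, alpha=0.5, iters=5000)
preds = (sigmoid(X @ theta) >= 0.5).astype(int)
```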
Method 2: Advanced optimization
- Optimization algorithms:
Gradient descent
Conjugate gradient
BFGS
L-BFGS
- Example:
costFunction(): returns both the cost J(θ) and its gradient
optimset(): 'GradObj', 'on' tells the optimizer that a user-supplied gradient is available; 'MaxIter' sets the maximum number of iterations
fminunc(): unconstrained minimization function
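For comparison, SciPy's `minimize` plays the same role as `fminunc`: pass one function that returns both J(θ) and the gradient, and let a quasi-Newton method (BFGS here) choose the step sizes. A sketch with made-up, non-separable data:

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y):
    """Return the logistic cost J(theta) and its gradient together."""
    m = len(y)
    h = sigmoid(X @ theta)
    J = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    grad = X.T @ (h - y) / m
    return J, grad

# Invented toy data: intercept column plus one feature.
X = np.c_[np.ones(4), [0.0, 0.4, 0.5, 1.0]]
y = np.array([0.0, 1.0, 0.0, 1.0])
# jac=True means cost_function returns (cost, gradient),
# analogous to setting 'GradObj' to 'on' for fminunc.
res = minimize(cost_function, np.zeros(2), args=(X, y),
               jac=True, method='BFGS', options={'maxiter': 400})
theta = res.x
```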
Multi-class classification: One-vs-all
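One-vs-all can be sketched as: train one binary logistic classifier per class (class c against everything else), then predict the class whose classifier reports the highest probability. A Python sketch with an invented three-class toy set:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_vs_all(X, y, num_labels, alpha=0.5, iters=3000):
    """Fit one binary logistic regression per class via gradient descent."""
    m, n = X.shape
    all_theta = np.zeros((num_labels, n))
    for c in range(num_labels):
        yc = (y == c).astype(float)  # current class vs. all the others
        theta = np.zeros(n)
        for _ in range(iters):
            h = sigmoid(X @ theta)
            theta -= (alpha / m) * (X.T @ (h - yc))
        all_theta[c] = theta
    return all_theta

def predict_one_vs_all(all_theta, X):
    """Choose the class whose classifier outputs the highest probability."""
    return np.argmax(sigmoid(X @ all_theta.T), axis=1)

# Invented 1-D, three-class toy data (intercept column plus one feature).
X = np.c_[np.ones(6), [0.0, 0.1, 0.5, 0.6, 1.0, 1.1]]
y = np.array([0, 0, 1, 1, 2, 2])
all_theta = train_one_vs_all(X, y, 3)
preds = predict_one_vs_all(all_theta, X)
```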
Programming Exercise
Decide whether to admit a student based on their scores on two exams.
Training set: (exam 1 score, exam 2 score, admitted (boolean))
- Visualize the data
data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);
plotData(X, y);
function plotData(X, y)
pos = find(y == 1); % indices of examples with y == 1
neg = find(y == 0); % indices of examples with y == 0
plot(X(pos,1),X(pos,2),'k+','LineWidth',2,'MarkerSize',7);
hold on;
plot(X(neg,1),X(neg,2),'ko','MarkerFaceColor','y','MarkerSize',7);
end
- Implement costFunction
[m, n] = size(X); % m: number of examples (rows), n: number of features (columns)
X = [ones(m, 1) X]; % prepend a column of ones for the intercept term
initial_theta = zeros(n + 1, 1);
[cost, grad] = costFunction(initial_theta, X, y);
function g = sigmoid(z)
g = 1 ./ (1 + exp(-z)); % element-wise, so z may be a scalar, vector, or matrix
end
function [J, grad] = costFunction(theta, X, y)
m = length(y);
h = sigmoid(X * theta); % m x 1 vector of predictions (here 100 x 1)
J = (-1/m) * ((y' * log(h)) + (1 - y)' * log(1 - h));
grad = (1/m) * X' * (h - y); % each x_j in the formula is a column of X
end
- fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
- Plot the decision boundary
function plotDecisionBoundary(theta, X, y)
plotData(X(:,2:3), y); % plot the training set
hold on
if size(X, 2) <= 3 % no more than two features (plus the intercept)
plot_x = [min(X(:,2))-2, max(X(:,2))+2]; % range of the first feature, widened by 2 on each side
plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1)); % on the boundary θ'x = 0, i.e. θ1 + θ2*x2 + θ3*x3 = 0, so x3 = (-1/θ3)*(θ2*x2 + θ1); note plot_y is the y-axis coordinate, not the label y
plot(plot_x, plot_y);
legend('Admitted', 'Not admitted', 'Decision Boundary')
axis([30, 100, 30, 100]) % x- and y-axis limits
else
u = linspace(-1, 1.5, 50); % 1x50; as before, fix the grid of x-axis values
v = linspace(-1, 1.5, 50); % 1x50; grid of y-axis values
z = zeros(length(u), length(v)); % 50x50; z(i,j) holds the classifier output at grid point (u(i), v(j))
for i = 1:length(u)
for j = 1:length(v)
z(i,j) = mapFeature(u(i), v(j))*theta; % when the boundary is clearly non-linear, mapFeature maps the point (u(i), v(j)) to a 1x28 row of polynomial terms (1, x1, x2, x1^2, x1*x2, x2^2, ..., x1*x2^5, x2^6)
end
end
z = z'; % transpose z to match the x/y-axis ordering that contour expects
contour(u, v, z, [0, 0], 'LineWidth', 2) % draw the level-0 contour, i.e. the decision boundary; [0, 0] selects the single level z = 0
end
end
- Prediction and accuracy
prob = sigmoid([1 45 85] * theta);
p = predict(theta, X);
fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100); % mean of the 0/1 correctness vector gives the accuracy
function p = predict(theta, X)
m = size(X, 1);
p = zeros(m, 1);
p(sigmoid(X * theta) >= 0.5) = 1; % predict y = 1 when the estimated probability is at least 0.5
end