1 Logistic Regression
1.1 Visualizing the data
Add the following code to plotData.m:
pos = find(y==1); neg = find(y==0);
plot(X(pos,1),X(pos,2),'k+','LineWidth',2,'MarkerSize',7);
%X(pos,1) and X(pos,2) pick out the two feature coordinates of the positive examples
%'k+' draws the positive examples as black + marks
%LineWidth: stroke width of the marker
%MarkerSize: size of the marker symbol, here the +
plot(X(neg,1),X(neg,2),'ko','MarkerFaceColor','y','MarkerSize',7);
%MarkerFaceColor: fill color of the marker
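The class split that the two find() calls perform can be mirrored in NumPy; below is a minimal sketch with made-up data (the matplotlib calls are left as comments, since only the indexing is being illustrated):

```python
import numpy as np

# Hypothetical rows standing in for the two feature columns; not the course data.
X = np.array([[34.6, 78.0], [60.2, 86.3], [79.0, 75.3], [30.3, 43.9]])
y = np.array([0, 1, 1, 0])

# Equivalent of Octave's find(y==1) and find(y==0): row indices of each class.
pos = np.where(y == 1)[0]
neg = np.where(y == 0)[0]

# Plotting would then mirror the two plot() calls, e.g. with matplotlib:
#   plt.plot(X[pos, 0], X[pos, 1], 'k+', markersize=7)
#   plt.plot(X[neg, 0], X[neg, 1], 'ko', markerfacecolor='y', markersize=7)
print(pos, neg)  # [1 2] [0 3]
```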
1.2 Implementation
1.2.1 Warmup exercise: sigmoid function
The hypothesis is defined as:
h_θ(x) = g(θᵀx)
where g is the sigmoid function:
g(z) = 1 / (1 + e^(-z))
Add the following code to sigmoid.m:
g = 1 ./(1+exp(-1*z));
or, equivalently:
g = (1+exp(-1*z)).^(-1);
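A quick cross-check of the sigmoid in NumPy (a sketch, not part of the assignment code): g(0) is exactly 0.5, and g saturates toward 1 and 0 for large positive and negative inputs.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(z, dtype=float)))

print(sigmoid(0.0))                       # 0.5
print(sigmoid(np.array([-10.0, 10.0])))   # close to 0 and close to 1
```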
1.2.2 Cost function and gradient
The formula for J(θ):
J(θ) = (1/m) * Σ_{i=1}^{m} [ -y^(i) * log(h_θ(x^(i))) - (1 - y^(i)) * log(1 - h_θ(x^(i))) ]
The gradient used by gradient descent:
∂J(θ)/∂θ_j = (1/m) * Σ_{i=1}^{m} (h_θ(x^(i)) - y^(i)) * x_j^(i)
Add the following code to costFunction.m:
h = sigmoid(X*theta);
S = y.*log(h)+(1-y).*log(1-h);
J = (-1/m)*sum(S(:));
grad = (1/m)*((h-y)'*X)';
%Note on computing J: mind the difference between * and .*; .* is the element-wise product and requires both operands to have the same dimensions.
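The same vectorized cost and gradient can be sketched in NumPy (function and variable names here are illustrative, not from the assignment). With theta all zeros, h is 0.5 everywhere, so J reduces to log(2) ≈ 0.6931 regardless of the data:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y):
    """Vectorized unregularized cost and gradient (illustrative names)."""
    m = len(y)
    h = sigmoid(X @ theta)                               # hypothesis, m-vector
    J = -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m
    grad = X.T @ (h - y) / m                             # same as (1/m)*((h-y)'*X)'
    return J, grad

# Two made-up rows with the intercept column of ones already prepended.
X = np.array([[1.0, 34.6, 78.0], [1.0, 60.2, 86.3]])
y = np.array([0.0, 1.0])
J, grad = cost_function(np.zeros(3), X, y)
print(round(J, 4))  # 0.6931, i.e. log(2)
```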
Explanation:
(1) The data (first ten rows shown; the first two columns are the features, the third is the label y):
34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
60.18259938620976,86.30855209546826,1
79.0327360507101,75.3443764369103,1
45.08327747668339,56.3163717815305,0
61.10666453684766,96.51142588489624,1
75.02474556738889,46.55401354116538,1
76.09878670226257,87.42056971926803,1
84.43281996120035,43.53339331072109,1
The actual arguments passed in are listed below (initial_theta includes an entry for θ₀, and X has a leading column of ones for the intercept):
initial_theta =
0
0
0
X =
1.0000 34.6237 78.0247
1.0000 30.2867 43.8950
1.0000 35.8474 72.9022
1.0000 60.1826 86.3086
1.0000 79.0327 75.3444
1.0000 45.0833 56.3164
1.0000 61.1067 96.5114
1.0000 75.0247 46.5540
1.0000 76.0988 87.4206
1.0000 84.4328 43.5334
y =
0
0
0
1
1
0
1
1
1
1
Variables inside the function:
m is the number of training examples.
J is the cost.
grad is initialized to:
grad =
0
0
0
h = sigmoid(X*theta);% compute the hypothesis h
h is a 100*1 vector (every entry is 0.5 because theta is all zeros; first ten entries shown):
h =
0.5000
0.5000
0.5000
0.5000
0.5000
0.5000
0.5000
0.5000
0.5000
0.5000
S = y.*log(h)+(1-y).*log(1-h);
S is a 100*1 vector (first entries shown):
S =
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
-0.6931
J = (-1/m)*sum(S(:));
J = 0.6931
grad = (1/m)*((h-y)'*X)';% the gradient of J with respect to each theta_j
grad =
0.1000
12.0092
11.2628
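A useful sanity check on the gradient formula is to compare it against centered finite differences of the cost. A NumPy sketch with a small made-up dataset and a non-zero theta (so the gradient is non-trivial):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def grad(theta, X, y):
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

# Small made-up dataset, not the course data.
X = np.array([[1.0, 2.0, 3.0], [1.0, -1.0, 0.5], [1.0, 0.3, -2.0]])
y = np.array([1.0, 0.0, 1.0])
theta = np.array([0.1, -0.2, 0.05])

# Centered finite differences, one coordinate at a time.
eps = 1e-6
num = np.array([(cost(theta + eps * e, X, y) - cost(theta - eps * e, X, y)) / (2 * eps)
                for e in np.eye(3)])
diff = np.max(np.abs(num - grad(theta, X, y)))
print(diff)  # tiny: the analytic gradient matches the numerical one
```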
1.2.3 Learning parameters using fminunc
Just submit; fminunc does the optimization, so all that matters is that costFunction is correct.
1.2.4 Evaluating logistic regression
Fill in the following code in predict.m:
p_Matric=sigmoid(X*theta);% compute sigmoid(X*theta), an m*1 vector of probabilities
pos=find(p_Matric>0.5);% find the positions where the probability exceeds 0.5
p(pos,:)=1;% set those predictions to 1
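The thresholding in predict.m can be sketched in NumPy (names are illustrative); the whole pattern reduces to comparing sigmoid(X*theta) with 0.5:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, X):
    """Label 1 where sigmoid(X @ theta) > 0.5, else 0 (illustrative names)."""
    return (sigmoid(X @ theta) > 0.5).astype(int)

X = np.array([[1.0, 2.0], [1.0, -2.0]])   # made-up rows, intercept included
theta = np.array([0.0, 1.0])
print(predict(theta, X))  # [1 0]
```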
2 Regularized logistic regression
2.1 Visualizing the data
2.2 Feature mapping
2.3 Cost function and gradient
The regularized cost function:
J(θ) = (1/m) * Σ_{i=1}^{m} [ -y^(i) * log(h_θ(x^(i))) - (1 - y^(i)) * log(1 - h_θ(x^(i))) ] + (λ/(2m)) * Σ_{j=1}^{n} θ_j²
The gradient:
∂J(θ)/∂θ_0 = (1/m) * Σ_{i=1}^{m} (h_θ(x^(i)) - y^(i)) * x_0^(i)                    (for j = 0)
∂J(θ)/∂θ_j = (1/m) * Σ_{i=1}^{m} (h_θ(x^(i)) - y^(i)) * x_j^(i) + (λ/m) * θ_j      (for j ≥ 1)
Fill in the code:
h = sigmoid(X*theta);
t = theta(2:length(theta),1);
S = y.*log(h)+(1-y).*log(1-h);
J = (-1/m)*sum(S(:))+(lambda/(2*m))*sum(t.^2);
grad_1 = (1/m)*((h-y)'*X)';
grad_2 = (1/m)*((h-y)'*X)'+(lambda/m)*theta;
grad = [grad_1(1,1);grad_2(2:length(theta),1)];
Explanation:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
X contains 118 training examples; the first rows of the two raw feature columns (before feature mapping) are:
X =
0.0513 0.6996
-0.0927 0.6849
-0.2137 0.6923
-0.3750 0.5022
-0.5132 0.4656
-0.5248 0.2098
-0.3980 0.0344
-0.3059 -0.1923
0.0167 -0.4042
0.1319 -0.5139
y holds the actual labels (first entries shown):
y =
1
1
1
1
1
1
1
1
1
1
1
After mapFeature, X becomes a 118*28 matrix.
theta is a 28*1 vector.
lambda = 1;
h = sigmoid(X*theta);
h is a 118*1 vector.
t = theta(2:length(theta),1);
t is a 27*1 vector: theta with its first entry (θ₀) dropped.
Computing grad:
grad_1 = (1/m)*((h-y)'*X)';
grad_2 = (1/m)*((h-y)'*X)'+(lambda/m)*theta;
grad = [grad_1(1,1);grad_2(2:length(theta),1)];
Note how θ₀ is handled in grad: in the two formulas above, the regularization term only appears for j ≥ 1, so grad(1) (the partial derivative for θ₀) is taken from the unregularized grad_1, while the remaining entries come from grad_2.
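The j = 0 special case can be sketched end-to-end in NumPy (made-up data, not the course's mapFeature output): the penalty sums theta[1:] only, and only grad[1:] receives the (lambda/m)*theta term. At theta = 0 the penalty vanishes and J is again log(2):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function_reg(theta, X, y, lam):
    """Regularized cost and gradient; theta[0] is left out of the penalty."""
    m = len(y)
    h = sigmoid(X @ theta)
    J = (-np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m
         + (lam / (2 * m)) * np.sum(theta[1:] ** 2))   # penalty skips theta_0
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]                  # regularize j >= 1 only
    return J, grad

# Made-up 2x3 design matrix (ones column plus two features).
X = np.array([[1.0, 0.05, 0.70], [1.0, -0.09, 0.68]])
y = np.array([1.0, 1.0])
J, g = cost_function_reg(np.zeros(3), X, y, lam=1.0)
print(round(J, 4))  # 0.6931: penalty is zero at theta = 0
```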