This is a programming exercise from UFLDL (Softmax Regression).
The cost function and gradient after adding weight decay (Softmax regression has an unusual property: it has a "redundant" set of parameters):
- cost function:

$$J(\theta) = -\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\left\{y^{(i)}=j\right\}\log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}\right] + \frac{\lambda}{2}\sum_{i=1}^{k}\sum_{j=0}^{n}\theta_{ij}^2$$
- gradient function:

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[x^{(i)}\left(1\left\{y^{(i)}=j\right\} - p\left(y^{(i)}=j \mid x^{(i)}; \theta\right)\right)\right] + \lambda\theta_j$$
$p(y^{(i)}=j \mid x^{(i)}; \theta)$ is exactly the hypothesis $h$ computed in Step 2 of the UFLDL exercise.
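For reference, the per-class formulas above can be collected into matrix form (this restatement is mine, but it matches the vectorized solution code below; $G_{ji} = 1\{y^{(i)}=j\}$ is the ground-truth indicator matrix and $X = [x^{(1)}, \ldots, x^{(m)}]$):

$$h_{ji} = \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}, \qquad \nabla_{\theta} J(\theta) = -\frac{1}{m}\,(G - h)\,X^T + \lambda\theta$$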
Using bsxfun:
- To prevent overflow, simply subtract some large constant value from each of the $\theta_j^T x^{(i)}$ terms before computing the exponential:

```matlab
% M is the matrix as described in the text
M = bsxfun(@minus, M, max(M, [], 1));
```

- Use the following code to compute the hypothesis:

```matlab
% M is the matrix as described in the text
M = bsxfun(@rdivide, M, sum(M));
```
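Putting both steps together, here is a minimal self-contained sketch (the sizes of theta and data are made up for illustration):

```matlab
% Numerically stable softmax over the columns of M = theta * data.
theta = randn(4, 6);                     % 4 classes, 6 features (arbitrary sizes)
data  = randn(6, 10);                    % 10 examples, one per column
M = theta * data;                        % M(j, i) = theta_j' * x^(i)
M = bsxfun(@minus, M, max(M, [], 1));    % subtract each column's max: no overflow
h = exp(M);
h = bsxfun(@rdivide, h, sum(h));         % normalize each column to probabilities
disp(sum(h, 1));                         % every entry should print as 1
```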
Exercise solutions (try to complete them yourself first, then compare):
- softmaxCost.m:

```matlab
% groundTruth (k x m indicator matrix) and numCases are set up by the
% starter code earlier in the file.
M = theta * data;                        % M(l, i) = theta_l' * x^(i)
M = bsxfun(@minus, M, max(M, [], 1));    % subtract column max to prevent overflow
h = exp(M);
h = bsxfun(@rdivide, h, sum(h));         % h(j, i) = p(y^(i) = j | x^(i); theta)
cost = -1/numCases * sum(sum(groundTruth .* log(h))) + lambda/2 * sum(sum(theta.^2));
thetagrad = -1/numCases * ((groundTruth - h) * data') + lambda * theta;
```
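It is worth validating thetagrad with a finite-difference check. A minimal sketch, assuming the UFLDL starter signature [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels), with theta passed and grad returned in unrolled (vector) form:

```matlab
% Finite-difference gradient check on small random data (sizes are arbitrary).
numClasses = 3; inputSize = 5; lambda = 1e-4;
data   = randn(inputSize, 20);
labels = randi(numClasses, 20, 1);
theta  = 0.005 * randn(numClasses * inputSize, 1);

[~, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels);

epsilon = 1e-4;
numgrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta)); e(i) = epsilon;   % perturb one component at a time
    numgrad(i) = (softmaxCost(theta + e, numClasses, inputSize, lambda, data, labels) ...
                - softmaxCost(theta - e, numClasses, inputSize, lambda, data, labels)) / (2 * epsilon);
end
% Should be on the order of 1e-9 if the analytic gradient is correct.
fprintf('relative difference: %g\n', norm(numgrad - grad) / norm(numgrad + grad));
```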
- softmaxPredict.m:

```matlab
% max along dimension 1 returns [maxValues, rowIndices];
% the row index of the largest score is the predicted class.
[~, pred] = max(theta * data, [], 1);
```
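A typical usage check (mirroring the UFLDL exercise script, assuming labels holds the true classes):

```matlab
acc = mean(labels(:) == pred(:));        % fraction of correct predictions
fprintf('Accuracy: %0.3f%%\n', acc * 100);
```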