Article link: https://blog.csdn.net/qq_25491201/article/details/51289638

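Softmax Regression

This time, let's work through Softmax regression together. First, we download the starter code.

We will also use computeNumericalGradient.m, which we wrote for the sparse autoencoder exercise.

In this exercise we only need to modify two files: softmaxCost.m and softmaxPredict.m.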

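Step 0: Initialize constants and variables

The provided code already does this step for us.

Step 1: Load the data

We load the MNIST images and labels and store them as inputData and labels. The images are preprocessed so that pixel values lie between 0 and 1, and for convenience the label 0 is remapped to 10. The code for this is also already provided.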
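The remapping of label 0 to 10 matters because MATLAB indices start at 1, and softmaxCost uses each label as a row index when it builds the indicator matrix. A minimal sketch, using made-up labels rather than real MNIST data:

```matlab
% Illustrative labels only (not MNIST): digits 0..9 as they might come from a file
labels = [0; 3; 9; 0; 5];

% Remap digit 0 to class 10 so that every label is a valid 1-based index
labels(labels == 0) = 10;

% Build the indicator matrix the same way softmaxCost does:
% column j has a single 1 in row labels(j)
numCases = numel(labels);
groundTruth = full(sparse(labels, 1:numCases, 1));

disp(labels');            % remapped labels
disp(size(groundTruth));  % numClasses x numCases
```

Without the remapping, sparse() would reject the zero row index.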

Step 2: Implement softmaxCost

This is the step we have to complete ourselves: we need to compute the cost function and its gradient.
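For reference, the quantities computed in this step can be written out as follows. The notation is an assumption introduced here for illustration: m is numCases, K is numClasses, N is inputSize, x^{(i)} is column i of data, y^{(i)} is its label, \theta_k is row k of theta, and \lambda is lambda.

```latex
p_k^{(i)} = \frac{\exp\!\left(\theta_k^{\top} x^{(i)}\right)}{\sum_{l=1}^{K} \exp\!\left(\theta_l^{\top} x^{(i)}\right)}

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} 1\{y^{(i)} = k\} \, \log p_k^{(i)}
            + \frac{\lambda}{2} \sum_{k=1}^{K} \sum_{j=1}^{N} \theta_{kj}^{2}

\nabla_{\theta_k} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} x^{(i)} \left( 1\{y^{(i)} = k\} - p_k^{(i)} \right) + \lambda \theta_k
```

Stacking the indicator values into a K x m matrix and the probabilities into a matrix of the same shape turns the gradient into a single matrix product with the transposed data, which is the vectorized form used in the code.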

function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)

% numClasses - the number of classes 
% inputSize - the size N of the input vector
% lambda - weight decay parameter
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test set
% labels - an M x 1 matrix containing the labels corresponding for the input data
%

% Unroll the parameters from theta
theta = reshape(theta, numClasses, inputSize);

numCases = size(data, 2);
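% full() converts a sparse matrix into a full (dense) matrix. A sparse matrix
% stores only the nonzero entries to save memory, whereas a full matrix also
% stores the zeros; it is the ordinary MATLAB matrix and is easier to compute with.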
groundTruth = full(sparse(labels, 1:numCases, 1));
cost = 0;

thetagrad = zeros(numClasses, inputSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost and gradient for softmax regression.
%                You need to compute thetagrad and cost.
%                The groundTruth matrix might come in handy.
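% Sizes on the small debug problem:
% size(theta)       10x8
% size(data)        8x100
% size(groundTruth) 10x100
% size(thetagrad)   10x8

% Step 1: class scores, one column per example
M = theta*data;
% Step 2: subtract each column's maximum so that exp() cannot overflow numerically
M = bsxfun(@minus, M, max(M, [], 1));
% Step 3: exponentiate
M = exp(M);
% Normalize each column to obtain the probability matrix p
p = bsxfun(@rdivide, M, sum(M, 1));
% Cost: average cross-entropy; do not forget the weight decay term
cost = -sum(sum(groundTruth.*log(p)))/numCases + lambda*sum(theta(:).^2)/2;
% Gradient
thetagrad = (-1/numCases)*(groundTruth-p)*data' + lambda*theta;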
% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = [thetagrad(:)];
end
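To see why the column maximum is subtracted before calling exp(), here is a small sketch with deliberately large, made-up scores. Subtracting a constant from a whole column leaves the softmax probabilities unchanged but keeps exp() in range:

```matlab
% Made-up score matrix: the first column is large enough to overflow exp()
M = [1000 1;
     1001 2;
     1002 3];

% Naive softmax: exp(1000) is Inf, so the first column becomes Inf/Inf = NaN
naive  = exp(M);
naiveP = bsxfun(@rdivide, naive, sum(naive, 1));

% Stable softmax: shift each column so that its largest entry is 0
shifted = bsxfun(@minus, M, max(M, [], 1));
stableP = bsxfun(@rdivide, exp(shifted), sum(exp(shifted), 1));

disp(naiveP);
disp(stableP);
```

Both columns of stableP come out the same, because the two columns of M differ only by a constant offset.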

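Step 3: Gradient checking

The gradient check is meant to be run in debug mode on a small problem. Before moving on to the real run, we have to change Debug in SoftmaxExercise.m to false, meaning we are no longer debugging; otherwise the later steps will fail.

The gradient-checking code itself is already written for us in SoftmaxExercise.m.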
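As a rough illustration of what such a check does, the sketch below compares the analytic softmax gradient with a centered finite-difference estimate on a tiny made-up problem. The data, the step size epsilon = 1e-4, and the helper names probs and J are assumptions for this example; it does not reproduce computeNumericalGradient.m.

```matlab
% Tiny made-up problem: 3 classes, 2 features, 4 examples
numClasses = 3; inputSize = 2; lambda = 1e-4;
data   = [0.1 0.4 0.7 0.2;
          0.9 0.3 0.5 0.8];
labels = [1; 2; 3; 1];
numCases = size(data, 2);
G = full(sparse(labels, 1:numCases, 1));   % indicator matrix

% Hypothetical helpers: softmax probabilities and the regularized cost
probs = @(T) bsxfun(@rdivide, exp(T * data), sum(exp(T * data), 1));
J = @(t) -sum(sum(G .* log(probs(reshape(t, numClasses, inputSize))))) / numCases ...
         + lambda * sum(t .^ 2) / 2;

theta = 0.1 * (1:(numClasses * inputSize))';   % arbitrary test point

% Analytic gradient, same formula as in softmaxCost
T = reshape(theta, numClasses, inputSize);
analyticGrad = -(G - probs(T)) * data' / numCases + lambda * T;
analyticGrad = analyticGrad(:);

% Centered finite differences, one coordinate at a time
epsilon = 1e-4;
numGrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta));
    e(i) = epsilon;
    numGrad(i) = (J(theta + e) - J(theta - e)) / (2 * epsilon);
end

% Relative difference; a correct gradient should make this very small
relDiff = norm(numGrad - analyticGrad) / norm(numGrad + analyticGrad);
disp(relDiff);
```

A large relative difference usually points to a sign error, a missing 1/numCases factor, or a forgotten weight decay term.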

Step 4: Learning the parameters

The code that learns the parameters is already set up by default in SoftmaxExercise.m.
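The exercise hands the optimization to minFunc, as the comment at the end of softmaxCost indicates. Purely to illustrate how the cost and gradient drive learning, the sketch below swaps in plain gradient descent on a tiny made-up dataset. The step size alpha, the iteration count, and the data are assumptions for this example, not part of the exercise.

```matlab
% Tiny made-up problem: 3 classes, 2 features, 4 examples
data   = [0.1 0.4 0.7 0.2;
          0.9 0.3 0.5 0.8];
labels = [1; 2; 3; 1];
numClasses = 3; lambda = 1e-4;
alpha = 0.5;        % assumed step size
numIters = 200;     % assumed iteration count

numCases = size(data, 2);
G = full(sparse(labels, 1:numCases, 1));
theta = zeros(numClasses, size(data, 1));
costs = zeros(1, numIters);

for it = 1:numIters
    % Same cost and gradient computation as in softmaxCost
    M = theta * data;
    M = bsxfun(@minus, M, max(M, [], 1));
    p = bsxfun(@rdivide, exp(M), sum(exp(M), 1));
    costs(it) = -sum(sum(G .* log(p))) / numCases + lambda * sum(theta(:).^2) / 2;
    grad = -(G - p) * data' / numCases + lambda * theta;

    % Plain gradient descent update (stand-in for minFunc)
    theta = theta - alpha * grad;
end

disp([costs(1), costs(end)]);
```

With theta initialized to zero every class gets probability 1/3, so the first cost equals log(3); the cost then decreases as theta is updated.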

Step 5: Testing

We now evaluate the trained model on the MNIST test set. First, we fill in softmaxPredict.m:

function [pred] = softmaxPredict(softmaxModel, data)

% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test set
%
% Your code should produce the prediction matrix 
% pred, where pred(i) is argmax_c P(y(c) | x(i)).
 
% Unroll the parameters from theta
theta = softmaxModel.optTheta;  % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute pred using theta assuming that the labels start 
%                from 1.

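% p holds the raw scores theta*data, not probabilities. Within each column the
% softmax probabilities increase with the scores, so the argmax is the same.
p = theta*data;
% Index of the largest score in each column
[junk, idx] = max(p, [], 1);

pred = idx;
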
% ---------------------------------------------------------------------

end
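The shortcut in softmaxPredict, taking the argmax of the raw scores instead of the normalized probabilities, can be checked on a small made-up example. The matrices below and the accuracy line are illustrative assumptions, not output from the trained model:

```matlab
% Made-up parameters (3 classes x 2 features) and data (2 features x 3 examples)
theta = [ 1  0;
          0  1;
         -1 -1];
data  = [2 0 -1;
         0 3 -1];

% Prediction from the raw scores, as in softmaxPredict
scores = theta * data;
[~, predScores] = max(scores, [], 1);

% Prediction from the normalized softmax probabilities
shifted = bsxfun(@minus, scores, max(scores, [], 1));
probs   = bsxfun(@rdivide, exp(shifted), sum(exp(shifted), 1));
[~, predProbs] = max(probs, [], 1);

% Accuracy against made-up true labels
labels = [1; 2; 1];
acc = mean(labels(:) == predScores(:));

disp(predScores);
disp(predProbs);
disp(acc);
```

Both predictions agree because normalizing a column divides every entry by the same positive number, which does not change which entry is largest.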

At this point we can run the program.

Results