Deep learning (1): Softmax Regression Exercise

Richel-Li

Article link: https://blog.csdn.net/u011964544/article/details/41719657

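Introduction:

This exercise follows http://www.cnblogs.com/tornadomeet/archive/2013/03/23/2977621.html and http://deeplearning.stanford.edu/wiki/index.php/Exercise:Softmax_Regression

It applies softmax regression to hyperspectral data. The training set is 103*42776 and the test set is 103*21391, with one 103-dimensional sample per column. The experiments were run in MATLAB 2009a.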

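Theory:

Only the softmax model is used: there is no hidden layer, just an input and an output. The input is the raw hyperspectral image. All of the data is used for training and half of it is used for prediction (the script appends test_data to train_data, so the test samples are also seen during training). The main work in the experiment is computing the cost function and its partial derivatives.

The reasoning is as follows: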

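In softmax regression the optimal parameters are not unique. Once an optimal parameter is found, subtracting the same quantity from every class's parameter vector gives exactly the same value of the loss function. This shows that the solution is not unique, and it can be proved directly from the definition of the softmax probability.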
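The proof can be written out in a few lines. The notation here is an assumption borrowed from the Stanford exercise page cited in the introduction, not from the text of this post: \theta_j is the parameter vector of class j, k is the number of classes, x^{(i)} is the i-th sample, and \psi is an arbitrary fixed vector subtracted from every \theta_j.

```latex
\begin{aligned}
P\bigl(y^{(i)}=j \mid x^{(i)};\,\theta_1-\psi,\dots,\theta_k-\psi\bigr)
 &= \frac{e^{(\theta_j-\psi)^{T}x^{(i)}}}{\sum_{l=1}^{k} e^{(\theta_l-\psi)^{T}x^{(i)}}} \\
 &= \frac{e^{\theta_j^{T}x^{(i)}}\,e^{-\psi^{T}x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T}x^{(i)}}\,e^{-\psi^{T}x^{(i)}}} \\
 &= \frac{e^{\theta_j^{T}x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T}x^{(i)}}}
  = P\bigl(y^{(i)}=j \mid x^{(i)};\,\theta_1,\dots,\theta_k\bigr)
\end{aligned}
```

The factor e^{-\psi^{T}x^{(i)}} cancels between numerator and denominator, so every predicted probability, and therefore the unregularized loss, is unchanged.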

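Why does this happen? Intuitively, the loss function here is convex but not strictly convex, so it is "flat" around its minimum: along the flat directions, parameter values near a minimizer all give the same loss. How can this be avoided? Adding a regularization (weight decay) term solves it. For example, when solving with Newton's method, the Hessian matrix may be singular if there is no regularization term, which leads to the situation just described; once the regularization term is added, the Hessian is invertible. With the regularization term, the loss function is the average cross-entropy plus lambda/2 times the sum of the squared parameters, which is what softmaxCost in the appendix computes.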
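Written as a formula, the regularized loss looks like the block below. This is a reconstruction from the cost line of softmaxCost in the appendix, not a copy of the original figure. The symbols m (number of samples, numCase in the code), k (number of classes, numClasses), n (input dimension, inputSize) and \theta_{jd} (one entry of theta) are naming assumptions.

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{k}
            1\{y^{(i)}=j\}\,
            \log\frac{e^{\theta_j^{T}x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^{T}x^{(i)}}}
          \;+\; \frac{\lambda}{2}\sum_{j=1}^{k}\sum_{d=1}^{n}\theta_{jd}^{2}
```

For \lambda > 0 the second term makes J strictly convex, so the minimizer becomes unique.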

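The partial derivatives change accordingly: each class's gradient gains an extra lambda*theta term, as in the thetagrad line of softmaxCost in the appendix.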
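The gradient with respect to the parameter vector of class j can be sketched as follows. It is reconstructed from the thetagrad line of softmaxCost, with m, k and \theta_j used as in the Stanford exercise (a naming assumption, not the post's own notation).

```latex
\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}
   x^{(i)}\Bigl(1\{y^{(i)}=j\} - P\bigl(y^{(i)}=j \mid x^{(i)};\theta\bigr)\Bigr)
   \;+\; \lambda\,\theta_j
```

Stacking these row by row for j = 1..k gives the matrix expression -1/numCase*(groundTruth-p)*data'+lambda*theta used in the code.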

Notes:

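On the MATLAB implementation: in the softmaxCost function, the line groundTruth=full(sparse(labels,1:numCase,1)) may not be easy to understand. For example, let data=[1 2 3 4;5 6 7 8], a 2*4 matrix (4 samples, one per column), and let labels be [3 2 4 1]. Then sparse(labels,1:numCase,1) stores the entries (3,1) 1; (2,2) 1; (4,3) 1; (1,4) 1. The entry (1,4), for instance, says that the row for class 1 is 1 in the column of the 4th sample, i.e. 1{y(4)=1}=1; for any other class j, 1{y(4)=j}=0. full() then expands this into the dense matrix

0 0 0 1
0 1 0 0
1 0 0 0
0 0 1 0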
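The same indicator matrix can be built by hand, which may make the sparse() call less mysterious. The following is a minimal Python sketch rather than MATLAB; the function name ground_truth and its num_classes argument are made up for this illustration.

```python
def ground_truth(labels, num_classes):
    """Return a num_classes x len(labels) 0/1 matrix as a list of rows.

    Entry [c][i] is 1 exactly when sample i has the 1-based label c + 1,
    mirroring full(sparse(labels, 1:numCase, 1)) in the MATLAB code.
    """
    num_cases = len(labels)
    matrix = [[0] * num_cases for _ in range(num_classes)]
    for i, label in enumerate(labels):
        matrix[label - 1][i] = 1  # labels are 1-based, Python rows are 0-based
    return matrix


labels = [3, 2, 4, 1]
G = ground_truth(labels, num_classes=4)
for row in G:
    print(*row)
# prints:
# 0 0 0 1
# 0 1 0 0
# 1 0 0 0
# 0 0 1 0
```

Each column contains exactly one 1, in the row of that sample's class.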

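In softmaxPredict:

theta=softmaxModel.optTheta;
pred=zeros(1,size(data,2));
[nop,pred]=max(theta*data);

nop holds the largest value in each column (one column per sample), and pred holds the row index of that value, which is the predicted class. The accuracy is then computed with acc=mean(labels(:)==pred(:)).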
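The prediction step is a column-wise argmax of the class scores. Here is a small Python sketch of it. The theta, data and labels values are invented for the example, and the helper names predict and accuracy are not from the post.

```python
def predict(theta, data):
    """theta: num_classes x input_size rows; data: input_size x num_cases rows.

    Returns, for each column x of data, the 1-based class whose score
    theta_c . x is largest (same role as pred in [nop,pred]=max(theta*data)).
    """
    num_cases = len(data[0])
    preds = []
    for i in range(num_cases):
        x = [row[i] for row in data]  # i-th column = i-th sample
        scores = [sum(w * v for w, v in zip(theta_c, x)) for theta_c in theta]
        preds.append(scores.index(max(scores)) + 1)  # first maximum, 1-based
    return preds


def accuracy(labels, preds):
    """Fraction of samples whose predicted class equals the label."""
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)


theta = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # 3 classes, 2 features (made up)
data = [[2.0, 0.0, -1.0, 0.5],                  # 2 features x 4 samples (made up)
        [0.0, 3.0, -1.0, 0.4]]
labels = [1, 2, 3, 2]

preds = predict(theta, data)     # [1, 2, 3, 1]
acc = accuracy(labels, preds)    # 0.75
```

No exponentials are needed at prediction time: softmax is monotonic in the scores, so the class with the largest score also has the largest probability.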

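Result: the accuracy is 74.106%, which is low. This suggests that softmax regression applied directly to the raw data does not classify this dataset well; the accuracy is far below that of an SVM.

Room for improvement: I am still not very comfortable with matrix multiplication in MATLAB and need more practice.

Appendix: code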

softmaxExercise

clc;
clear all;

%%======================================================================
%% STEP 0: Initialise constants and parameters
%
%  Here we define and initialise some constants which allow your code
%  to be used more generally on any arbitrary input. 
%  We also initialise some parameters used for tuning the model.

inputSize=103;
numClasses=9;
lambda=1e-4;


%%======================================================================
%% STEP 1: Load data
%
%  In this section, we load the input and output data.
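%  For softmax regression on the hyperspectral data, 
%  the input data is the spectra (one sample per column), and 
%  the output data is the class labels.
load one.mat
% Train on all the data: the test samples are appended to the training set.
train_data=[train_data test_data];
% Rescale all values to [0,1] using the global min and max.
train_data=(train_data-min(train_data(:)))./(max(train_data(:))-min(train_data(:)));
images = train_data;
labels = [train_label;test_label];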
%labels(labels==0)=10;

inputData=images;

% DEBUG = true; % Set DEBUG to true when debugging.
DEBUG = false;
if DEBUG
    inputSize = 8;
    inputData = randn(8, 100);
    labels = randi(numClasses, 100, 1); % debug labels must lie in 1..numClasses
end

theta=0.005*randn(numClasses*inputSize,1);


%%======================================================================
%% STEP 2: Implement softmaxCost
%
%  Implement softmaxCost in softmaxCost.m. 

[cost,grad]=softmaxCost(theta,numClasses,inputSize,lambda,inputData,labels);



%%======================================================================
%% STEP 3: Gradient checking
%
%  As with any learning algorithm, you should always check that your
%  gradients are correct before learning the parameters.
% 

if DEBUG
    numGrad = computeNumericalGradient( @(x) softmaxCost(x, numClasses, ...
                                    inputSize, lambda, inputData, labels), theta);

    % Use this to visually compare the gradients side by side
    disp([numGrad grad]); 

    % Compare numerically computed gradients with those computed analytically
    diff = norm(numGrad-grad)/norm(numGrad+grad);
    disp(diff); 
    % The difference should be small. 
    % In our implementation, these values are usually less than 1e-7.

    % When your gradients are correct, congratulations!
end



%% STEP 4: Learning parameters
%
%  Once you have verified that your gradients are correct, 
%  you can start training your softmax regression code using softmaxTrain
%  (which uses minFunc).

options.maxIter=100;
% softmaxModel is just a struct holding the learned optimal parameters,
% the input size and the number of classes
softmaxModel=softmaxTrain(inputSize,numClasses,lambda,inputData,labels,options);



%%======================================================================
%% STEP 5: Testing
%
%  You should now test your model against the test images.
%  To do this, you will first need to write softmaxPredict
%  (in softmaxPredict.m), which should return predictions
%  given a softmax model and the input data.
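% Note: the test data is rescaled with its own min/max here,
% not with the min/max of the combined data used for training.
test_data=(test_data-min(test_data(:)))./(max(test_data(:))-min(test_data(:)));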
images = test_data;
labels = test_label;
%labels(labels==0) = 10; % Remap 0 to 10


inputData=images;
size(softmaxModel.optTheta);
size(inputData);


[pred]=softmaxPredict(softmaxModel,inputData);
acc=mean(labels(:)==pred(:));

fprintf('Accuracy: %0.3f%%\n', acc*100);
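STEP 3 of the script calls computeNumericalGradient, which is not listed in this post. The Python sketch below assumes that it uses central differences, as the Stanford exercise suggests. The step size eps=1e-4 and the small quadratic test function are illustrative choices, not taken from the post.

```python
import math


def numerical_gradient(f, theta, eps=1e-4):
    """Approximate the gradient of f at theta with central differences."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


def quadratic(x):
    """Test function f(x) = x0^2 + 3*x0*x1."""
    return x[0] ** 2 + 3 * x[0] * x[1]


def quadratic_grad(x):
    """Analytic gradient of the test function."""
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


def norm(v):
    return math.sqrt(sum(a * a for a in v))


theta = [4.0, 10.0]
num_grad = numerical_gradient(quadratic, theta)
grad = quadratic_grad(theta)

# Same relative-difference measure as in the MATLAB script.
rel_diff = (norm([a - b for a, b in zip(num_grad, grad)])
            / norm([a + b for a, b in zip(num_grad, grad)]))
print(rel_diff)  # should be tiny if the analytic gradient is right
```

In the exercise the same check is applied to softmaxCost on a small random problem (the DEBUG branch), because looping over every parameter on the full data would be slow.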

softmaxCost

function [cost,grad]=softmaxCost(theta,numClasses,inputSize,lambda,data,labels)

theta=reshape(theta,numClasses,inputSize);

numCase=size(data,2);
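% groundTruth(c,i)=1 if labels(i)==c, else 0; the explicit size keeps it
% numClasses x numCase even if the highest class is missing from labels
groundTruth=full(sparse(labels,1:numCase,1,numClasses,numCase));

cost = 0;
thetagrad = zeros(numClasses, inputSize);

% subtract each column's maximum before exp to avoid overflow;
% this does not change the resulting probabilities
M=theta*data;
M=bsxfun(@minus,M,max(M,[],1));
M=exp(M);
p=bsxfun(@rdivide,M,sum(M,1));   % p(c,i)=P(y(i)=c|x(i))

% average cross-entropy plus weight decay
cost=-1/numCase*groundTruth(:)'*log(p(:))+lambda/2*sum(theta(:).^2);

% gradient of the cost with respect to theta, unrolled into a vector
thetagrad=-1/numCase*(groundTruth-p)*data'+lambda*theta;
grad =thetagrad(:);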

end
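In softmaxCost, the column maximum is subtracted from theta*data before exp is applied. Here is a small Python sketch of why that is safe and useful. The score values are made up, and Python raises an error where MATLAB would return Inf.

```python
import math


def softmax(scores):
    """Softmax of one column of scores, shifted by the maximum for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # largest exponent is exp(0) = 1
    total = sum(exps)
    return [e / total for e in exps]


small = softmax([1.0, 2.0, 3.0])
shifted = softmax([1001.0, 1002.0, 1003.0])  # same scores plus a constant

# Without the shift, exp of a large score overflows double precision.
try:
    math.exp(1003.0)
    naive_overflows = False
except OverflowError:
    naive_overflows = True

print(small)
print(shifted)          # identical to small
print(naive_overflows)  # True
```

This is the non-uniqueness from the theory section put to practical use: adding a constant to all the scores of a sample leaves its probabilities unchanged, so the constant can be chosen to keep exp in range.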

softmaxTrain

function softmaxModel=softmaxTrain(inputSize,numClasses,lambda,inputData,labels,options)

if ~exist('options', 'var')
    options = struct;
end

if ~isfield(options, 'maxIter')
    options.maxIter = 400;
end

theta = 0.005 * randn(numClasses * inputSize, 1);


addpath minFunc/
options.Method='lbfgs';

% Note: minFuncOptions is never passed to minFunc below, so this line has no effect.
minFuncOptions.display='on';

[softmaxOptTheta, cost] = minFunc( @(p) softmaxCost(p, ...
                                   numClasses, inputSize, lambda, ...
                                   inputData, labels), ...                                   
                              theta, options);

% Fold softmaxOptTheta into a nicer format
softmaxModel.optTheta = reshape(softmaxOptTheta, numClasses, inputSize);
softmaxModel.inputSize = inputSize;
softmaxModel.numClasses = numClasses;
                          
end

softmaxPredict

function [pred]=softmaxPredict(softmaxModel,data)

theta=softmaxModel.optTheta;
pred=zeros(1,size(data,2));


[nop,pred]=max(theta*data);
% nop is the largest value in each column; pred is its row index, i.e. the predicted class

end
