Linear Decoders
Motivation
When the output layer of a sparse autoencoder uses the sigmoid activation function, the inputs must be scaled or constrained to lie in the range [0, 1]. Some inputs are hard to constrain this way; for example, PCA-whitened inputs do not satisfy this range requirement. The fix is to change the output-layer activation from a3 = sigmoid(z3) to a3 = z3, i.e., the identity. This is called a linear activation function, and an autoencoder whose output layer uses it is said to have a linear decoder.
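Concretely, writing the output layer in UFLDL's layer notation (a brief sketch; $f$ is the sigmoid still used in the hidden layer), the linear decoder computes

$$a^{(3)} = z^{(3)} = W^{(2)} a^{(2)} + b^{(2)},$$

and since the derivative of the identity is 1, the output-layer error term in backpropagation simplifies from $\delta^{(3)} = -(y - a^{(3)}) \cdot f'(z^{(3)})$ to

$$\delta^{(3)} = -(y - a^{(3)}).$$

These two changes correspond exactly to the lines marked below as differing from the standard sparse autoencoder (a3 = z3 and d3 = -(data - a3)).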
Applying a sparse autoencoder with a linear decoder to color feature extraction
Preprocessing:
% Remove the mean (per feature) from every patch
meanPatch = mean(patches, 2);
patches = bsxfun(@minus, patches, meanPatch);
% Apply ZCA whitening: decorrelate the inputs and equalize their variances
sigma = patches * patches' / numPatches;  % covariance matrix of the patches
[u, s, v] = svd(sigma);                   % eigendecomposition via SVD
ZCAWhite = u * diag(1 ./ sqrt(diag(s) + epsilon)) * u';  % ZCA whitening matrix
patches = ZCAWhite * patches;
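For context, a minimal setup for the snippet above; the specific value of epsilon is an assumption (a small regularizer commonly paired with ZCA whitening), not something fixed by the code itself:

numPatches = size(patches, 2);  % assumes patches is laid out as features x examples
epsilon = 0.1;                  % assumed regularizer; keeps 1./sqrt(...) finite for tiny eigenvalues

meanPatch and ZCAWhite should also be kept, since the same mean subtraction and whitening transform must be applied to any new input before it is passed to the trained network.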
Cost function and gradient:
Jcost = 0;    % reconstruction (squared-error) term
Jweight = 0;  % weight-decay penalty
Jsparse = 0;  % sparsity penalty
[n, m] = size(data);  % m is the number of examples, n the number of features per example

% Forward pass: compute each layer's linear combination and activation
z2 = W1*data + repmat(b1, 1, m);
a2 = sigmoid(z2);
z3 = W2*a2 + repmat(b2, 1, m);
a3 = z3;  % linear decoder: identity activation; this differs from the standard sparse autoencoder

% Cost: average reconstruction error + weight decay + KL-divergence sparsity penalty
Jcost = (0.5/m)*sum(sum((a3 - data).^2));
Jweight = 0.5*(sum(sum(W1.^2)) + sum(sum(W2.^2)));
rho = (1/m).*sum(a2, 2);  % average activation of each hidden unit
Jsparse = sum(sparsityParam.*log(sparsityParam./rho) + (1-sparsityParam).*log((1-sparsityParam)./(1-rho)));
cost = Jcost + lambda*Jweight + beta*Jsparse;

% Backward pass: with the identity activation, f'(z3) = 1, so the
% output-layer error term has no sigmoid-derivative factor
d3 = -(data - a3);  % differs from the standard sparse autoencoder
sterm = beta*(-sparsityParam./rho + (1-sparsityParam)./(1-rho));  % sparsity contribution to the hidden-layer error
d2 = (W2'*d3 + repmat(sterm, 1, m)).*(sigmoid(z2).*(1-sigmoid(z2)));

% Gradients of the cost with respect to each parameter
W1grad = (1/m).*(d2*data') + lambda.*W1;
W2grad = (1/m).*(d3*a2') + lambda.*W2;
b1grad = (1/m).*sum(d2, 2);
b2grad = (1/m).*sum(d3, 2);
%-------------------------------------------------------------------
% After computing the cost and gradient, convert the gradients back to
% a vector format (suitable for minFunc) by unrolling the gradient
% matrices into a single vector.
grad = [W1grad(:) ; W2grad(:) ; b1grad(:) ; b2grad(:)];
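With the cost and gradient in vector form, training reduces to handing the function to an off-the-shelf optimizer. Below is a minimal sketch using minFunc; the wrapper name sparseAutoencoderLinearCost and the hyperparameter values are assumptions for illustration, not fixed by the code above:

% Assumes the code above is wrapped in a function
%   [cost, grad] = sparseAutoencoderLinearCost(theta, visibleSize, hiddenSize, ...
%                                              lambda, sparsityParam, beta, data)
% and that theta is the unrolled initial parameter vector [W1(:); W2(:); b1(:); b2(:)].
options = struct;
options.Method = 'lbfgs';  % minFunc's L-BFGS handles this objective well
options.maxIter = 400;
options.display = 'on';
[optTheta, cost] = minFunc(@(p) sparseAutoencoderLinearCost(p, visibleSize, hiddenSize, ...
                               lambda, sparsityParam, beta, patches), theta, options);

Before running a long optimization, it is worth checking grad against a numerical finite-difference gradient on a small network; a mismatch almost always indicates a bug in the backpropagation code.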