Multilayer Neural Networks

This post is a brief digest of Chapter 6 of Pattern Classification, 2nd edition, starting with a figure that illustrates the basic structure of a three-layer neural network.

[Figure: basic structure of a three-layer neural network]
The theoretical background of multilayer neural networks is developed in Chapter 6 of Pattern Classification and is not repeated here. Below is a brief walkthrough of a MATLAB implementation of stochastic backpropagation.

function [test_targets, Wh, Wo, J] = Backpropagation_Stochastic(train_patterns, train_targets, test_patterns, params)

% Classify using a backpropagation network with stochastic learning algorithm
% Inputs:
% 	train_patterns      - Train patterns
%	train_targets       - Train targets
%   test_patterns       - Test  patterns
%	params              - Number of hidden units, convergence criterion (Theta), learning rate (eta)
%
% Outputs
%	test_targets        - Predicted targets
%   Wh                  - Hidden unit weights
%   Wo                  - Output unit weights
%   J                   - Error throughout the training

[Nh, Theta, eta] = process_params(params);
iter	         = 1;

[Ni, M]          = size(train_patterns);
No		         = 1;

Uc               = length(unique(train_targets));
%If there are only two classes, remap to {-1,1}
if (Uc == 2)
    train_targets    = (train_targets>0)*2-1;
end

%Initialize the net: In this implementation there is only one output unit, so there
%will be a weight vector from the hidden units to the output units, and a weight matrix
%from the input units to the hidden units.
%The matrices are defined with one more weight so that there will be a bias
w0		= max(abs(std(train_patterns')'));
Wh		= rand(Nh, Ni+1).*w0*2-w0; %Hidden weights
Wo		= rand(No, Nh+1).*w0*2-w0; %Output weights

%Rescale the initial weights to be on the order of 1/sqrt(fan-in)
Wo    = Wo/mean(std(Wo'))*(Nh+1)^(-0.5);
Wh    = Wh/mean(std(Wh'))*(Ni+1)^(-0.5);

rate	= 10*Theta;
J(1)    = 1e3;   %Sentinel so the first convergence test is defined

while (rate > Theta),
    %Randomly choose an example
    i	= randperm(M);
    m	= i(1);
    Xm = train_patterns(:,m);
    tk = train_targets(m);
    
    %Forward propagate the input:
    %First to the hidden units
    gh				= Wh*[Xm; 1];
    [y, dfh]		= activation(gh);
    %Now to the output unit
    go				= Wo*[y; 1];
    [zk, dfo]	= activation(go);
    
    %Now, evaluate delta_k at the output: delta_k = (tk-zk)*f'(net)
    delta_k		= (tk - zk).*dfo;
    
    %...and delta_j: delta_j = f'(net)*w_j*delta_k
    delta_j		= dfh'.*Wo(1:end-1).*delta_k;
    
    %w_kj <- w_kj + eta*delta_k*y_j
    Wo				= Wo + eta*delta_k*[y;1]';
    
    %w_ji <- w_ji + eta*delta_j*[Xm;1]
    Wh				= Wh + eta*delta_j'*[Xm;1]';
    
    iter 			= iter + 1;

    %Calculate total error
    J(iter)    = 0;
    for i = 1:M,
        J(iter) = J(iter) + (train_targets(i) - activation(Wo*[activation(Wh*[train_patterns(:,i); 1]); 1])).^2;
    end
    J(iter) = J(iter)/M; 
    rate  = abs(J(iter) - J(iter-1))/J(iter-1)*100;
    
    if (mod(iter, 100) == 0),
        disp(['Iteration ' num2str(iter) ': Total error is ' num2str(J(iter))])
    end
    
end

disp(['Backpropagation converged after ' num2str(iter) ' iterations.'])

%Classify the test patterns
test_targets = zeros(1, size(test_patterns,2));
for i = 1:size(test_patterns,2),
    test_targets(i) = activation(Wo*[activation(Wh*[test_patterns(:,i); 1]); 1]);
end

if (Uc == 2)
    test_targets  = test_targets >0;
end



function [f, df] = activation(x)
%Sigmoidal activation f(x) = a*tanh(b*x) and its derivative, using the
%a = 1.716, b = 2/3 values recommended in the book

a = 1.716;
b = 2/3;
f	= a*tanh(b*x);
df	= a*b*sech(b*x).^2;
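The helper process_params is part of the book's accompanying classification toolbox and its body is not listed in this post. A minimal stand-in consistent with how it is called at the top of the file (an assumption, not the toolbox's actual implementation) could be saved as process_params.m:

function [Nh, Theta, eta] = process_params(params)
%Hypothetical stand-in: unpack the parameter vector [Nh, Theta, eta] into
%the hidden-unit count, convergence criterion, and learning rate
Nh    = params(1);
Theta = params(2);
eta   = params(3);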

The algorithm itself is an extension of gradient descent: the weights w are updated iteratively by a fixed rule until the algorithm reaches a local optimum. The update rule is

w(m+1) = w(m) + Δw(m)
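where Δw(m) is proportional to the negative gradient of the criterion function J and η is the learning rate. Written out per layer, matching the comments in the code above:

\Delta w(m) = -\eta \, \frac{\partial J}{\partial w}
\Delta w_{kj} = \eta \, \delta_k \, y_j, \qquad \delta_k = (t_k - z_k) \, f'(\mathrm{net}_k)
\Delta w_{ji} = \eta \, \delta_j \, x_i, \qquad \delta_j = f'(\mathrm{net}_j) \sum_k w_{kj} \, \delta_k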

Since the network has three layers, the hidden-to-output weights Wkj and the input-to-hidden weights Wji are updated separately; in the code these updates are the two lines

    Wo				= Wo + eta*delta_k*[y;1]';
    Wh				= Wh + eta*delta_j'*[Xm;1]';
The subfunction

[f, df] = activation(x)

implements the activation function from the figure above: f is the value at the node's output and df is its derivative f'(net), computed analytically rather than by a finite difference.
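A quick numerical sanity check (a sketch, not part of the original post) confirms that df agrees with a finite-difference approximation of f; it assumes activation has been saved in its own file activation.m so that it can be called from the command line:

x        = linspace(-3, 3, 7);   %a few test points
h        = 1e-6;                 %finite-difference step
[f1, df] = activation(x);
f2       = activation(x + h);    %only the first output is needed here
max(abs((f2 - f1)/h - df))       %forward-difference error, roughly 1e-6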

The three parameters

Nh, Theta, eta

were not specially tuned; the defaults are 5, 0.1, and 0.1: five hidden units, termination once the relative change in J between iterations drops below 0.1% (the rate variable in the code is a percentage), and a learning rate η of 0.1. The classification result with these settings is shown below; as the figure indicates, the result is not very good.

[Figure: classification result of the stochastic backpropagation network]
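For concreteness, here is a minimal sketch of how such a run might be launched with these defaults. The toy data and values below are made up for illustration, and the call assumes process_params simply unpacks a three-element vector as sketched earlier:

M              = 100;
train_patterns = [randn(2, M/2) - 1, randn(2, M/2) + 1]; %two 2-D Gaussian classes
train_targets  = [zeros(1, M/2), ones(1, M/2)];          %labels in {0, 1}
test_patterns  = [randn(2, 10) - 1, randn(2, 10) + 1];
[test_targets, Wh, Wo, J] = Backpropagation_Stochastic(...
    train_patterns, train_targets, test_patterns, [5, 0.1, 0.1]);
plot(J(2:end)), xlabel('iteration'), ylabel('mean squared training error') %J(1) is a sentinel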
For comparison, the SVM result on the same data is shown below; the SVM separates the classes much better.

[Figure: SVM classification result]
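The post does not list the SVM code it used. A comparable run in modern MATLAB could use fitcsvm from the Statistics and Machine Learning Toolbox (an assumption about tooling, not necessarily what the author ran):

%Hypothetical SVM comparison; fitcsvm expects observations in rows
mdl       = fitcsvm(train_patterns', train_targets', 'KernelFunction', 'rbf');
svm_preds = predict(mdl, test_patterns');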
The above is only the simplest way to train a neural network; obtaining good results requires substantial further refinement.

SVMs appeared a few years after neural networks and were developed explicitly to compete with them. In 2006, the neural-network camp, aiming to overtake the SVM, proposed deep learning; the approach has recently become extremely popular, and anyone with serious ambitions in machine learning should study it.


References:

[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2001.
