多层神经网络和BP反向传播算法Multilayer Networks and Backpropagation

最新推荐文章于 2024-08-22 09:18:16 发布

GarfieldEr007

最新推荐文章于 2024-08-22 09:18:16 发布

阅读量1.8k

点赞数

分类专栏： Deep Learning 文章标签：多层神经网络 BP 反向传播算法 Multilayer Networks

Deep Learning 专栏收录该内容

188 篇文章 6 订阅

订阅专栏

Multilayer Networks and Backpropagation

Introduction

Much early research in networks was abandoned because of the severe limitations of single layer linear networks. Multilayer networks were not "discovered" until much later but even then there were no good training algorithms. It was not until the `80s that backpropagation became widely known.
People in the field joke about this because backprop is really just applying the chain rule to compute the gradient of the cost function. How many years should it take to rediscover the chain rule?? Of course, it isn't really this simple. Backprop also refers to the very efficient method that was discovered for computing the gradient.

Note: Multilayer nets are much harder to train than single layer networks. That is, convergence is much slower and speed-up techniques are more complicated.

Method of Training: Backpropagation

Define a cost function (e.g. mean square error)

where the activation y at the output layer is given by

and where

z is the activation at the hidden nodes
f2 is the activation function at the output nodes
f1 is the activation function at the hidden nodes.

Written out more explicitly, the cost function is

or all at once:

Computing the gradient: for the hidden-to-output weights:

the gradient: for the input-to-hidden weights:

Summary of Gradients

hidden-to-output weights:

where

input-to-hidden:

where

Implementing Backprob

Create variables for :

the weights W and w,
the net input to each hidden and output node, neti
the activation of each hidden and output node, yi = f(neti)
the "error" at each node, ´i

For each input pattern k:

Step 1: Foward Propagation

Compute neti and yi for each hidden node, i=1,..., h:

Compute netj and yj for each output node, j=1,...,m:

Step 2: Backward Propagation

Compute ´2's for each output node, j=1,...,m:

Compute ´1's for each hidden node, i=1,...,h

Step 3: Accumulate gradients over the input patterns (batch)

Step 4: After doing steps 1 to 3 for all patterns, we can now update the weights:

Networks with more than 2 layers

The above learning procedure (backpropagation) can easily be extended to networks with any number of layers.

from: http://www.willamette.edu/~gorr/classes/cs449/Backprop/backprop.html

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录

评论

被折叠的条评论为什么被折叠?

到【灌水乐园】发言

查看更多评论

添加红包

成就一亿技术人!

hope_wisdom

发出的红包

实付元

使用余额支付

点击重新获取

扫码支付

钱包余额 0

抵扣说明：

1.余额是钱包充值的虚拟货币，按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载，可以购买VIP、付费专栏及课程。