Deep Learning book notes: MLP

1. The relationship between feedforward neural networks and RNNs:

--There are no feedback connections in which outputs of the model are fed back into itself. When feedforward neural networks are extended to include feedback connections, they are called recurrent neural networks, presented in chapter 10.
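
A minimal sketch of that distinction (the names and shapes here are illustrative, not from the book): a feedforward layer maps the current input straight to an output, while a recurrent layer feeds its own previous output back in at the next step.

```python
import numpy as np

rng = np.random.default_rng(0)
W_in, W_h = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))

def feedforward_step(x):
    # Output depends only on the current input x.
    return np.tanh(W_in @ x)

def recurrent_step(x, h_prev):
    # Output also depends on the layer's previous output h_prev:
    # the feedback connection that makes the network recurrent.
    return np.tanh(W_in @ x + W_h @ h_prev)

x_seq = rng.normal(size=(5, 3))
h = np.zeros(4)
for x in x_seq:
    h = recurrent_step(x, h)   # state is carried across time steps
```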

2. How to understand the current research direction of feedforward neural networks:

--However, modern neural network research is guided by many mathematical and engineering disciplines, and the goal of neural networks is not to perfectly model the brain. It is best to think of feedforward networks as function approximation machines that are designed to achieve statistical generalization, occasionally drawing some insights from what we know about the brain, rather than as models of brain function.

3. A recurring theme throughout neural network design:

--One recurring theme throughout neural network design is that the gradient of the cost function must be large and predictable enough to serve as a good guide for the learning algorithm. Functions that saturate (become very flat) undermine this objective because they make the gradient become very small. In many cases this happens because the activation functions used to produce the output of the hidden units or the output units saturate. The negative log-likelihood helps to avoid this problem for many models. Many output units involve an exp function that can saturate when its argument is very negative. The log function in the negative log-likelihood cost function undoes the exp of some output units.
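
A small numeric check of that point (a sketch assuming a sigmoid output unit and a binary target, not code from the book): for the negative log-likelihood, the gradient with respect to the pre-activation z is sigmoid(z) - y, so the log has cancelled the exp inside the sigmoid and the gradient stays large when the unit is confidently wrong, whereas the squared-error gradient picks up an extra sigmoid'(z) factor and saturates.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

y = 1.0       # true label
z = -10.0     # pre-activation: the unit is confidently wrong

p = sigmoid(z)

# d/dz of the negative log-likelihood -[y*log(p) + (1-y)*log(1-p)]:
grad_nll = p - y                      # ~ -1.0: still a strong learning signal

# d/dz of the squared error 0.5*(p - y)**2:
grad_mse = (p - y) * p * (1.0 - p)    # ~ -4.5e-5: the gradient has saturated

print(grad_nll, grad_mse)
```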

4. The relationship between the choice of cost function and the choice of output unit:

--The choice of cost function is tightly coupled with the choice of output unit. Most of the time, we simply use the cross-entropy between the data distribution and the model distribution. The choice of how to represent the output then determines the form of the cross-entropy function.
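
A brief illustration of that coupling (a sketch, not the book's code): with a sigmoid output unit the model distribution is Bernoulli and the cross-entropy becomes the binary log loss, while with a softmax output unit it is categorical and the cross-entropy becomes the usual multi-class log loss.

```python
import numpy as np

def binary_cross_entropy(y, z):
    # Sigmoid output unit -> Bernoulli model distribution.
    p = 1.0 / (1.0 + np.exp(-z))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def categorical_cross_entropy(y_index, z):
    # Softmax output unit -> categorical (multinoulli) model distribution.
    z = z - np.max(z)                            # for numerical stability
    log_softmax = z - np.log(np.sum(np.exp(z)))
    return -log_softmax[y_index]

print(binary_cross_entropy(1.0, 2.0))                          # scalar logit, binary label
print(categorical_cross_entropy(2, np.array([0.5, 1.0, 3.0]))) # logits, class index
```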

//2017/1/19

1. Rectified linear units and their three generalizations:

--Rectified linear units use the activation function g(z) = max{0, z}.

--Three generalizations of rectified linear units are based on using a non-zero slope α_i when z_i < 0: h_i = g(z, α)_i = max(0, z_i) + α_i min(0, z_i). Absolute value rectification fixes α_i = −1 to obtain g(z) = |z|. It is used for object recognition from images (Jarrett et al., 2009), where it makes sense to seek features that are invariant under a polarity reversal of the input illumination. Other generalizations of rectified linear units are more broadly applicable. A leaky ReLU (Maas et al., 2013) fixes α_i to a small value like 0.01, while a parametric ReLU or PReLU treats α_i as a learnable parameter (He et al., 2015).
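
A minimal sketch of the general form h_i = max(0, z_i) + α_i min(0, z_i) and its special cases (the array values are illustrative):

```python
import numpy as np

def generalized_relu(z, alpha):
    # h_i = max(0, z_i) + alpha_i * min(0, z_i)
    return np.maximum(0, z) + alpha * np.minimum(0, z)

z = np.array([-2.0, -0.5, 0.0, 1.5])

relu     = generalized_relu(z, alpha=0.0)    # g(z) = max{0, z}
abs_rect = generalized_relu(z, alpha=-1.0)   # absolute value rectification: g(z) = |z|
leaky    = generalized_relu(z, alpha=0.01)   # leaky ReLU: small fixed slope

alpha_learned = np.full_like(z, 0.25)        # PReLU: alpha_i is a learnable parameter
prelu = generalized_relu(z, alpha_learned)
```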

2. Another generalization of rectified linear units:

--Maxout units (Goodfellow et al., 2013a) generalize rectified linear units further. Instead of applying an element-wise function g(z), maxout units divide z into groups of k values. Each maxout unit then outputs the maximum element of one of these groups.

--A maxout unit can learn a piecewise linear, convex function with up to k pieces. Maxout units can thus be seen as learning the activation function itself rather than just the relationship between units. With large enough k, a maxout unit can learn to approximate any convex function with arbitrary fidelity. In particular, a maxout layer with two pieces can learn to implement the same function of the input x as a traditional layer using the rectified linear activation function, absolute value rectification function, or the leaky or parametric ReLU, or can learn to implement a totally different function altogether.
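
A minimal sketch of a maxout layer (shapes and names are illustrative): the pre-activation vector z is split into groups of k values, and each maxout unit outputs the maximum of its group.

```python
import numpy as np

def maxout(z, k):
    # z has length (num_units * k); each unit takes the max over its group of k values.
    return z.reshape(-1, k).max(axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=3)
k, num_units = 2, 4
W = rng.normal(size=(num_units * k, 3))   # k affine pieces per maxout unit
b = rng.normal(size=num_units * k)

h = maxout(W @ x + b, k)   # with k=2 pieces this can emulate ReLU, |z|, leaky/PReLU, etc.
```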

//2017/1/20

1. Correctly understanding the term back-propagation:

--The term back-propagation is often misunderstood as meaning the whole learning algorithm for multi-layer neural networks. Actually, back-propagation refers only to the method for computing the gradient, while another algorithm, such as stochastic gradient descent, is used to perform learning using this gradient. Furthermore, back-propagation is often misunderstood as being specific to multi-layer neural networks, but in principle it can compute derivatives of any function.
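
A tiny sketch of that separation (illustrative, not the book's code): back-propagation produces the gradient of the loss for a one-hidden-layer network, and a separate stochastic gradient descent step consumes that gradient to update the parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3)) * 0.1
W2 = rng.normal(size=(1, 4)) * 0.1

def forward(x):
    h = np.maximum(0, W1 @ x)   # ReLU hidden layer
    return h, W2 @ h            # linear output

def backprop(x, y):
    # Back-propagation: only computes gradients of the loss 0.5*(yhat - y)^2.
    h, yhat = forward(x)
    d_yhat = yhat - y                    # dL/dyhat
    dW2 = np.outer(d_yhat, h)            # dL/dW2
    d_h = (W2.T @ d_yhat) * (h > 0)      # chain rule through the ReLU
    dW1 = np.outer(d_h, x)               # dL/dW1
    return dW1, dW2

def sgd_step(x, y, lr=0.01):
    # The learning algorithm (SGD) is separate: it just consumes the gradients.
    global W1, W2
    dW1, dW2 = backprop(x, y)
    W1 -= lr * dW1
    W2 -= lr * dW2

sgd_step(rng.normal(size=3), np.array([1.0]))
```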
