Encog3Java-User.pdf Translation: Chapter 5, Propagation Training

Chapter 5


Propagation Training


• How Propagation Training Works
• Propagation Training Types
• Training and Method Factories
• Multithreaded Training
Training is the means by which neural network weights are adjusted to give desirable outputs. This book will cover both supervised and unsupervised training. This chapter will discuss propagation training, a form of supervised training where the expected output is given to the training algorithm.
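Because supervised training pairs every input with an ideal output, an Encog training set is built from two parallel arrays. The following is a minimal sketch using the classic XOR truth table (the variable names are illustrative, but BasicMLDataSet is the real Encog class):

import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLDataSet;

// Each input row is paired with its ideal (expected) output.
double[][] XOR_INPUT = { {0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0}, {1.0, 1.0} };
double[][] XOR_IDEAL = { {0.0}, {1.0}, {1.0}, {0.0} };
MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);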


Encog also supports unsupervised training. With unsupervised training, the neural network is not provided with the expected output. Rather, the neural network learns and gains insight into the data with limited direction. Chapter 10 will discuss unsupervised training.


Propagation training can be a very effective form of training for feedforward, simple recurrent and other types of neural networks. While there are several different forms of propagation training, this chapter will focus on the forms of propagation currently supported by Encog. These six forms are listed as follows:


• Backpropagation Training
• Quick Propagation Training (QPROP)
• Manhattan Update Rule
• Resilient Propagation Training (RPROP)
• Scaled Conjugate Gradient (SCG)
• Levenberg Marquardt (LMA)
All six of these methods work somewhat similarly. However, there are some important differences. The next section will explore propagation training in general.
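As a concrete starting point, here is a minimal sketch of constructing one of these trainers. It assumes a BasicNetwork named network already exists, along with the trainingSet built earlier; resilient propagation is a common default because it requires no tuning parameters:

import org.encog.neural.networks.training.propagation.resilient.ResilientPropagation;

// RPROP adapts per-weight step sizes internally, so only the
// network and the training data are required.
ResilientPropagation train = new ResilientPropagation(network, trainingSet);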


5.1 Understanding Propagation Training


Propagation training algorithms use supervised training. This means that the training algorithm is given a training set of inputs and the ideal output for each input. The propagation training algorithm will go through a series of iterations that will most likely improve the neural network’s error rate by some degree.
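In Encog this series of iterations is driven explicitly by the caller. A typical loop, assuming the train object from the previous sketch and an arbitrary error target of 1%:

int epoch = 1;
do {
    train.iteration(); // one full pass over the training data
    System.out.println("Epoch #" + epoch + " Error: " + train.getError());
    epoch++;
} while (train.getError() > 0.01);
train.finishTraining();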


The error rate is the percent difference between the actual output from the neural network and the ideal output provided by the training data. Each iteration will completely loop through the training data. For each item of training data, some change to the weight matrix will be calculated.
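Encog exposes this error as a single number over a whole data set, computed from the differences between actual and ideal outputs. A one-line check, assuming the network and trainingSet from earlier:

// calculateError() runs every training pair through the network and
// reports the aggregate error; no weights are changed by this call.
double error = network.calculateError(trainingSet);
System.out.println("Current error: " + error);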


These changes will be applied in batches using Encog’s batch training. Therefore, Encog updates the weight matrix values at the end of an iteration. Each training iteration begins by looping over all of the training elements in the training set. For each of these training elements, a two-pass process is executed: a forward pass and a backward pass.
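The batch idea can be sketched outside of Encog in a few lines: gradients are summed across every training pair, and the weights move only once, at the end of the iteration. Everything here (the flat weight array, the gradientForPair placeholder) is illustrative, not Encog's internal layout:

// Placeholder standing in for the backward pass described below;
// it returns a zero gradient so the sketch compiles on its own.
static double[] gradientForPair(int weightCount) {
    return new double[weightCount];
}

static void batchIteration(double[] weights, int pairCount, double learningRate) {
    double[] sum = new double[weights.length];
    for (int p = 0; p < pairCount; p++) {
        double[] g = gradientForPair(weights.length); // one training pair
        for (int i = 0; i < g.length; i++) {
            sum[i] += g[i];
        }
    }
    // Weights change exactly once per iteration: the batch update.
    for (int i = 0; i < weights.length; i++) {
        weights[i] += learningRate * sum[i];
    }
}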


The forward pass simply presents data to the neural network as it normally would if no training had occurred. The input data is presented and the algorithm calculates the error, i.e. the difference between the actual and ideal outputs. The output from each of the layers is also kept in this pass. This allows the training algorithms to see the output from each of the neural network layers.
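A sketch of such a forward pass follows, assuming fully connected layers stored as weight matrices and a tanh activation (a simplification of what Encog does internally). Note that every layer's output is kept, not just the last one:

// weights[l][j][i]: weight from neuron i in layer l to neuron j in layer l+1.
static double[][] forwardPass(double[] input, double[][][] weights) {
    double[][] layerOutput = new double[weights.length + 1][];
    layerOutput[0] = input;
    for (int l = 0; l < weights.length; l++) {
        double[] out = new double[weights[l].length];
        for (int j = 0; j < out.length; j++) {
            double sum = 0.0;
            for (int i = 0; i < layerOutput[l].length; i++) {
                sum += weights[l][j][i] * layerOutput[l][i];
            }
            out[j] = Math.tanh(sum); // activation function
        }
        layerOutput[l + 1] = out; // kept so training can inspect it later
    }
    return layerOutput;
}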


The backward pass starts at the output layer and works its way back to the input layer. The backward pass begins by examining the difference between each of the ideal and actual outputs from each of the neurons. The gradient of this error is then calculated. To calculate this gradient, the neural network’s actual output is applied to the derivative of the activation function used for this level. This value is then multiplied by the error.
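For a tanh activation, whose derivative can be written in terms of its own output, this calculation reduces to a couple of lines for an output neuron. The method name is illustrative; it only demonstrates the rule described above:

// Gradient term for one output neuron: the error multiplied by the
// activation derivative evaluated at the neuron's actual output.
static double outputDelta(double ideal, double actual) {
    double error = ideal - actual;
    double derivative = 1.0 - actual * actual; // tanh'(x) = 1 - tanh(x)^2
    return error * derivative;
}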


Because the algorithm uses the derivative function of the activation function, propagation training can only be used with activation functions that actually have a derivative function. This derivative calculates the error gradient for each connection in the neural network. How exactly this value is used depends on the training algorithm used.
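Encog's activation functions report whether they meet this requirement; ActivationFunction.hasDerivative() makes it explicit:

import org.encog.engine.network.activation.ActivationFunction;
import org.encog.engine.network.activation.ActivationSigmoid;

ActivationFunction fn = new ActivationSigmoid();
// Propagation training is only valid if this returns true.
System.out.println("Has derivative: " + fn.hasDerivative());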


5.1.1 Understanding Backpropagation


Backpropagation is one of the oldest training methods for feedforward neural networks. Backpropagation uses two parameters in conjunction with the gradient descent calculated in the previous section. The first parameter is the learning rate, which is essentially a percent that determines how directly the gradient descent should be applied to the weight matrix. The gradient is multiplied by the learning rate and then added to the weight matrix. This slowly optimizes the weights to values that will produce a lower error.
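Both parameters appear directly in the trainer's constructor. A minimal sketch, reusing the earlier network and trainingSet; 0.7 and 0.3 are commonly seen starting values, not recommendations from this chapter:

import org.encog.neural.networks.training.propagation.back.Backpropagation;

// Arguments: network, training data, learning rate, momentum.
Backpropagation train = new Backpropagation(network, trainingSet, 0.7, 0.3);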


One of the problems with the backpropagation algorithm is that the gradient descent algorithm will seek out local minima. These local minima are points of low error, but may not be a global minimum. The second parameter provided to the backpropagation algorithm helps the backpropagation algorithm escape these local minima.
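That second parameter is the momentum seen in the constructor above: a fraction of the previous weight change is added to the current one, so the search can roll through shallow local minima instead of settling in them. A sketch of the idea, with illustrative values only:

// Momentum update: the current change blends the new gradient step
// with a fraction of the previous change.
double learningRate = 0.7;
double momentum = 0.3;
double previousChange = 0.0;
double gradient = 0.05; // example gradient for one weight
double change = learningRate * gradient + momentum * previousChange;
previousChange = change; // remembered for the next update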