[Reading Notes 1] [2017] MATLAB and Deep Learning: Cost Function and Learning Rule (4)

Figure 3-12 depicts what this section has explained so far.

[Figure 3-12: The output and hidden layers employ different formulas for the delta calculation]

The key is the fact that the output and hidden layers employ different formulas for the delta calculation when the learning rule is based on the cross entropy and the sigmoid function.
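The formulas of Figure 3-12 survive here only as an image; as a sketch in the notation used earlier in this chapter (d is the correct output, y the output of the node, v the weighted sum of the inputs, and φ the sigmoid function), they are:

$$\delta = e = d - y \qquad \text{(output layer)}$$

$$e^{(k)} = W^{T}\delta, \qquad \delta^{(k)} = \varphi'\left(v^{(k)}\right)e^{(k)} \qquad \text{(hidden layer)}$$

The derivative φ' drops out of the output-layer delta because, for the sigmoid activation, the derivative of the cross entropy cost cancels it; the hidden layers keep the same delta formula as before.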

While we are at it, we will address just one more thing about the cost function.

You saw in Chapter 1 that overfitting is a challenging problem that every technique of Machine Learning faces.

You also saw that one of the primary approaches used to overcome overfitting is making the model as simple as possible using regularization.

In a mathematical sense, the essence of regularization is adding the sum of the weights to the cost function, as shown here.
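The equation itself was an image that did not survive extraction. A reconstruction of the standard form, assuming the cross entropy cost from earlier in this section and a regularization parameter λ:

$$J = \sum_{i=1}^{M}\left[-d_i\ln(y_i) - (1 - d_i)\ln(1 - y_i)\right] + \lambda\,\frac{1}{2}\lVert w \rVert^{2}$$

where the second term is half the sum of the squared weights, scaled by λ.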

Of course, applying this new cost function leads to a different learning rule formula.
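The new formula is not spelled out in this excerpt; as a sketch (my reconstruction, not the book's equation), differentiating the added penalty term contributes an extra weight-decay term to the usual update, with learning rate α:

$$w_{ij} \leftarrow w_{ij} + \alpha\,\delta_j x_i - \alpha\lambda\,w_{ij}$$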

This cost function maintains a large value when any of the output errors or weights remains large.

Therefore, solely making the output error zero will not suffice in reducing the cost function.

In order to drop the value of the cost function, both the error and the weights should be kept as small as possible.

However, if a weight becomes small enough, the associated nodes will be practically disconnected.

As a result, unnecessary connections are eliminated, and the neural network becomes simpler.
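The following is a minimal numeric sketch (mine, not the book's code) of this disconnection effect: the last two inputs never activate, so they contribute no gradient, and only the weight-decay term acts on their weights, shrinking them toward zero.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_step(W, x, d, alpha=0.9, lmbda=0.0):
    """One gradient step for a single-layer sigmoid network
    with cross entropy cost and an L2 weight penalty."""
    y = sigmoid(W @ x)
    delta = d - y                       # output delta (cross entropy + sigmoid)
    W = W + alpha * np.outer(delta, x)  # ordinary learning rule
    W = W - alpha * lmbda * W           # weight decay from the regularization term
    return W

rng = np.random.default_rng(0)
W = rng.normal(size=(1, 3))
x = np.array([1.0, 0.0, 0.0])  # the last two inputs are always zero
d = np.array([1.0])

for _ in range(1000):
    W = train_step(W, x, d, lmbda=0.01)

print(W)  # the weights on the inactive inputs have decayed toward zero
```

Without the penalty (lmbda=0), the weights on the inactive inputs would keep their initial random values forever; with it, they shrink toward zero and the connections are effectively removed.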

For this reason, overfitting of the neural network can be reduced by adding the sum of the weights to the cost function.

In summary, the learning rule of the neural network's supervised learning is derived from the cost function.

The performance of the learning rule and the neural network varies depending on the selection of the cost function.

The cross entropy function has been attracting recent attention as a cost function.

The regularization process that is used to deal with overfitting is implemented as a variation of the cost function.

(Translated from Matlab Deep Learning by Phil Kim.)
