Neural Networks (PRML)

In Chapters 3 and 4 we considered models for regression and classification that comprised linear combinations of fixed basis functions.

In Chapters 3 and 4 we discussed models, built from linear combinations of fixed basis functions, used for regression and classification.
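For reference, those models all share the general form of PRML Eq. (5.1): the output is a fixed function f(·) applied to a linear combination of M fixed basis functions φ_j, and only the weights w_j are learned:

$$
y(\mathbf{x}, \mathbf{w}) = f\!\left( \sum_{j=1}^{M} w_j \,\phi_j(\mathbf{x}) \right)
$$

Here f(·) is the identity for regression, and a nonlinear activation (e.g. a sigmoid) for classification.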

We saw that such models have useful analytical and computational properties but that their practical applicability was limited by the curse of dimensionality.

We saw that such models have very useful analytical and computational properties, but that their practical applicability is limited by the curse of dimensionality.
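A quick way to see the curse (my own sketch, not from the book): the number of independent coefficients in a fixed polynomial basis blows up with the input dimension D. For total degree at most 3 the count is C(D+3, 3), i.e. O(D³); PRML Section 1.4 makes the analogous point for a general degree-M polynomial.

```python
from math import comb

# Count the monomials of total degree <= 3 in D input variables:
# there are C(D + 3, 3) of them, growing as O(D^3).
for D in (1, 3, 10, 100):
    print(f"D = {D:3d}: {comb(D + 3, 3):7d} basis functions")
```

Already at D = 100 a modest cubic basis needs 176,851 fixed basis functions, which is why adapting the basis functions to the data becomes necessary.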

In order to apply such models to large scale problems, it is necessary to adapt the basis functions to the data.

To apply models of this kind (built from linear combinations of simple fixed functions) to large-scale problems, the basis functions need to be adapted to the data.

Support vector machines (SVMs), discussed in Chapter 7, address this by first defining basis functions that are centred on the training data points and then selecting a subset of these during training.

Support vector machines (SVMs), which we will discuss in Chapter 7, address this problem (the tension, raised above, between large-scale data and models built from linear combinations of fixed basis functions) as follows: an SVM first defines basis functions centred on the individual training data points, and then selects a subset of these basis functions during training.
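In equation form (using the kernel notation PRML develops in Chapters 6 and 7), the SVM prediction is a weighted sum of basis functions k(x, x_n), one centred on each training point x_n; training drives most of the coefficients a_n to zero, and the surviving points are the support vectors:

$$
y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n \, k(\mathbf{x}, \mathbf{x}_n) + b,
\qquad \text{e.g.}\quad
k(\mathbf{x}, \mathbf{x}_n) = \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{x}_n \rVert^2}{2\sigma^2} \right)
$$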

One advantage of SVMs is that, although the training involves nonlinear optimization, the objective function is convex, and so the solution of the optimization problem is relatively straightforward.

A major advantage of SVMs is that, although training involves nonlinear optimization, the objective function is convex, so the optimization problem is relatively straightforward (comparatively easy) to solve.

The number of basis functions in the resulting models is generally much smaller than the number of training points, although it is often still relatively large and typically increases with the size of the training set.

In the resulting SVM model, the number of basis functions used is generally much smaller than the number of training points, although it is often still fairly large and typically grows with the size of the training set.

The relevance vector machine, discussed in Section 7.2, also chooses a subset from a fixed set of basis functions and typically results in much sparser models.

We will also discuss the relevance vector machine in Section 7.2; this model chooses a subset from a fixed set of basis functions to build the final model, and so yields a considerably sparser model.

(Page 226)

An alternative approach is to fix the number of basis functions in advance but allow them to be adaptive, in other words to use parametric forms for the basis functions in which the parameter values are adapted during training. 

An alternative approach is to fix the number of basis functions in advance but allow the basis functions themselves to be adaptive; in other words, the basis functions take parametric forms, and the parameter values are adjusted during training.
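Concretely, in the neural-network case each basis function is itself parametric: hidden unit j applies a fixed nonlinearity h(·) to an adaptive linear combination of the inputs, so the first-layer weights that shape the basis functions are learned from data together with the output-layer weights (this is the construction of PRML Section 5.1):

$$
z_j = h(a_j), \qquad a_j = \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)}
$$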

The most successful model of this type in the context of pattern recognition is the feed-forward neural network, also known as the multilayer perceptron, discussed in this chapter. 

In the context of pattern recognition (implicitly: with a fixed number of basis functions), the most successful model of this type is the feed-forward neural network, also known as (a.k.a.) the multilayer perceptron, which is the subject of this chapter.

In fact, ‘multilayer perceptron’ is really a misnomer, because the model comprises multiple layers of logistic regression models (with continuous nonlinearities) rather than multiple perceptrons (with discontinuous nonlinearities).

In fact, 'multilayer perceptron' is really a misnomer, because this model is composed of multiple layers of logistic regression models (@todo) rather than multiple perceptrons (@todo).
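To make the 'multiple layers of logistic regression' reading concrete, here is a minimal NumPy sketch of a two-layer feed-forward network for binary classification; the layer sizes and random weights are illustrative assumptions of mine, not something from the book. Each layer is an affine map followed by the continuous (hence differentiable) logistic sigmoid, i.e. a stack of logistic-regression units, whereas a perceptron would use a discontinuous step function.

```python
import numpy as np

def sigmoid(a):
    # Continuous logistic nonlinearity; unlike the perceptron's
    # discontinuous step function, it is differentiable everywhere.
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W1, b1, W2, b2):
    # Hidden layer: a bank of logistic-regression units acting as
    # adaptive basis functions, z = sigmoid(W1 @ x + b1).
    z = sigmoid(W1 @ x + b1)
    # Output layer: one more logistic-regression unit on top of z.
    return sigmoid(W2 @ z + b2)

rng = np.random.default_rng(0)
D, M = 2, 3   # input dimension and number of hidden units (illustrative)
W1, b1 = rng.standard_normal((M, D)), np.zeros(M)
W2, b2 = rng.standard_normal((1, M)), np.zeros(1)
print(forward(np.array([0.5, -1.0]), W1, b1, W2, b2))  # value in (0, 1)
```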

For many applications, the resulting model can be significantly more compact, and hence faster to evaluate, than a support vector machine having the same generalization performance.

In many application scenarios, the resulting model can be significantly more compact (more concise), and therefore faster to evaluate, than an SVM with the same generalization performance.

The price to be paid for this compactness, as with the relevance vector machine, is that the likelihood function, which forms the basis for network training, is no longer a convex function of the model parameters.

The price paid for this compactness, as with the relevance vector machine, concerns the likelihood function (@todo): the likelihood function forms the basis of network training, and it is no longer a convex function of the model parameters. (It seems convex functions have quite a few nice mathematical properties. @todo)
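For instance, for a regression network trained under a Gaussian noise model, maximizing the likelihood is equivalent to minimizing the sum-of-squares error below (the form PRML derives in Section 5.2); because the weights enter through the nonlinear network function y(x_n, w), E(w) is non-convex in w and generally has many local minima, unlike the convex objectives of Chapters 3, 4, and 7:

$$
E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \bigl\{ y(\mathbf{x}_n, \mathbf{w}) - t_n \bigr\}^2
$$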

Read up to page 226, third paragraph; to be continued.
