Gaussian process classification and Gaussian process regression
A Gaussian process (GP) is a useful technique that enables a non-parametric Bayesian approach to modeling. It is widely applicable in areas such as regression, classification, and optimization. The goal of this article is to introduce the theory behind GPs and provide a simple regression example.
Multivariate Gaussian distribution
We first need a refresher on the multivariate Gaussian distribution, which is what GPs are based on. A multivariate Gaussian distribution is fully defined by its mean vector and covariance matrix.
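In standard notation, this statement can be written as follows. The symbols are assumptions introduced here for illustration: X is a d-dimensional random vector, \mu its mean vector, and \Sigma its (positive definite) covariance matrix.

```latex
X \sim \mathcal{N}(\mu, \Sigma), \qquad
p(x) = \frac{1}{\sqrt{(2\pi)^{d} \det \Sigma}}
       \exp\!\left( -\tfrac{1}{2} (x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right)
```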
There are two important properties of Gaussian distributions that make later GP calculations possible: marginalization and conditioning.
Marginalization
Consider two random vectors X and Y that are jointly Gaussian, with a stacked mean vector and a block covariance matrix.
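A common way to write such a joint distribution is sketched below. The block names \mu_X, \mu_Y, \Sigma_{XX}, \Sigma_{XY}, \Sigma_{YX}, and \Sigma_{YY} are notation assumed here for illustration, not taken from the source.

```latex
\begin{bmatrix} X \\ Y \end{bmatrix}
\sim \mathcal{N}\!\left(
  \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix},
  \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix}
\right),
\qquad \Sigma_{YX} = \Sigma_{XY}^{\top}
```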
Marginalization retrieves the distribution of a subset of the variables in a multivariate distribution. For example, marginalizing out the random variable Y leaves a distribution over X alone, whose mean and covariance are simply the corresponding blocks of the joint mean and covariance.
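Written out, with \mu_X and \Sigma_{XX} denoting the X blocks of the joint mean and covariance (notation assumed here for illustration), marginalization integrates Y out of the joint density:

```latex
p(x) = \int p(x, y) \, dy
\quad \Longrightarrow \quad
X \sim \mathcal{N}(\mu_X, \Sigma_{XX})
```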
Note that the marginalized distribution is also a Gaussian distribution.
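As a rough numerical illustration of marginalization, the NumPy sketch below picks out the relevant blocks of a joint mean and covariance and compares them with the empirical moments of samples. The helper `marginalize` and the example numbers are hypothetical, chosen only for this illustration.

```python
import numpy as np


def marginalize(mean, cov, keep):
    """Return the mean and covariance of the marginal over the indices in `keep`."""
    keep = np.asarray(keep)
    return mean[keep], cov[np.ix_(keep, keep)]


# Hypothetical 3-dimensional joint Gaussian (values chosen for illustration).
mean = np.array([0.0, 1.0, -1.0])
cov = np.array([[1.0, 0.5, 0.2],
                [0.5, 2.0, 0.3],
                [0.2, 0.3, 1.5]])

# Treat the first two components as X and marginalize out the third (Y).
mu_x, sigma_xx = marginalize(mean, cov, [0, 1])

# Sanity check by sampling: dropping the Y column of joint samples should give
# samples whose empirical moments approach mu_x and sigma_xx.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean, cov, size=200_000)[:, :2]

print("analytic mean: ", mu_x)
print("empirical mean:", samples.mean(axis=0))
print("analytic covariance:\n", sigma_xx)
print("empirical covariance:\n", np.cov(samples, rowvar=False))
```

No integration is needed in practice: for a Gaussian, marginalizing amounts to indexing into the mean vector and covariance matrix.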
Conditioning
Another important operation is conditioning, which gives the distribution of one random variable given an observed value of another. As we will show later, this operation enables Bayesian inference: it is how predictions are derived from the observed data.
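For example, conditioning X on an observed value of Y yields another Gaussian distribution: the mean shifts according to the observation, and the covariance shrinks by an amount that depends on how strongly X and Y are correlated.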
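The standard result for a joint Gaussian is given below. The notation is assumed here for illustration: \mu_X and \mu_Y are the blocks of the joint mean, \Sigma_{XX}, \Sigma_{XY}, \Sigma_{YX}, and \Sigma_{YY} are the blocks of the joint covariance, and y is the observed value of Y.

```latex
X \mid Y = y \;\sim\; \mathcal{N}\!\left(
  \mu_X + \Sigma_{XY} \Sigma_{YY}^{-1} (y - \mu_Y),\;
  \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX}
\right)
```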
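The conditioning operation can be sketched numerically with NumPy. This is a minimal illustration, not a full GP implementation: the function `condition`, its argument names, and the example values are hypothetical. It uses the standard Gaussian conditioning formulas: the conditional mean is mu_X + S_XY S_YY^{-1} (y - mu_Y) and the conditional covariance is S_XX - S_XY S_YY^{-1} S_YX.

```python
import numpy as np


def condition(mean, cov, idx_x, idx_y, y_obs):
    """Mean and covariance of X given Y = y_obs for a joint Gaussian."""
    idx_x, idx_y = np.asarray(idx_x), np.asarray(idx_y)
    mu_x, mu_y = mean[idx_x], mean[idx_y]
    s_xx = cov[np.ix_(idx_x, idx_x)]
    s_xy = cov[np.ix_(idx_x, idx_y)]
    s_yy = cov[np.ix_(idx_y, idx_y)]
    # Solve a linear system instead of forming the explicit inverse of s_yy.
    # s_yy is symmetric, so the transpose of s_yy^{-1} s_xy^T is s_xy s_yy^{-1}.
    gain = np.linalg.solve(s_yy, s_xy.T).T
    cond_mean = mu_x + gain @ (y_obs - mu_y)
    cond_cov = s_xx - gain @ s_xy.T
    return cond_mean, cond_cov


# Hypothetical example: scalar X and Y with unit variances and correlation 0.8.
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])

# Observe Y = 1 and compute the distribution of X given that observation.
cond_mean, cond_cov = condition(mean, cov, [0], [1], np.array([1.0]))
print(cond_mean)  # approximately [0.8]
print(cond_cov)   # approximately [[0.36]]
```

In this example, observing Y = 1 pulls the mean of X from 0 to about 0.8 and reduces its variance from 1 to about 0.36, which is the mechanism GP regression relies on to turn observed data into predictions.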