[Paper Translation] A Global Geometric Framework for Nonlinear Dimensionality Reduction

Paper title: A Global Geometric Framework for Nonlinear Dimensionality Reduction

Paper source: https://www.sci-hub.ren/10.2307/3081721

A Global Geometric Framework for Nonlinear Dimensionality Reduction

Joshua B. Tenenbaum, Vin de Silva, John C. Langford

Abstract

Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs (30,000 auditory nerve fibers or $10^6$ optic nerve fibers) a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.

Main Text

A canonical problem in dimensionality reduction from the domain of visual perception is illustrated in Fig. 1A. The input consists of many images of a person’s face observed under different pose and lighting conditions, in no particular order. These images can be thought of as points in a high-dimensional vector space, with each input dimension corresponding to the brightness of one pixel in the image or the firing rate of one retinal ganglion cell. Although the input dimensionality may be quite high (e.g., 4096 for these 64 pixel by 64 pixel images), the perceptually meaningful structure of these images has many fewer independent degrees of freedom. Within the 4096-dimensional input space, all of the images lie on an intrinsically three-dimensional manifold, or constraint surface, that can be parameterized by two pose variables plus an azimuthal lighting angle. Our goal is to discover, given only the unordered high-dimensional inputs, low-dimensional representations such as Fig. 1A with coordinates that capture the intrinsic degrees of freedom of a data set. This problem is of central importance not only in studies of vision (1–5), but also in speech (6, 7), motor control (8, 9), and a range of other physical and biological sciences (10–12).

The classical techniques for dimensionality reduction, PCA and MDS, are simple to implement, efficiently computable, and guaranteed to discover the true structure of data lying on or near a linear subspace of the high-dimensional input space (13). PCA finds a low-dimensional embedding of the data points that best preserves their variance as measured in the high-dimensional input space. Classical MDS finds an embedding that preserves the interpoint distances, equivalent to PCA when those distances are Euclidean. However, many data sets contain essential nonlinear structures that are invisible to PCA and MDS (4, 5, 11, 14). For example, both methods fail to detect the true degrees of freedom of the face data set (Fig. 1A), or even its intrinsic three-dimensionality (Fig. 2A).

Here we describe an approach that combines the major algorithmic features of PCA and MDS (computational efficiency, global optimality, and asymptotic convergence guarantees) with the flexibility to learn a broad class of nonlinear manifolds. Figure 3A illustrates the challenge of nonlinearity with data lying on a two-dimensional “Swiss roll”: points far apart on the underlying manifold, as measured by their geodesic, or shortest path, distances, may appear deceptively close in the high-dimensional input space, as measured by their straight-line Euclidean distance. Only the geodesic distances reflect the true low-dimensional geometry of the manifold, but PCA and MDS effectively see just the Euclidean structure; thus, they fail to detect the intrinsic two-dimensionality (Fig. 2B).
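The gap between geodesic and Euclidean distance can be made concrete with a small numerical sketch. It uses the planar spiral $r = \theta$ as a stand-in for a cross-section of the Swiss roll; the spiral, the two sample angles, and the helper names are illustrative assumptions and do not come from the paper.

```python
import math

def spiral_point(theta):
    """Point on the planar spiral r = theta (assumed stand-in for a Swiss roll slice)."""
    return (theta * math.cos(theta), theta * math.sin(theta))

def arc_length(theta_a, theta_b, steps=10_000):
    """Approximate the distance along the spiral by summing many short chords."""
    pts = [spiral_point(theta_a + (theta_b - theta_a) * k / steps)
           for k in range(steps + 1)]
    return sum(math.dist(pts[k], pts[k + 1]) for k in range(steps))

# Two points exactly one full turn apart on the spiral.
a, b = math.pi, 3 * math.pi
straight_line = math.dist(spiral_point(a), spiral_point(b))  # input-space distance
along_curve = arc_length(a, b)                               # geodesic distance

print(f"Euclidean: {straight_line:.2f}, geodesic: {along_curve:.2f}")
```

With these angles the two points are about 6.3 apart in a straight line but about 40 apart along the curve, which is the kind of mismatch that a purely Euclidean method cannot see.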

Our approach builds on classical MDS but seeks to preserve the intrinsic geometry of the data, as captured in the geodesic manifold distances between all pairs of data points. The crux is estimating the geodesic distance between faraway points, given only input-space distances. For neighboring points, input-space distance provides a good approximation to geodesic distance. For faraway points, geodesic distance can be approximated by adding up a sequence of “short hops” between neighboring points. These approximations are computed efficiently by finding shortest paths in a graph with edges connecting neighboring data points.

The complete isometric feature mapping, or Isomap, algorithm has three steps, which are detailed in Table 1. The first step determines which points are neighbors on the manifold $M$, based on the distances $d_X(i,j)$ between pairs of points $i, j$ in the input space $X$. Two simple methods are to connect each point to all points within some fixed radius $\epsilon$, or to all of its $K$ nearest neighbors (15). These neighborhood relations are represented as a weighted graph $G$ over the data points, with edges of weight $d_X(i,j)$ between neighboring points (Fig. 3B).
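The first step can be sketched in a few lines of plain Python. The function name, the dict-of-dicts graph layout, and the choice to treat a distance exactly equal to the radius as a neighbor are assumptions made for illustration; the brute-force search is only meant for small data sets.

```python
import math

def neighborhood_graph(points, k=None, eps=None):
    """Build a symmetric weighted graph: graph[i][j] = d_X(i, j) for neighbors i, j.

    Exactly one of k (number of nearest neighbors) or eps (radius) must be given.
    """
    if (k is None) == (eps is None):
        raise ValueError("specify exactly one of k or eps")
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Distances from point i to every other point, closest first.
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        if k is not None:
            chosen = dists[:k]
        else:
            chosen = [(d, j) for d, j in dists if d <= eps]
        for d, j in chosen:
            graph[i][j] = d
            graph[j][i] = d  # an edge exists if either point selects the other
    return graph

# Four points on a line; the last one is far from the rest.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
knn = neighborhood_graph(points, k=1)
ball = neighborhood_graph(points, eps=1.5)
```

In this toy example the K-nearest-neighbor rule still links the distant point to the rest of the graph, while the fixed-radius rule leaves it isolated, which is one practical difference between the two methods.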

In its second step, Isomap estimates the geodesic distances $d_M(i,j)$ between all pairs of points on the manifold $M$ by computing their shortest path distances $d_G(i,j)$ in the graph $G$. One simple algorithm (16) for finding shortest paths is given in Table 1.
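Table 1 is not reproduced in this post. One standard all-pairs method that fits the description is the Floyd-Warshall algorithm, so the sketch below assumes that choice. The function name and the dict-of-dicts input layout are illustrative, and the cubic running time makes it suitable only for small graphs.

```python
import math

def all_pairs_shortest_paths(graph, n):
    """Floyd-Warshall: return d[i][j], the shortest path length between nodes i and j.

    graph[i][j] holds the edge weight d_X(i, j) when i and j are neighbors.
    Pairs with no connecting path keep the value math.inf.
    """
    # Start from direct edges: 0 on the diagonal, infinity where no edge exists.
    d = [[0.0 if i == j else graph.get(i, {}).get(j, math.inf)
          for j in range(n)] for i in range(n)]
    # Allow each node k in turn to act as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = d[i][k] + d[k][j]
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d

# A chain 0 - 1 - 2 plus an isolated node 3.
graph = {0: {1: 1.0}, 1: {0: 1.0, 2: 2.0}, 2: {1: 2.0}, 3: {}}
d_G = all_pairs_shortest_paths(graph, 4)
```

Here `d_G[0][2]` is the sum of the two hops, and every distance to node 3 stays infinite. Infinite entries signal a disconnected neighborhood graph, which has to be resolved (for example with a larger neighborhood) before an embedding can be computed.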

The final step applies classical MDS to the matrix of graph distances $D_G = \{d_G(i,j)\}$, constructing an embedding of the data in a $d$-dimensional Euclidean space $Y$ that best preserves the manifold’s estimated intrinsic geometry (Fig. 3C). The coordinate vectors $y_i$ for points in $Y$ are chosen to minimize the cost function

$$E = \|\tau(D_G) - \tau(D_Y)\|_{L^2} \tag{1}$$

where $D_Y$ denotes the matrix of Euclidean distances $\{d_Y(i,j) = \|y_i - y_j\|\}$ and $\|A\|_{L^2}$ the $L^2$ matrix norm $\sqrt{\sum_{i,j} A_{i,j}^2}$. The $\tau$ operator converts distances to inner products (17), which uniquely characterize the geometry of the data in a form that supports efficient optimization. The global minimum of Eq. 1 is achieved by setting the coordinates $y_i$ to the top $d$ eigenvectors of the matrix $\tau(D_G)$ (13).

As with PCA or MDS, the true dimensionality of the data can be estimated from the decrease in error as the dimensionality of $Y$ is increased. For the Swiss roll, where classical methods fail, the residual variance of Isomap correctly bottoms out at $d = 2$ (Fig. 2B).
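The classical MDS step can be sketched without external libraries. The sketch assumes the usual double-centering definition $\tau(D) = -HSH/2$, with $S_{ij} = D_{ij}^2$ and $H_{ij} = \delta_{ij} - 1/N$, and scales each eigenvector by the square root of its eigenvalue, as is standard for classical MDS. The eigen-solver is plain power iteration with deflation, swapped in here only to keep the example dependency-free; all function names are hypothetical.

```python
import math
import random

def tau(D):
    """Convert a symmetric distance matrix to inner products: -H S H / 2 (assumed)."""
    n = len(D)
    S = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(S[i]) / n for i in range(n)]  # equals column means when D is symmetric
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (S[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def matvec(B, v):
    return [sum(b * x for b, x in zip(row, v)) for row in B]

def top_eigenpairs(B, d, iters=500, seed=0):
    """Top-d eigenpairs of a symmetric positive semidefinite matrix (power iteration)."""
    rng = random.Random(seed)
    n = len(B)
    B = [row[:] for row in B]  # work on a copy, since deflation modifies it
    pairs = []
    for _ in range(d):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = matvec(B, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left but numerically zero eigenvalues
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(B, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                B[i][j] -= lam * v[i] * v[j]
    return pairs

def classical_mds(D, d):
    """Embed N points in d dimensions; coordinate p of point i is sqrt(lam_p) * v_p[i]."""
    pairs = top_eigenpairs(tau(D), d)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(D))]

# Corners of a 3-by-4 rectangle; Euclidean distances play the role of D_G here.
pts = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
D = [[math.dist(p, q) for q in pts] for p in pts]
Y = classical_mds(D, 2)
```

Because the input distances in this example are exactly Euclidean in two dimensions, the recovered coordinates `Y` reproduce every pairwise distance up to rotation and reflection. With graph distances from a curved manifold, the same routine would return the best $d$-dimensional approximation in the sense of Eq. 1, provided all distances are finite.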

Just as PCA and MDS are guaranteed, given sufficient data, to recover the true structure of linear manifolds, Isomap is guaranteed asymptotically to recover the true dimensionality and geometric structure of a strictly larger class of nonlinear manifolds. Like the Swiss roll, these are manifolds whose intrinsic geometry is that of a convex region of Euclidean space, but whose ambient geometry in the high-dimensional input space may be highly folded, twisted, or curved. For non-Euclidean manifolds, such as a hemisphere or the surface of a doughnut, Isomap still produces a globally optimal low-dimensional Euclidean representation, as measured by Eq. 1.

These guarantees of asymptotic convergence rest on a proof that as the number of data points increases, the graph distances $d_G(i,j)$ provide increasingly better approximations to the intrinsic geodesic distances $d_M(i,j)$, becoming arbitrarily accurate in the limit of infinite data (18, 19). How quickly $d_G(i,j)$ converges to $d_M(i,j)$ depends on certain parameters of the manifold as it lies within the high-dimensional space (radius of curvature and branch separation) and on the density of points. To the extent that a data set presents extreme values of these parameters or deviates from a uniform density, asymptotic convergence still holds in general, but the sample size required to estimate geodesic distance accurately may be impractically large.
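A toy calculation illustrates the flavor of this convergence. The sketch places evenly spaced samples on a unit semicircle, whose true endpoint-to-endpoint geodesic distance is $\pi$, and links each sample to its immediate neighbors. Even spacing and the semicircle are simplifying assumptions chosen for illustration; the proof cited above concerns randomly sampled data on general manifolds.

```python
import math

def chain_graph_distance(n):
    """Graph distance between the endpoints of n evenly spaced semicircle samples.

    Each sample is linked only to its immediate neighbors, so the shortest
    path between the endpoints is the sum of the n - 1 short chords.
    """
    pts = [(math.cos(math.pi * k / (n - 1)), math.sin(math.pi * k / (n - 1)))
           for k in range(n)]
    return sum(math.dist(pts[k], pts[k + 1]) for k in range(n - 1))

# Gap between the true geodesic distance (pi) and the graph estimate.
sample_sizes = (5, 50, 500)
errors = [math.pi - chain_graph_distance(n) for n in sample_sizes]

for n, err in zip(sample_sizes, errors):
    print(f"n = {n:3d}: geodesic error = {err:.2e}")
```

The graph distance always underestimates the arc length here, because each chord is shorter than the arc it spans, and the error shrinks as the samples become denser.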

Isomap’s global coordinates provide a simple way to analyze and manipulate high-dimensional observations in terms of their intrinsic nonlinear degrees of freedom. For a set of synthetic face images, known to have three degrees of freedom, Isomap correctly detects the dimensionality (Fig. 2A) and separates out the true underlying factors (Fig. 1A). The algorithm also recovers the known low-dimensional structure of a set of noisy real images, generated by a human hand varying in finger extension and wrist rotation (Fig. 2C) (20). Given a more complex data set of handwritten digits, which does not have a clear manifold geometry, Isomap still finds globally meaningful coordinates (Fig. 1B) and nonlinear structure that PCA or MDS do not detect (Fig. 2D). For all three data sets, the natural appearance of linear interpolations between distant points in the low-dimensional coordinate space confirms that Isomap has captured the data’s perceptually relevant structure (Fig. 4).

Previous attempts to extend PCA and MDS to nonlinear data sets fall into two broad classes, each of which suffers from limitations overcome by our approach. Local linear techniques (21–23) are not designed to represent the global structure of a data set within a single coordinate system, as we do in Fig. 1. Nonlinear techniques based on greedy optimization procedures (24–30) attempt to discover global structure, but lack the crucial algorithmic features that Isomap inherits from PCA and MDS: a noniterative, polynomial-time procedure with a guarantee of global optimality; for intrinsically Euclidean manifolds, a guarantee of asymptotic convergence to the true structure; and the ability to discover manifolds of arbitrary dimensionality, rather than requiring a fixed $d$ initialized from the beginning or computational resources that increase exponentially in $d$.

Here we have demonstrated Isomap’s performance on data sets chosen for their visually compelling structures, but the technique may be applied wherever nonlinear geometry complicates the use of PCA or MDS. Isomap complements, and may be combined with, linear extensions of PCA based on higher order statistics, such as independent component analysis (31, 32). It may also lead to a better understanding of how the brain comes to represent the dynamic appearance of objects, where psychophysical studies of apparent motion (33, 34) suggest a central role for geodesic transformations on nonlinear manifolds (35) much like those studied here.
