[Paper Translation] A Global Geometric Framework for Nonlinear Dimensionality Reduction

Paper title: A Global Geometric Framework for Nonlinear Dimensionality Reduction
Source: Science 290, 2319–2323 (2000)
Translated by: BDML@CQUT Lab

A Global Geometric Framework for Nonlinear Dimensionality Reduction

Joshua B. Tenenbaum, Vin de Silva, John C. Langford

Abstract

Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs—30,000 auditory nerve fibers or 10^6 optic nerve fibers—a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.


Main Text

A canonical problem in dimensionality reduction from the domain of visual perception is illustrated in Fig. 1A. The input consists of many images of a person’s face observed under different pose and lighting conditions, in no particular order. These images can be thought of as points in a high-dimensional vector space, with each input dimension corresponding to the brightness of one pixel in the image or the firing rate of one retinal ganglion cell. Although the input dimensionality may be quite high (e.g., 4096 for these 64 pixel by 64 pixel images), the perceptually meaningful structure of these images has many fewer independent degrees of freedom. Within the 4096-dimensional input space, all of the images lie on an intrinsically three-dimensional manifold, or constraint surface, that can be parameterized by two pose variables plus an azimuthal lighting angle. Our goal is to discover, given only the unordered high-dimensional inputs, low-dimensional representations such as Fig. 1A with coordinates that capture the intrinsic degrees of freedom of a data set. This problem is of central importance not only in studies of vision, but also in speech, motor control, and a range of other physical and biological sciences.

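To make the “images as points” picture concrete, the short sketch below flattens a stack of 64 × 64 grayscale images into 4096-dimensional vectors and computes the pairwise Euclidean distances that serve as Isomap’s input in Fig. 1A. The random arrays standing in for the face renderings, and the variable names, are illustrative assumptions, not data from the paper.

```python
# Minimal sketch: images as points in a 4096-dimensional input space X.
# Random arrays stand in for the 64x64 face renderings used in Fig. 1A.
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
N = 698                                  # number of images used in Fig. 1A
images = rng.random((N, 64, 64))         # stand-in brightness values in [0, 1]

X = images.reshape(N, -1)                # each image -> one point in R^4096
dX = cdist(X, X)                         # input-space distances d_X(i, j)
print(X.shape, dX.shape)                 # (698, 4096) (698, 698)
```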

Fig. 1. (A) A canonical dimensionality reduction problem from visual perception. The input consists of a sequence of 4096-dimensional vectors, representing the brightness values of 64 pixel by 64 pixel images of a face rendered with different poses and lighting directions. Applied to N = 698 raw images, Isomap (K = 6) learns a three-dimensional embedding of the data’s intrinsic geometric structure. A two-dimensional projection is shown, with a sample of the original input images (red circles) superimposed on all the data points (blue) and horizontal sliders (under the images) representing the third dimension. Each coordinate axis of the embedding correlates highly with one degree of freedom underlying the original data: left-right pose (x axis, R = 0.99), up-down pose (y axis, R = 0.90), and lighting direction (slider position, R = 0.92). The input-space distances dX(i,j) given to Isomap were Euclidean distances between the 4096-dimensional image vectors. (B) Isomap applied to N = 1000 handwritten “2”s from the MNIST database. The two most significant dimensions in the Isomap embedding, shown here, articulate the major features of the “2”: bottom loop (x axis) and top arch (y axis). Input-space distances dX(i,j) were measured by tangent distance, a metric designed to capture the invariances relevant in handwriting recognition. Here we used ε-Isomap (with ε = 4.2) because we did not expect a constant dimensionality to hold over the whole data set; consistent with this, Isomap finds several tendrils projecting from the higher dimensional mass of data and representing successive exaggerations of an extra stroke or ornament in the digit.


The classical techniques for dimensionality reduction, PCA and MDS, are simple to implement, efficiently computable, and guaranteed to discover the true structure of data lying on or near a linear subspace of the high-dimensional input space. PCA finds a low-dimensional embedding of the data points that best preserves their variance as measured in the high-dimensional input space. Classical MDS finds an embedding that preserves the interpoint distances, equivalent to PCA when those distances are Euclidean. However, many data sets contain essential nonlinear structures that are invisible to PCA and MDS. For example, both methods fail to detect the true degrees of freedom of the face data set (Fig. 1A), or even its intrinsic three-dimensionality (Fig. 2A).

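As a point of reference for what follows, here is a brief sketch of the two classical methods, written from their standard textbook definitions rather than from any code accompanying the paper: PCA projects centered data onto its top principal directions, and classical MDS embeds points from a distance matrix via double centering. With Euclidean distances the two agree up to sign flips of the axes, which is the equivalence noted above.

```python
# Hedged sketch of PCA and classical MDS (textbook definitions, not the paper's code).
import numpy as np
from scipy.spatial.distance import cdist

def pca_embed(X, d):
    """Project the rows of X onto their top-d principal components."""
    Xc = X - X.mean(axis=0)                      # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                         # (N, d) coordinates

def classical_mds_embed(D, d):
    """Embed points given only their pairwise distance matrix D."""
    N = D.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N          # centering matrix
    B = -0.5 * H @ (D ** 2) @ H                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)               # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:d]             # keep the top-d eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# With ordinary Euclidean distances the two embeddings coincide up to sign flips.
X = np.random.default_rng(1).random((200, 10))
Y_pca = pca_embed(X, 2)
Y_mds = classical_mds_embed(cdist(X, X), 2)
```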

Here we describe an approach that combines the major algorithmic features of PCA and MDS—computational efficiency, global optimality, and asymptotic convergence guarantees—with the flexibility to learn a broad class of nonlinear manifolds. Figure 3A illustrates the challenge of nonlinearity with data lying on a two-dimensional “Swiss roll”: points far apart on the underlying manifold, as measured by their geodesic, or shortest path, distances, may appear deceptively close in the high-dimensional input space, as measured by their straight-line Euclidean distance. Only the geodesic distances reflect the true low-dimensional geometry of the manifold, but PCA and MDS effectively see just the Euclidean structure; thus, they fail to detect the intrinsic two-dimensionality (Fig. 2B).

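The gap between the two notions of distance is easy to reproduce on a synthetic Swiss roll. In the sketch below, the parametrization, ranges, and sample size are illustrative choices; because the roll is a developable surface, its geodesic distances are simply Euclidean distances in the unrolled (arc length, height) coordinates.

```python
# Sketch contrasting straight-line (Euclidean) distance with geodesic distance
# on a synthetic Swiss roll; the parametrization here is an illustrative choice.
import numpy as np

rng = np.random.default_rng(2)
N = 1000
t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, N)              # angle along the spiral
h = rng.uniform(0.0, 20.0, N)                             # height along the roll
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])    # 3-D input points

# Unrolling replaces the angle t by the arc length s(t) along the spiral,
# so geodesic distances are Euclidean distances in the flat (s, h) coordinates.
s = 0.5 * (t * np.sqrt(1 + t ** 2) + np.arcsinh(t))

i, j = 0, 1
euclid = np.linalg.norm(X[i] - X[j])
geodesic = np.hypot(s[i] - s[j], h[i] - h[j])
print(f"Euclidean {euclid:.2f} vs geodesic {geodesic:.2f}")
# Points on neighboring sheets of the roll can be close in Euclidean distance
# yet far apart geodesically, which is exactly what defeats PCA and MDS in Fig. 2B.
```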

Fig. 2. The residual variance of PCA (open triangles), MDS [open triangles in (A) through (C); open circles in (D)], and Isomap (filled circles) on four data sets. (A) Face images varying in pose and illumination (Fig. 1A). (B) Swiss roll data (Fig. 3). (C) Hand images varying in finger extension and wrist rotation. (D) Handwritten “2”s (Fig. 1B). In all cases, residual variance decreases as the dimensionality d is increased. The intrinsic dimensionality of the data can be estimated by looking for the “elbow” at which this curve ceases to decrease significantly with added dimensions. Arrows mark the true or approximate dimensionality, when known. Note the tendency of PCA and MDS to overestimate the dimensionality, in contrast to Isomap.


Our approach builds on classical MDS but seeks to preserve the intrinsic geometry of the data, as captured in the geodesic manifold distances between all pairs of data points. The crux is estimating the geodesic distance between faraway points, given only input-space distances. For neighboring points, input-space distance provides a good approximation to geodesic distance. For faraway points, geodesic distance can be approximated by adding up a sequence of “short hops” between neighboring points. These approximations are computed efficiently by finding shortest paths in a graph with edges connecting neighboring data points.


The complete isometric feature mapping, or Isomap, algorithm has three steps, which are detailed in Table 1. The first step determines which points are neighbors on the manifold M, based on the distances dX(i,j) between pairs of points i,j in the input space X. Two simple methods are to connect each point to all points within some fixed radius ε, or to all of its K nearest neighbors. These neighborhood relations are represented as a weighted graph G over the data points, with edges of weight dX(i,j) between neighboring points (Fig. 3B).

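A minimal sketch of this first step is given below: it builds the weighted neighborhood graph G either from an ε-ball rule or from K nearest neighbors, storing missing edges as infinity. This is one plausible reading of the step, not the authors’ reference implementation.

```python
# Step 1 sketch: build the weighted neighborhood graph G from the input points.
import numpy as np
from scipy.spatial.distance import cdist

def neighborhood_graph(X, k=None, eps=None):
    """Return an (N, N) matrix of edge weights d_X(i, j); np.inf marks 'no edge'."""
    dX = cdist(X, X)                              # input-space distances d_X(i, j)
    W = np.full_like(dX, np.inf)
    if k is not None:                             # K-nearest-neighbor rule
        nn = np.argsort(dX, axis=1)[:, 1:k + 1]   # skip column 0 (the point itself)
        rows = np.repeat(np.arange(len(X)), k)
        W[rows, nn.ravel()] = dX[rows, nn.ravel()]
        W = np.minimum(W, W.T)                    # keep an edge if either endpoint chose it
    elif eps is not None:                         # epsilon-ball rule
        mask = (dX <= eps) & (dX > 0)
        W[mask] = dX[mask]
    np.fill_diagonal(W, 0.0)
    return W
```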

Fig. 3. The “Swiss roll” data set, illustrating how Isomap exploits geodesic paths for nonlinear dimensionality reduction. (A) For two arbitrary points (circled) on a nonlinear manifold, their Euclidean distance in the high-dimensional input space (length of dashed line) may not accurately reflect their intrinsic similarity, as measured by geodesic distance along the low-dimensional manifold (length of solid curve). (B) The neighborhood graph G constructed in step one of Isomap (with K = 7 and N = 1000 data points) allows an approximation (red segments) to the true geodesic path to be computed efficiently in step two, as the shortest path in G. (C) The two-dimensional embedding recovered by Isomap in step three, which best preserves the shortest path distances in the neighborhood graph (overlaid). Straight lines in the embedding (blue) now represent simpler and cleaner approximations to the true geodesic paths than do the corresponding graph paths (red).


In its second step, Isomap estimates the geodesic distances dM(i,j) between all pairs of points on the manifold M by computing their shortest path distances dG(i,j) in the graph G. One simple algorithm for finding shortest paths is given in Table 1.

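The sketch below treats the simple all-pairs routine referred to here as a Floyd–Warshall-style relaxation over intermediate points; that reading, and the dense-matrix representation, are assumptions of the sketch.

```python
# Step 2 sketch: all-pairs shortest-path distances d_G(i, j) in the graph G.
import numpy as np

def graph_shortest_paths(W):
    """Shortest-path distances for a weight matrix W (np.inf marks 'no edge')."""
    D = W.copy()
    N = D.shape[0]
    for k in range(N):                            # allow paths passing through point k
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

# For large N a Dijkstra-based routine is faster in practice, e.g.
# scipy.sparse.csgraph.shortest_path on a sparse version of W.
```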

Table 1. The Isomap algorithm takes as input the distances dX(i,j) between all pairs i,j from N data points in the high-dimensional input space X, measured either in the standard Euclidean metric (as in Fig. 1A) or in some domain-specific metric (as in Fig. 1B). The algorithm outputs coordinate vectors yi in a d-dimensional Euclidean space Y that (according to Eq. 1) best represent the intrinsic geometry of the data. The only free parameter (ε or K) appears in Step 1.


The final step applies classical MDS to the matrix of graph distances DG = {dG(i,j)}, constructing an embedding of the data in a d-dimensional Euclidean space Y that best preserves the manifold’s estimated intrinsic geometry (Fig. 3C). The coordinate vectors yi for points in Y are chosen to minimize the cost function


E = ||τ(DG) − τ(DY)||L2        (1)
where DY denotes the matrix of Euclidean distances {dY(i,j) = ||yi - yj||} and ||A||L2 the L2 matrix norm √(Σi,j A²i,j). The τ operator converts distances to inner products, which uniquely characterize the geometry of the data in a form that supports efficient optimization. The global minimum of Eq. 1 is achieved by setting the coordinates yi to the top d eigenvectors of the matrix τ(DG).

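A compact sketch of this final step follows. It takes the τ operator to be the usual classical-MDS double centering of squared distances, τ(D) = −HSH/2 with S the matrix of squared distances and H the centering matrix; that identification, which matches the description of converting distances to inner products, is an assumption of the sketch rather than a definition quoted from the text.

```python
# Step 3 sketch: minimize Eq. 1 by eigendecomposition of tau(D_G),
# with tau taken to be classical-MDS double centering (an assumption).
import numpy as np

def isomap_embedding(DG, d):
    """d-dimensional Euclidean coordinates that best preserve the graph distances."""
    N = DG.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N           # centering matrix
    B = -0.5 * H @ (DG ** 2) @ H                  # tau(D_G): distances -> inner products
    vals, vecs = np.linalg.eigh(B)                # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:d]              # top-d eigenpairs
    Y = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
    return Y                                      # rows are the coordinate vectors y_i
```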

As with PCA or MDS, the true dimensionality of the data can be estimated from the decrease in error as the dimensionality of Y is increased. For the Swiss roll, where classical methods fail, the residual variance of Isomap correctly bottoms out at d = 2 (Fig. 2B).

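One way to produce curves like those in Fig. 2 is sketched below, assuming that residual variance is measured as 1 − R², where R is the correlation between the estimated manifold distances (the graph distances DG for Isomap) and the pairwise Euclidean distances of the d-dimensional embedding; that definition, and the reuse of the isomap_embedding routine sketched above, are assumptions of this sketch.

```python
# Hedged sketch: residual variance of an embedding Y, taken here to be
# 1 - R^2 between the estimated manifold distances and the embedding distances.
import numpy as np
from scipy.spatial.distance import cdist

def residual_variance(DG, Y):
    iu = np.triu_indices_from(DG, k=1)             # count each pair once
    r = np.corrcoef(DG[iu], cdist(Y, Y)[iu])[0, 1]
    return 1.0 - r ** 2

# Sweeping d and looking for the "elbow" estimates the intrinsic dimensionality,
# e.g. with the isomap_embedding sketch above:
#   curve = [residual_variance(DG, isomap_embedding(DG, d)) for d in range(1, 11)]
```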

Just as PCA and MDS are guaranteed, given sufficient data, to recover the true structure of linear manifolds, Isomap is guaranteed asymptotically to recover the true dimensionality and geometric structure of a strictly larger class of nonlinear manifolds. Like the Swiss roll, these are manifolds whose intrinsic geometry is that of a convex region of Euclidean space, but whose ambient geometry in the high-dimensional input space may be highly folded, twisted, or curved. For non-Euclidean manifolds, such as a hemisphere or the surface of a doughnut, Isomap still produces a globally optimal low-dimensional Euclidean representation, as measured by Eq. 1.


These guarantees of asymptotic convergence rest on a proof that as the number of data points increases, the graph distances dG(i,j) provide increasingly better approximations to the intrinsic geodesic distances dM(i,j), becoming arbitrarily accurate in the limit of infinite data. How quickly dG(i,j) converges to dM(i,j) depends on certain parameters of the manifold as it lies within the high-dimensional space (radius of curvature and branch separation) and on the density of points. To the extent that a data set presents extreme values of these parameters or deviates from a uniform density, asymptotic convergence still holds in general, but the sample size required to estimate geodesic distance accurately may be impractically large.


Isomap’s global coordinates provide a simple way to analyze and manipulate high-dimensional observations in terms of their intrinsic nonlinear degrees of freedom. For a set of synthetic face images, known to have three degrees of freedom, Isomap correctly detects the dimensionality (Fig. 2A) and separates out the true underlying factors (Fig. 1A). The algorithm also recovers the known low-dimensional structure of a set of noisy real images, generated by a human hand varying in finger extension and wrist rotation (Fig. 2C). Given a more complex data set of handwritten digits, which does not have a clear manifold geometry, Isomap still finds globally meaningful coordinates (Fig. 1B) and nonlinear structure that PCA or MDS do not detect (Fig. 2D). For all three data sets, the natural appearance of linear interpolations between distant points in the low-dimensional coordinate space confirms that Isomap has captured the data’s perceptually relevant structure (Fig. 4).


Fig. 4. Interpolations along straight lines in the Isomap coordinate space (analogous to the blue line in Fig. 3C) implement perceptually natural but highly nonlinear “morphs” of the corresponding high-dimensional observations by transforming them approximately along geodesic paths (analogous to the solid curve in Fig. 3A). (A) Interpolations in a three-dimensional embedding of face images (Fig. 1A). (B) Interpolations in a four-dimensional embedding of hand images appear as natural hand movements when viewed in quick succession, even though no such motions occurred in the observed data. (C) Interpolations in a six-dimensional embedding of handwritten “2”s (Fig. 1B) preserve continuity not only in the visual features of loop and arch articulation, but also in the implied pen trajectories, which are the true degrees of freedom underlying those appearances.

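A rough sketch of the kind of interpolation described here is given below: it walks along a straight line between two embedding coordinates and, at each step, shows the original observation whose embedding lies closest. Snapping to the nearest data point is a simplification introduced for illustration, since the text does not spell out how intermediate images are synthesized.

```python
# Sketch: trace a straight line in the Isomap coordinate space Y and, at each
# step, pick the nearest original observation (a simplification for display).
import numpy as np

def interpolate_indices(Y, i, j, steps=8):
    """Indices of data points tracing a straight line from y_i to y_j in Y."""
    path = []
    for a in np.linspace(0.0, 1.0, steps):
        target = (1 - a) * Y[i] + a * Y[j]                     # point on the line
        path.append(int(np.argmin(np.linalg.norm(Y - target, axis=1))))
    return path

# Displaying images[path] in quick succession gives the "morph" sequence.
```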

Previous attempts to extend PCA and MDS to nonlinear data sets fall into two broad classes, each of which suffers from limitations overcome by our approach. Local linear techniques are not designed to represent the global structure of a data set within a single coordinate system, as we do in Fig. 1. Nonlinear techniques based on greedy optimization procedures attempt to discover global structure, but lack the crucial algorithmic features that Isomap inherits from PCA and MDS: a noniterative, polynomial time procedure with a guarantee of global optimality; for intrinsically Euclidean manifolds, a guarantee of asymptotic convergence to the true structure; and the ability to discover manifolds of arbitrary dimensionality, rather than requiring a fixed d initialized from the beginning or computational resources that increase exponentially in d.


Here we have demonstrated Isomap’s performance on data sets chosen for their visually compelling structures, but the technique may be applied wherever nonlinear geometry complicates the use of PCA or MDS. Isomap complements, and may be combined with, linear extensions of PCA based on higher order statistics, such as independent component analysis. It may also lead to a better understanding of how the brain comes to represent the dynamic appearance of objects, where psychophysical studies of apparent motion suggest a central role for geodesic transformations on nonlinear manifolds much like those studied here.

