Dimensionality Reduction

weixin_30940783

Original link: http://www.cnblogs.com/makino/p/9626871.html

--Hands-on Machine Learning with Scikit-Learn and TensorFlow -Chapter 8

Introduction

Two main approaches for Dimensionality Reduction

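Projection: In most real-world problems, training instances are not spread out uniformly across all dimensions. Many features are almost constant, while others are highly correlated. As a result, the training instances lie within (or close to) a much lower-dimensional subspace of the high-dimensional space.
Manifold Learning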

PCA (Principal Component Analysis)

2.2 The chosen axis minimizes the mean squared distance between the original dataset and its projection onto that axis.

3. Principal Components: PCA identifies the axis that accounts for the largest amount of variance in the training set.

The unit vector that defines the i-th axis is called the i-th principal component.
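
The two descriptions of the first axis (smallest mean squared distance, largest variance) pick the same axis when the data is centered. A short derivation, assuming $m$ centered instances $\mathbf{x}_i$ and a unit vector $\mathbf{w}$ defining the axis (these symbols are introduced here for illustration and are not from the original notes):

```latex
% Pythagoras for each instance: squared norm = squared projection + squared residual
\[
\|\mathbf{x}_i\|^2
  = (\mathbf{w}^\top \mathbf{x}_i)^2
  + \|\mathbf{x}_i - (\mathbf{w}^\top \mathbf{x}_i)\,\mathbf{w}\|^2
\]

% Averaging over the training set
\[
\frac{1}{m}\sum_{i=1}^{m} \|\mathbf{x}_i\|^2
  = \underbrace{\frac{1}{m}\sum_{i=1}^{m} (\mathbf{w}^\top \mathbf{x}_i)^2}_{\text{variance along } \mathbf{w}}
  + \underbrace{\frac{1}{m}\sum_{i=1}^{m} \|\mathbf{x}_i - (\mathbf{w}^\top \mathbf{x}_i)\,\mathbf{w}\|^2}_{\text{mean squared distance to the axis}}
\]
```

The left-hand side does not depend on $\mathbf{w}$, so choosing $\mathbf{w}$ to maximize the variance term is the same as choosing it to minimize the distance term.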

How do we find the principal components of the training set? With Singular Value Decomposition (SVD).

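PCA assumes that the dataset is centered around the origin. Scikit-Learn's PCA class already takes care of centering the data. If you implement PCA some other way, don't forget to center the data first.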
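
The SVD route, including the centering step, can be sketched with NumPy roughly as follows. The toy dataset and all variable names (`latent`, `mixing`, `W2`, and so on) are assumptions made for illustration and are not part of the original notes.

```python
import numpy as np

# Hypothetical toy data: 200 points in 3D that lie close to a 2D plane.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 2))
mixing = np.array([[2.0, 0.5, 0.3],
                   [0.0, 1.0, 0.2]])
# The constant offset of 5.0 moves the data away from the origin,
# so the centering step below actually matters.
X = latent @ mixing + 0.05 * rng.normal(size=(200, 3)) + 5.0

# Center the data first: PCA assumes the dataset is centered on the origin.
X_centered = X - X.mean(axis=0)

# Rows of Vt are unit vectors, ordered by decreasing singular value.
# They are the principal components of the centered data.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
c1, c2 = Vt[0], Vt[1]

# Project onto the plane spanned by the first two principal components.
W2 = Vt[:2].T
X2D = X_centered @ W2

# Fraction of the total variance carried by each component.
explained_variance_ratio = s**2 / np.sum(s**2)
```

Up to the sign of each component, `X2D` should match what `PCA(n_components=2).fit_transform(X)` returns, since Scikit-Learn performs the centering internally.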

When reducing dimensionality, make sure to preserve as much variance as possible.

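from sklearn.decomposition import PCA

# X is the training set, shape (n_samples, n_features)
pca = PCA(n_components=2)
X2D = pca.fit_transform(X)  # project onto the first two principal components
pca.explained_variance_ratio_  # fraction of variance carried by each component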

Choosing the right number of dimensions
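
One common way to pick the number of dimensions is to keep the smallest number of components whose cumulative explained variance reaches a chosen target. The sketch below assumes a 95% target and a synthetic dataset. Both are illustrative choices, not values taken from the original notes.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 500 samples, 10 features, driven by 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(500, 10))

# Fit PCA without reducing dimensionality, then accumulate the variance ratios.
pca = PCA(svd_solver="full")
pca.fit(X)
cumsum = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the target.
threshold = 0.95  # assumed target; adjust to the task at hand
d = int(np.argmax(cumsum >= threshold)) + 1

# Shortcut: a float n_components in (0, 1) is interpreted as a variance target.
pca_95 = PCA(n_components=threshold, svd_solver="full")
X_reduced = pca_95.fit_transform(X)
```

After fitting, `pca_95.n_components_` reports how many components were kept, which should agree with `d`.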

Reposted from: https://www.cnblogs.com/makino/p/9626871.html
