Dimensionality Reduction: Principal Component Analysis (PCA)

Latest recommended article published on 2023-06-20 17:45:22

weixin_62077732


Tags: sklearn, machine learning, python

Article link: https://blog.csdn.net/weixin_62077732/article/details/122471218


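PCA (principal component analysis) compresses data: it reduces the number of dimensions as far as possible while losing only a small amount of information.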

How the algorithm works (using a reduction from two dimensions to one as an example):
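The two-dimensions-to-one case can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming the common formulation of PCA as an eigen-decomposition of the covariance matrix. The toy data and the helper name `pca_2d_to_1d` are made up for this example and are not from the original post.

```python
import numpy as np

# Hypothetical toy data: 5 samples, 2 features (illustration only)
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])


def pca_2d_to_1d(X):
    """Project 2D points onto the single direction of largest variance."""
    # 1. Center each feature at zero
    X_centered = X - X.mean(axis=0)
    # 2. Covariance matrix of the features (2 x 2)
    cov = np.cov(X_centered, rowvar=False)
    # 3. Eigen-decomposition; eigh is appropriate for symmetric matrices
    eigvals, eigvecs = np.linalg.eigh(cov)
    # 4. Keep the eigenvector with the largest eigenvalue
    direction = eigvecs[:, np.argmax(eigvals)]
    # 5. Project every centered sample onto that direction
    projected = X_centered @ direction
    return projected, direction, eigvals


projected, direction, eigvals = pca_2d_to_1d(X)
print(projected)                      # one coordinate per sample
print(eigvals.max() / eigvals.sum())  # share of variance that is kept
```

The last printed value is the fraction of the total variance that survives the projection, which is one way to put a number on the "small amount of information" that is lost.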

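The underlying computation relies on a fair amount of linear algebra. Fortunately, sklearn provides an API that handles it; the following shows how to use that API.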

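from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


# Load the iris dataset (150 samples, 4 features)
iris = load_iris()
data = iris.data

# Note: if n_components is an integer, it is the number of dimensions to reduce to.
# If it is a float between 0 and 1, it is the fraction of variance to retain.
trans = PCA(n_components=2)
stand = StandardScaler()
# Standardize the data (zero mean, unit variance per feature)
data = stand.fit_transform(data)
# Project the standardized data onto the first two principal components
new_data = trans.fit_transform(data)
print(new_data)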

[[-2.26470281  0.4800266 ]
 [-2.08096115 -0.67413356]
 [-2.36422905 -0.34190802]
 [-2.29938422 -0.59739451]
 [-2.38984217  0.64683538]
 [-2.07563095  1.48917752]
 [-2.44402884  0.0476442 ]
 [-2.23284716  0.22314807]
 [-2.33464048 -1.11532768]
 [-2.18432817 -0.46901356]
 [-2.1663101   1.04369065]
 [-2.32613087  0.13307834]
 [-2.2184509  -0.72867617]
 [-2.6331007  -0.96150673]
 [-2.1987406   1.86005711]
 [-2.26221453  2.68628449]
 [-2.2075877   1.48360936]
 [-2.19034951  0.48883832]
 [-1.898572    1.40501879]
 [-2.34336905  1.12784938]
 [-1.914323    0.40885571]
 [-2.20701284  0.92412143]
 [-2.7743447   0.45834367]
 [-1.81866953  0.08555853]
 [-2.22716331  0.13725446]
 [-1.95184633 -0.62561859]
 ......

As the output shows, every sample has been reduced from four features to two dimensions.
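To check how much information the two components keep, and to try the float form of `n_components` mentioned in the code comment, something like the following could be used. It is a sketch rather than part of the original post: the 0.95 threshold is an arbitrary choice for illustration, and `svd_solver="full"` is set explicitly as a cautious assumption so that the fractional form is accepted.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Same preprocessing as above: standardize the four iris features
data = StandardScaler().fit_transform(load_iris().data)

# Integer n_components: keep exactly two components
pca_2 = PCA(n_components=2).fit(data)
# Fraction of the total variance explained by each kept component
print(pca_2.explained_variance_ratio_)
print(pca_2.explained_variance_ratio_.sum())

# Float n_components: keep as many components as are needed to
# retain at least this fraction of the variance (0.95 is arbitrary)
pca_95 = PCA(n_components=0.95, svd_solver="full")
reduced = pca_95.fit_transform(data)
print(pca_95.n_components_, reduced.shape)
```

`explained_variance_ratio_` reports the share of variance carried by each component, and `n_components_` reports how many components the fractional setting ended up keeping.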