Principal Component Analysis (PCA)

susan-wang

Categories: Machine Learning, Statistics

Article link: https://blog.csdn.net/susan_wang1/article/details/54938701

Why principal component analysis?

Linear regression

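$y=X\beta+\epsilon$
Assume the predictors $X=\{X_1,\ldots,X_p\}$ are mutually independent and $\epsilon\sim N(0,\sigma^2)$.
In practice, $\{X_1,\ldots,X_p\}$ are not necessarily independent and may be linearly correlated (multicollinearity), which violates this basic assumption of the linear model.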
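To make the problem concrete, here is a minimal sketch using NumPy and synthetic data (the variable names and the noise level 0.01 are illustrative assumptions, not from the original post). It compares the condition number of $X^TX$ for two independent predictors against two nearly collinear ones. A large condition number suggests that the least-squares estimate of $\beta$ is numerically unstable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)              # independent of x1
x2_corr = x1 + 0.01 * rng.normal(size=n)   # almost a copy of x1

X_indep = np.column_stack([x1, x2_indep])
X_corr = np.column_stack([x1, x2_corr])

# Condition number of X^T X: ratio of its largest to smallest eigenvalue.
cond_indep = np.linalg.cond(X_indep.T @ X_indep)
cond_corr = np.linalg.cond(X_corr.T @ X_corr)

print(f"independent predictors: cond = {cond_indep:.2f}")
print(f"correlated predictors:  cond = {cond_corr:.2f}")
```

With the correlated design, the condition number should come out several orders of magnitude larger, which is the practical symptom that motivates replacing the original predictors with uncorrelated ones.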

What is principal component analysis?

Put simply, PCA forms linear combinations of $X=\{X_1,\ldots,X_p\}$ to obtain uncorrelated variables $X_c=\{X_{c1},\ldots,X_{cq}\}$, $q\leq p$.
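One common way to compute such linear combinations is an eigendecomposition of the sample covariance matrix. The sketch below assumes NumPy. The function name `pca` and the synthetic data are hypothetical, chosen only for illustration. It projects $p=3$ correlated variables onto $q=2$ components and checks that the resulting columns are uncorrelated.

```python
import numpy as np

def pca(X, q):
    """Project X (n samples x p variables) onto its first q principal components."""
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :q]                       # p x q matrix of combination weights
    return Xc @ W, W, eigvals[:q]

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
# Three correlated variables built from two independent sources.
X = np.column_stack([z[:, 0], z[:, 0] + 0.1 * z[:, 1], z[:, 1]])

scores, W, variances = pca(X, q=2)
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))   # off-diagonal entries should be ~0
```

Each column of `W` holds the weights of one linear combination, and the diagonal of `score_cov` equals the corresponding eigenvalues.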

Wikipedia's definition:

Principal component analysis (PCA) uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
The number of principal components is less than or equal to the number of original variables.
The first principal component has the largest possible variance, and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components.
The resulting vectors are an uncorrelated orthogonal basis set.
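The properties in this definition can be checked numerically. The following sketch assumes NumPy and uses the singular value decomposition of the centered data, an alternative route to the same components. The mixing matrix `A` and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical mixing matrix that makes the three variables correlated.
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.3, 0.2]])
X = rng.normal(size=(300, 3)) @ A
Xc = X - X.mean(axis=0)

# Rows of Vt are the principal directions; singular values come sorted descending.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt
variances = s**2 / (len(Xc) - 1)   # variance captured by each component
scores = Xc @ Vt.T                 # data expressed in the new basis

is_sorted = np.all(np.diff(variances) <= 0)
is_orthonormal = np.allclose(components @ components.T, np.eye(3))
is_uncorrelated = np.allclose(np.cov(scores, rowvar=False), np.diag(variances))
print(is_sorted, is_orthonormal, is_uncorrelated)
```

The three flags correspond to the statements above: variances are non-increasing, the directions form an orthogonal basis, and the component scores are uncorrelated.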

Illustration:

[Figure: PCA]