Machine Learning Notes ---- Principal Component Analysis

VampireWeekend

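Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. Reposts must include a link to the original source and this notice.

Article link: https://blog.csdn.net/sinat_35406909/article/details/82319095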

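Summary (auto-generated by CSDN): PCA (principal component analysis) finds a direction such that projecting all data points onto the corresponding line minimizes the projection error. Data preprocessing consists of feature scaling and mean normalization. The PCA algorithm reduces dimensionality by taking the first k columns of U, giving $Z=U_r^T X$. To choose the reduced dimension, check whether the proportion of variance retained is at least 99%. PCA can speed up supervised learning, but it should be run on the training set, and the resulting mapping then applied to the other data sets.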

1. Task of PCA

Find a direction and project all points onto the line along that direction, choosing the direction that minimizes the projection error.
Projection error: the sum of squared orthogonal distances between the points and the line.
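To make the objective concrete, here is a minimal sketch that measures the projection error of a candidate direction. The function name `projection_error`, the toy data, and the convention that examples are stored as columns of `X` are assumptions for illustration. The error is taken as the sum of squared orthogonal distances, which is the usual PCA objective.

```python
import numpy as np

def projection_error(X, u):
    """Sum of squared distances from the columns of X to the line spanned by u.

    X has shape (n_features, m_examples) and is assumed to be mean-normalized.
    """
    u = u / np.linalg.norm(u)           # unit direction
    projected = np.outer(u, u @ X)      # projection of every point onto the line
    residual = X - projected            # what the projection loses
    return float(np.sum(residual ** 2))

# Toy data lying close to the diagonal y = x (both rows have zero mean).
X = np.array([[-2.0, -1.0, 0.0, 1.0, 2.0],
              [-1.9, -1.1, 0.1, 0.9, 2.0]])

err_diag = projection_error(X, np.array([1.0, 1.0]))  # along the data
err_axis = projection_error(X, np.array([1.0, 0.0]))  # along the x-axis
```

The diagonal direction gives a much smaller error than the x-axis here, which is the kind of direction PCA looks for.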

2. Data Preprocessing

Feature Scaling + Mean Normalization
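The preprocessing step might look like the sketch below. It assumes examples are stored as columns of `X` and uses the standard deviation as the scale; the notes do not say which scale to use, so that choice, the helper name `preprocess`, and the toy data are illustrative.

```python
import numpy as np

def preprocess(X):
    """Mean-normalize and scale each feature (row) of X, shape (n_features, m_examples)."""
    mu = X.mean(axis=1, keepdims=True)      # per-feature mean
    sigma = X.std(axis=1, keepdims=True)    # per-feature scale (assumed: std)
    sigma[sigma == 0] = 1.0                 # leave constant features unscaled
    return (X - mu) / sigma, mu, sigma

# Two features on very different scales.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [100.0, 220.0, 290.0, 410.0]])
X_norm, mu, sigma = preprocess(X)
```

Returning `mu` and `sigma` lets the same transformation be reused later on other data.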

3. PCA Algorithm

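Compute the singular value decomposition of the covariance matrix of the preprocessed data, $\Sigma = \frac{1}{m} X X^T = U S V^T$ (with the $m$ examples stored as columns of $X$). Take the first $k$ columns of $U$ and denote them as $U_r$; the reduced representation is $Z=U_r^T X$.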
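A minimal NumPy sketch of this step follows. It assumes that $U$ comes from the SVD of the covariance matrix and that examples are the columns of `X`. The function name `pca_project` and the random data are hypothetical.

```python
import numpy as np

def pca_project(X, k):
    """Project mean-normalized X (n_features, m_examples) onto k components.

    Returns U_r (n, k), the singular values S (n,), and Z (k, m).
    """
    m = X.shape[1]
    Sigma = (X @ X.T) / m               # covariance matrix, shape (n, n)
    U, S, _ = np.linalg.svd(Sigma)      # columns of U: principal directions
    U_r = U[:, :k]                      # first k columns of U
    Z = U_r.T @ X                       # Z = U_r^T X
    return U_r, S, Z

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
X = X - X.mean(axis=1, keepdims=True)   # mean normalization
U_r, S, Z = pca_project(X, k=2)
```

`np.linalg.svd` returns the singular values in descending order, so the first `k` columns of `U` correspond to the directions with the largest variance.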

4. Reconstruction from PCA

$X_{approx}=U_r Z$
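As an illustration, the sketch below compresses toy 2-D data to one dimension and reconstructs it. The data are chosen (as an assumption for the example) to lie exactly on a line, so the reconstruction loses nothing; in general $X_{approx}$ only approximates $X$.

```python
import numpy as np

# Points exactly on the line y = 2x, stored as columns; each row has zero mean.
t = np.linspace(-1.0, 1.0, 5)
X = np.vstack([t, 2.0 * t])

U, S, _ = np.linalg.svd(X @ X.T / X.shape[1])
U_r = U[:, :1]          # keep k = 1 direction

Z = U_r.T @ X           # compress: shape (1, 5)
X_approx = U_r @ Z      # reconstruct: shape (2, 5)
```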

5. How to Choose the Reduced Dimension

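Using $S=diag(s_1, \dots, s_n)$, choose the smallest $k$ for which

$1- \frac{\sum_{i=1}^k s_i}{\sum_{i=1}^n s_i} \le 0.01$

that is, for which at least 99% of the variance is retained. Because $S$ only needs to be computed once, checking the candidate values of $k$ with a running sum is an $O(n)$ algorithm.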
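The check can be sketched with a cumulative sum over the diagonal of $S$. The helper name `choose_k`, its `max_loss` parameter, and the example singular values are made up for illustration.

```python
import numpy as np

def choose_k(s, max_loss=0.01):
    """Smallest k with 1 - sum(s[:k]) / sum(s) <= max_loss.

    s holds the diagonal of S in descending order.
    """
    s = np.asarray(s, dtype=float)
    retained = np.cumsum(s) / s.sum()      # variance retained for k = 1..n
    ok = (1.0 - retained) <= max_loss      # True where the criterion holds
    return int(np.argmax(ok)) + 1          # first True, as a 1-based k

s = [9.0, 0.95, 0.04, 0.01]
k = choose_k(s)
```

With these values, one component retains 90% of the variance and two retain 99.5%, so `k` is 2.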

6. Speed Up Supervised Learning by PCA

Train the model on the data compressed by PCA.
Note: the PCA mapping must be computed from the TRAINING SET only.
The same mapping can then be applied to the other sets (for example, the cross-validation and test sets).

Only use PCA when working with the original data performs badly on your system! Try the original data first.
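One way to keep the mapping tied to the training set is to split fitting from applying, as in this sketch. The names `fit_pca` and `apply_pca` and the random data are hypothetical, and feature scaling is left out for brevity. Examples are assumed to be stored as columns.

```python
import numpy as np

def fit_pca(X_train, k):
    """Learn the mapping (mean and U_r) from the training set only."""
    mu = X_train.mean(axis=1, keepdims=True)
    Xc = X_train - mu
    U, _, _ = np.linalg.svd(Xc @ Xc.T / Xc.shape[1])
    return mu, U[:, :k]

def apply_pca(X, mu, U_r):
    """Apply a previously fitted mapping to any data set."""
    return U_r.T @ (X - mu)

rng = np.random.default_rng(1)
X_train = rng.normal(size=(5, 200))
X_test = rng.normal(size=(5, 40))

mu, U_r = fit_pca(X_train, k=3)          # uses training data only
Z_train = apply_pca(X_train, mu, U_r)    # features for training the model
Z_test = apply_pca(X_test, mu, U_r)      # same mapping, reused unchanged
```

The test set never influences `mu` or `U_r`; it is only transformed with the values learned from the training set.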
