Machine Learning Notes ---- Principal Component Analysis


1. Task of PCA

Find a direction (a line) onto which to project all points so that the projection error is minimized.
Projection error: the sum of squared distances between the points and the line.

2. Data Preprocessing

Feature Scaling + Mean Normalization
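
As a rough illustration of the two preprocessing steps, here is a minimal NumPy sketch. It assumes the data matrix `X` stores one feature per row and one example per column, and that scaling means dividing by the standard deviation (dividing by the range would also work). The helper name `preprocess` is made up for this example.

```python
import numpy as np

def preprocess(X):
    """Mean-normalize and scale each feature (row) of X.

    X has shape (n_features, m_examples); this layout is an assumption.
    """
    mu = X.mean(axis=1, keepdims=True)      # per-feature mean
    sigma = X.std(axis=1, keepdims=True)    # per-feature standard deviation
    sigma[sigma == 0] = 1.0                 # leave constant features unscaled
    return (X - mu) / sigma, mu, sigma

# Two features on very different scales
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [100.0, 220.0, 310.0, 390.0]])
X_norm, mu, sigma = preprocess(X)
```

After this step every feature has zero mean and unit standard deviation, so no single feature dominates the projection only because of its units.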

3. PCA Algorithm



Take the first k column vectors of $U$ and denote the resulting matrix as $U_r$; the result is $Z = U_r^T X$
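
The projection step can be sketched in NumPy as follows. The notes do not spell out how $U$ is obtained, so this sketch assumes the usual route: form the covariance matrix of the mean-normalized data and take its SVD. It also assumes `X` has shape (n features, m examples), which is the layout that makes $Z = U_r^T X$ well defined. The function name `pca_project` is hypothetical.

```python
import numpy as np

def pca_project(X, k):
    """Project mean-normalized X (n_features, m_examples) onto k components."""
    n, m = X.shape
    Sigma = (X @ X.T) / m               # assumed: covariance matrix, shape (n, n)
    U, s, _ = np.linalg.svd(Sigma)      # columns of U are the principal directions
    U_r = U[:, :k]                      # first k columns of U
    Z = U_r.T @ X                       # compressed data, shape (k, m)
    return Z, U_r, s

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
X = X - X.mean(axis=1, keepdims=True)   # mean normalization
Z, U_r, s = pca_project(X, k=2)
```

Here `s` holds the diagonal of the matrix returned by the SVD, sorted from largest to smallest.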

4. Reconstruction from PCA


$X_{approx} = U_r Z$
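
A small sketch of the reconstruction, under the same assumptions as the rest of these notes (columns of `X` are examples, `U` comes from an SVD of the covariance matrix, which is assumed here). The toy data lies exactly on a line, so a single component should reconstruct it without loss. The name `reconstruct` is made up for the example.

```python
import numpy as np

def reconstruct(Z, U_r):
    """Map compressed data Z (k, m) back to the original space (n, m)."""
    return U_r @ Z

# Toy data on the line x2 = 2 * x1, already zero-mean
t = np.linspace(-1.0, 1.0, 5)
X = np.vstack([t, 2.0 * t])             # shape (2, 5)

Sigma = (X @ X.T) / X.shape[1]          # assumed covariance step
U, s, _ = np.linalg.svd(Sigma)
U_r = U[:, :1]                          # keep k = 1 component
Z = U_r.T @ X                           # compress
X_approx = reconstruct(Z, U_r)          # decompress
```

In general `X_approx` only approximates `X`; the difference is the projection error that PCA minimizes.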

5. How to Choose the Reduced Dimension



Using $S = \mathrm{diag}(s_1, \dots, s_n)$, check whether

$$1 - \frac{\sum_{i=1}^{k} s_i}{\sum_{i=1}^{n} s_i} \le 0.01$$

This is an $O(n)$ algorithm.
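
One way to run this check for every k in a single pass is a cumulative sum, sketched below. The input `s` is assumed to be the 1-D array of diagonal entries of $S$ in decreasing order; the function name `choose_k` and the sample values are invented for illustration.

```python
import numpy as np

def choose_k(s, max_loss=0.01):
    """Return the smallest k with 1 - sum(s[:k]) / sum(s) <= max_loss."""
    retained = np.cumsum(s) / np.sum(s)     # retained fraction for k = 1..n
    ok = (1.0 - retained) <= max_loss       # which k satisfy the check
    return int(np.argmax(ok)) + 1           # first True, converted to 1-based k

s = np.array([8.0, 1.5, 0.45, 0.05])        # illustrative values only
k = choose_k(s)                             # 99% of the variance retained
k_loose = choose_k(s, max_loss=0.1)         # 90% of the variance retained
```

With these sample values the losses for k = 1, 2, 3, 4 are about 0.2, 0.05, 0.005 and 0, so the 1% threshold gives k = 3 and the 10% threshold gives k = 2.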

6. Speed Up Supervised Learning by PCA

Train the model on the data compressed by PCA.
Note: when training, fit PCA on the TRAINING SET only!
The resulting mapping can then be applied to the other sets.
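
A minimal sketch of this fit-then-apply pattern, assuming columns are examples and that the mean used for normalization is also learned from the training set. The names `fit_pca` and `apply_pca` are hypothetical, and the random data only stands in for real training and test sets.

```python
import numpy as np

def fit_pca(X_train, k):
    """Learn the mean and the first k principal directions from training data."""
    mu = X_train.mean(axis=1, keepdims=True)
    Xc = X_train - mu
    Sigma = (Xc @ Xc.T) / Xc.shape[1]       # assumed covariance step
    U, _, _ = np.linalg.svd(Sigma)
    return mu, U[:, :k]

def apply_pca(X, mu, U_r):
    """Apply a previously fitted mapping to any data set."""
    return U_r.T @ (X - mu)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 80))
X_test = rng.normal(size=(5, 20))

mu, U_r = fit_pca(X_train, k=2)             # fitted on the training set only
Z_train = apply_pca(X_train, mu, U_r)       # used to train the model
Z_test = apply_pca(X_test, mu, U_r)         # same mapping reused, no refitting
```

The test set never influences `mu` or `U_r`, so the evaluation stays an honest estimate of performance on unseen data.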

Only use PCA when working with the original data performs badly on your system; try the original data first!