Chapter 9 Dimensionality Reduction
1 Motivation
- Data Compression
- Visualization
2 Principal Component Analysis (PCA) Problem Formulation
2.1 Difference Between PCA and Linear Regression

PCA | Linear Regression
---|---
Minimizes the projection error | Minimizes the prediction error
Makes no prediction about the result | Predicts the result
2.2 Description
- Reduce from n-dimensions to k-dimensions: find $k$ vectors $u^{(1)},u^{(2)},\cdots,u^{(k)}$ (through the origin?) onto which to project the data, so as to minimize the projection error.
- Reduces the dimensionality while keeping the loss of information in the data as small as possible.
- Completely parameter-free: the PCA computation needs no manually set parameters and no intervention based on an empirical model; the result depends only on the data and is independent of the user.
2.3 Algorithm
2.3.1 Data Preprocessing
- Training set: $x^{(1)},x^{(2)},\cdots,x^{(m)}$
- Preprocessing (feature scaling / mean normalization):
$$\mu_j=\frac{1}{m}\sum_{i=1}^m x_j^{(i)}$$
Replace each $x_j^{(i)}$ with $x_j^{(i)}-\mu_j$.
If different features are on different scales, scale the features to have comparable ranges of values.
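The preprocessing step above can be sketched in Python/NumPy (the notes use Octave-style `svd` later, so NumPy here is an assumption, not the course's code; the toy matrix `X` is illustrative):

```python
import numpy as np

# Toy training set: m = 4 examples, n = 2 features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0],
              [4.0, 400.0]])

mu = X.mean(axis=0)         # mu_j = (1/m) * sum_i x_j^(i)
scale = X.std(axis=0)       # per-feature scale (one common choice)
X_norm = (X - mu) / scale   # mean normalization + feature scaling

# After preprocessing, every feature has zero mean and unit scale.
print(X_norm.mean(axis=0))
```

Dividing by the standard deviation is one common scaling choice; dividing by the feature's range (max minus min) also works.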
2.3.2 Compute the Covariance Matrix $\Sigma$
- $$\Sigma=\frac{1}{m}\sum_{i=1}^m\left(x^{(i)}\right)\left(x^{(i)}\right)^T$$
where $x^{(i)}$ is an $n\times1$ vector, so $\Sigma$ is $n\times n$.
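With the examples stacked as the rows of an $m\times n$ matrix, the sum above collapses into a single matrix product. A minimal NumPy sketch (the name `X_norm` and the toy data are assumptions for illustration):

```python
import numpy as np

# Already mean-normalized toy data: m = 4 examples, n = 2 features.
X_norm = np.array([[ 1.0, -1.0],
                   [-1.0,  1.0],
                   [ 2.0, -2.0],
                   [-2.0,  2.0]])
m = X_norm.shape[0]

# Sigma = (1/m) * sum_i x^(i) (x^(i))^T  ==  (1/m) * X^T X   (an n x n matrix)
Sigma = (X_norm.T @ X_norm) / m
print(Sigma.shape)
```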
2.3.3 Compute the Eigenvectors of the Matrix $\Sigma$
[U,S,V]=svd(Sigma);
- $U$ is a matrix whose columns are the direction vectors giving the smallest projection error onto the data:
$$U=\left[\begin{matrix} |&|&&|&&|\\ u^{(1)}&u^{(2)}&\cdots&u^{(k)}&\cdots&u^{(n)}\\ |&|&&|&&| \end{matrix}\right]\in\mathbb{R}^{n\times n}$$
- Take the first $k$ columns of $U$ to obtain the $n\times k$ matrix $U_{reduce}$.
- Obtain the new $k\times1$ feature vectors:
$$z^{(i)}={U_{reduce}}^T x^{(i)}$$
Note: the variance features are not processed.
2.3.4 Reconstruction from the Compressed Representation
- Approximately recover the original features: $x_{approx}^{(i)}=U_{reduce}z^{(i)}\approx x^{(i)}$
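Steps 2.3.3 and 2.3.4 together can be sketched in NumPy (`np.linalg.svd` plays the role of the notes' Octave `[U,S,V]=svd(Sigma)`; the nearly one-dimensional toy data is an assumption so the reconstruction is visibly close):

```python
import numpy as np

# Mean-normalized toy data lying almost along one direction.
X_norm = np.array([[ 1.0,  1.1],
                   [-1.0, -1.1],
                   [ 2.0,  1.9],
                   [-2.0, -1.9]])
m, n = X_norm.shape
Sigma = (X_norm.T @ X_norm) / m

U, s, Vt = np.linalg.svd(Sigma)   # NumPy analogue of [U,S,V]=svd(Sigma)
k = 1
U_reduce = U[:, :k]               # first k columns of U   (n x k)

Z = X_norm @ U_reduce             # rows are z^(i) = U_reduce^T x^(i)
X_approx = Z @ U_reduce.T         # rows are x_approx^(i) = U_reduce z^(i)

# The data is almost 1-D, so the reconstruction error is tiny.
print(np.abs(X_approx - X_norm).max())
```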
2.3.5 Choosing the Number of Principal Components
Average squared projection error: $\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}-x_{approx}^{(i)}\right\|^2$
Total variation in the data: $\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}\right\|^2$
- Goal: choose the smallest $k$ such that the ratio of the average squared projection error to the total variation is small:
$$\frac{\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}-x_{approx}^{(i)}\right\|^2}{\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}\right\|^2}\le 0.01,$$
which means 99% of the variance is retained.
Way(1)
Try PCA with $k=1,2,\cdots$
Compute $U_{reduce},z^{(1)},z^{(2)},\cdots,z^{(m)},x_{approx}^{(1)},\cdots,x_{approx}^{(m)}$
Check whether
$$\frac{\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}-x_{approx}^{(i)}\right\|^2}{\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}\right\|^2}\le 0.01$$
Way(2)
[U,S,V]=svd(Sigma)
- $S$ is an $n\times n$ diagonal matrix:
$$S=\left[\begin{matrix} s_{11}&0&0&\cdots&0\\ 0&s_{22}&0&\cdots&0\\ 0&0&s_{33}&\cdots&0\\ &&&\ddots&\\ 0&0&0&\cdots&s_{nn} \end{matrix}\right]$$
- $$\frac{\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}-x_{approx}^{(i)}\right\|^2}{\frac{1}{m}\sum_{i=1}^m\left\|x^{(i)}\right\|^2}=1-\frac{\sum_{i=1}^k s_{ii}}{\sum_{i=1}^n s_{ii}}\le 0.01$$
i.e.
$$\frac{\sum_{i=1}^k s_{ii}}{\sum_{i=1}^n s_{ii}}\ge 0.99$$
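Way(2) needs only one `svd` call. A NumPy sketch of picking the smallest $k$ that retains 99% of the variance (the synthetic data, with two strong directions plus tiny noise, is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 100 examples in 5 dimensions, but only 2 strong directions.
B = rng.standard_normal((5, 2))
X = rng.standard_normal((100, 2)) @ B.T + 0.01 * rng.standard_normal((100, 5))
X = X - X.mean(axis=0)

m = X.shape[0]
Sigma = (X.T @ X) / m
U, s, Vt = np.linalg.svd(Sigma)   # s holds s_11 >= s_22 >= ... >= s_nn

# Smallest k with (sum_{i<=k} s_ii) / (sum_i s_ii) >= 0.99
ratios = np.cumsum(s) / np.sum(s)
k = int(np.searchsorted(ratios, 0.99) + 1)
print(k)
```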
3 Advice for Applying PCA
- Mapping $x^{(i)}\to z^{(i)}$ should be defined by running PCA only on the training set. The same mapping can then be applied to $x_{cv}^{(i)}$ and $x_{test}^{(i)}$ in the cross-validation and test sets.
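The "fit on the training set only" rule can be sketched as below; the helper names `fit_pca` and `apply_pca` are hypothetical, not from the course:

```python
import numpy as np

def fit_pca(X_train, k):
    """Learn mu and U_reduce from the training set only."""
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    Sigma = (Xc.T @ Xc) / Xc.shape[0]
    U, s, Vt = np.linalg.svd(Sigma)
    return mu, U[:, :k]

def apply_pca(X, mu, U_reduce):
    """Apply the *training-set* mapping to any split (train / cv / test)."""
    return (X - mu) @ U_reduce

rng = np.random.default_rng(1)
X_train = rng.standard_normal((50, 3))
X_test = rng.standard_normal((10, 3))

mu, U_reduce = fit_pca(X_train, k=2)
Z_train = apply_pca(X_train, mu, U_reduce)   # uses training-set mu, U_reduce
Z_test = apply_pca(X_test, mu, U_reduce)     # same mapping, no refitting
```

Refitting PCA on the test set would leak information and make the splits incomparable, which is why `mu` and `U_reduce` are reused as-is.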
- Advantages:
(1) Compression:
① Reduce the memory / disk space needed to store the data
② Speed up the learning algorithm
(2) Visualization
- Do not use PCA to address overfitting; use regularization instead.
- It is best to start with all of the original features, since PCA loses some information when reducing dimensionality; use PCA only when necessary.
4 References
Andrew Ng, Machine Learning (Coursera)
Huang Haiguang (黄海广), Machine Learning Notes