Stanford ML - Lecture 10 - Dimensionality Reduction

1. Motivation I: Data Compression

  • Reduce data from 2D to 1D

2. Motivation II: Data Visualization

3. Principal Component Analysis problem formulation

  1. reduce from 2D to 1D: find a direction onto which to project the data so as to minimize the projection error
  2. reduce from n-D to k-D: find k vectors onto which to project the data so as to minimize the projection error
  • PCA is not linear regression

4. Principal Component Analysis algorithm

  • data preprocessing

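training set: $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$

preprocessing (feature scaling/mean normalization):

$\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)}$, then replace each $x_j^{(i)}$ with $x_j^{(i)} - \mu_j$

if different features are on different scales, scale the features to have a comparable range of values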
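The preprocessing step can be sketched in NumPy as follows. This is a minimal illustration, not code from the lecture: the function name `normalize_features`, the choice of the standard deviation as the scaling factor, and the toy data are all assumptions.

```python
import numpy as np

def normalize_features(X):
    """Mean-normalize each column of X and scale it to unit standard deviation.

    X is an (m, n) array: m examples, n features.
    Returns the normalized data plus the per-feature mean and scale,
    so the same transform can be reused on new examples.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)        # mu_j = (1/m) * sum_i x_j^(i)
    sigma = X.std(axis=0)      # per-feature standard deviation (assumed scale)
    sigma[sigma == 0] = 1.0    # leave constant features unscaled
    return (X - mu) / sigma, mu, sigma

# Toy data (made up): two features on very different scales.
X = np.array([[1.0, 1000.0],
              [2.0, 3000.0],
              [3.0, 2000.0]])
X_norm, mu, sigma = normalize_features(X)
```

Returning `mu` and `sigma` alongside the normalized data makes it possible to apply exactly the same shift and scale to examples seen later.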

  • PCA algorithm - reduce data from n-D to k-D

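compute covariance matrix

$\Sigma = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} (x^{(i)})^T$

compute eigenvectors of the covariance matrix $\Sigma$ (for example via singular value decomposition), keep the first k of them as the columns of $U_{reduce}$, and project each example: $z = U_{reduce}^T x$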
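A minimal NumPy sketch of these steps is shown below, assuming the data has already been mean-normalized. The function name `pca`, the variable names, and the toy data are illustrative assumptions; `np.linalg.svd` is used here to obtain the eigenvectors of the symmetric covariance matrix.

```python
import numpy as np

def pca(X, k):
    """Reduce mean-normalized data X of shape (m, n) to k dimensions.

    Returns the projected data Z (m, k), the projection matrix
    U_reduce (n, k), and the singular values S of the covariance matrix.
    """
    m = X.shape[0]
    Sigma = (X.T @ X) / m              # (n, n) covariance matrix
    U, S, _ = np.linalg.svd(Sigma)     # columns of U are the principal directions
    U_reduce = U[:, :k]                # keep the first k directions
    Z = X @ U_reduce                   # row i is z^(i) = U_reduce^T x^(i)
    return Z, U_reduce, S

# Toy data (made up): points lying exactly on the line x2 = x1.
X = np.array([[-2.0, -2.0], [-1.0, -1.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
Z, U_reduce, S = pca(X, k=1)
```

Because these points lie on a single line, the second singular value is zero and one component captures the data without loss. The sign of each principal direction is arbitrary, so `U_reduce` may come out negated.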

5. Reconstruction from compressed representation
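Reconstruction maps a compressed point z back to the original space as $x_{approx} = U_{reduce} z$. The sketch below illustrates this under assumed names (`project`, `reconstruct`) with a hand-picked unit direction instead of one computed from data.

```python
import numpy as np

def project(X, U_reduce):
    """Compress rows of X (m, n) to k dimensions: z = U_reduce^T x."""
    return X @ U_reduce

def reconstruct(Z, U_reduce):
    """Map compressed rows of Z (m, k) back to n dimensions: x_approx = U_reduce z."""
    return Z @ U_reduce.T

# Hypothetical projection direction for 2D -> 1D: the unit vector along x2 = x1.
U_reduce = np.array([[1.0], [1.0]]) / np.sqrt(2.0)

X = np.array([[1.0, 1.0],    # already on the line, so it is recovered exactly
              [2.0, 0.0]])   # off the line, so it is only approximated
Z = project(X, U_reduce)
X_approx = reconstruct(Z, U_reduce)
```

Both points reconstruct to (1, 1): the first lies on the projection line, while the second loses its component orthogonal to that line.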


6. Choosing the number of principal components

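  • average squared projection error: $\frac{1}{m} \sum_{i=1}^{m} \| x^{(i)} - x_{approx}^{(i)} \|^2$

  • total variation in the data: $\frac{1}{m} \sum_{i=1}^{m} \| x^{(i)} \|^2$

  • choose k to be the smallest value so that the ratio of the average squared projection error to the total variation is at most a small threshold, e.g. 0.01 (99% of the variance is retained)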
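One way to pick k without recomputing the projection for every candidate is to use the singular values of the covariance matrix: the fraction of variance retained by the first k components is the sum of the first k singular values divided by the sum of all of them. The sketch below assumes mean-normalized data with nonzero total variance; the function name `choose_k` and the toy data are illustrative.

```python
import numpy as np

def choose_k(X, variance_to_retain=0.99):
    """Return the smallest k whose first k components retain the requested
    fraction of variance, plus the cumulative retained fractions.

    X is assumed to be mean-normalized, with shape (m, n).
    """
    m = X.shape[0]
    Sigma = (X.T @ X) / m
    S = np.linalg.svd(Sigma, compute_uv=False)   # singular values, descending
    retained = np.cumsum(S) / np.sum(S)          # retained[k-1] covers k components
    k = int(np.argmax(retained >= variance_to_retain)) + 1
    return k, retained

# Toy data (made up): almost all variance is in the first feature.
X = np.array([[ 3.0,  0.1, 0.0],
              [-3.0, -0.1, 0.0],
              [ 3.0, -0.1, 0.0],
              [-3.0,  0.1, 0.0]])
k_99, retained = choose_k(X, 0.99)
k_9999, _ = choose_k(X, 0.9999)
```

Retaining a fraction of 0.99 corresponds to a ratio of projection error to total variation of at most 0.01.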

7. Advice for applying PCA

  • supervised learning speedup
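
labeled training set: $(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})$

extract the unlabeled inputs $x^{(1)}, \ldots, x^{(m)}$ and run PCA to obtain $z^{(1)}, \ldots, z^{(m)}$, giving the new training set $(z^{(1)}, y^{(1)}), \ldots, (z^{(m)}, y^{(m)})$

the mapping $x \to z$ should be defined by running PCA on the training set only; the same mapping can then be applied to examples in the cross validation and test sets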
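The following sketch illustrates fitting the mapping on the training set and reusing it unchanged on held-out data. The helper names `fit_pca` and `apply_pca` and the random data are assumptions made for illustration.

```python
import numpy as np

def fit_pca(X_train, k):
    """Learn the PCA mapping (mean and projection matrix) from training data only."""
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    Sigma = (Xc.T @ Xc) / Xc.shape[0]
    U, _, _ = np.linalg.svd(Sigma)
    return mu, U[:, :k]

def apply_pca(X, mu, U_reduce):
    """Apply a previously learned mapping to any set of examples."""
    return (X - mu) @ U_reduce

# Random stand-in data (made up): 5 features reduced to 2.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
X_test = rng.normal(size=(20, 5))

mu, U_reduce = fit_pca(X_train, k=2)
Z_train = apply_pca(X_train, mu, U_reduce)   # used to train the supervised model
Z_test = apply_pca(X_test, mu, U_reduce)     # same mu and U_reduce, no refitting
```

Reusing `mu` and `U_reduce` from the training set keeps the held-out sets from influencing the learned mapping.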

  • application of PCA
    • compression
      • reduce memory/disk needed to store data
      • speed up the learning algorithm
    • visualization
  • bad use of PCA: to prevent overfitting. This might work OK, but it isn't a good way to address overfitting; use regularization instead

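Difference between PCA and Linear Regression

  • PCA measures the orthogonal distance from each point to the projection surface, whereas linear regression measures the vertical distance between the true value y = g(x) at each point x and the estimate f(x)
  • A more general explanation: PCA looks for a surface such that, after the features {x1, x2, ..., xn} are projected onto it, the variance of the projected points is maximized (this has nothing to do with y; it is a search for the plane that best represents the features). Linear regression is given {x1, x2, ..., xn} and aims to predict y from x, which is why a regression is performed
The information-entropy approach would fall under projection pursuit for dimensionality reduction, whereas PCA belongs with factor-analysis-style methods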
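The contrast between the two error measures can be seen on a small example. The sketch below fits both a least-squares line and the first principal direction to the same four made-up points; all variable names are illustrative.

```python
import numpy as np

# Made-up 2D points (x, y).
pts = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
x, y = pts[:, 0], pts[:, 1]

# Linear regression: minimizes the mean squared VERTICAL distance to the line.
slope, intercept = np.polyfit(x, y, 1)
vertical_sq_error = np.mean((y - (slope * x + intercept)) ** 2)

# PCA: minimizes the mean squared ORTHOGONAL distance to the line
# through the mean along the first principal direction.
centered = pts - pts.mean(axis=0)
U, S, _ = np.linalg.svd(centered.T @ centered / len(pts))
direction = U[:, 0]
projected = np.outer(centered @ direction, direction)
orthogonal_sq_error = np.mean(np.sum((centered - projected) ** 2, axis=1))
pca_slope = direction[1] / direction[0]
```

On these points the regression line has slope 0.8 while the first principal direction has slope 1, so the two methods fit different lines to the same data.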
