Dimensionality Reduction - Reconstruction from compressed representation

Abstract: This article is the original video transcript of Lecture 120, "Reconstruction from Compressed Representation," from Chapter 15, "Dimensionality Reduction," of Andrew Ng's Machine Learning course. I recorded and revised it while studying the videos to make it more concise and easier to read, for my own future reference, and I am sharing it here. If there are any errors, corrections are warmly welcomed and sincerely appreciated. I hope it is also helpful to others in their learning.
————————————————

In some of the earlier videos, I was talking about PCA as a compression algorithm. You may have, say, thousand-dimensional data and compress it to a hundred-dimensional feature vector. Or have three-dimensional data and compress it to a two-dimensional representation. So, if this is a compression algorithm, there should be a way to go back from the compressed representation to an approximation of your original high-dimensional data. So, given z^{(i)}, which may be a hundred-dimensional, how do you go back to your original representation x^{(i)}, which was maybe a thousand-dimensional? In this video, I'd like to describe how to do that.


In the PCA algorithm, we may have an example like this. What we do is take these examples and project them onto this one-dimensional surface. Then we only need a single real number to specify the location of each point after it has been projected onto this one-dimensional surface. So, given a point z^{(1)}, how can we go back to the original two-dimensional space? In particular, given a point z\in \mathbb{R}, can we map it back to some approximate representation x\in \mathbb{R}^{2}? So, recall that

z=U_{reduced}^{T}x,

and if you want to go in the opposite direction, the equation for that is:

x_{approx}=U_{reduced}z

Here, U_{reduced} is an n\times k matrix, and z is a k \times 1 vector. So x_{approx} is going to be n \times 1.
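To make these two formulas concrete, here is a minimal NumPy sketch; it is not part of the lecture, the function and variable names (pca_compress_reconstruct, X, U_reduced, Z, X_approx) are my own, and it assumes X is an m x n matrix of mean-normalized examples stored as rows:

```python
import numpy as np

# Rough sketch (not from the lecture): compute U_reduced from the covariance
# matrix, then compress each example to k dimensions and reconstruct it.
# X is assumed to be m x n with mean-normalized examples as rows.
def pca_compress_reconstruct(X, k):
    m, n = X.shape
    Sigma = (X.T @ X) / m              # n x n covariance matrix
    U, S, _ = np.linalg.svd(Sigma)     # columns of U are the principal directions
    U_reduced = U[:, :k]               # n x k

    Z = X @ U_reduced                  # each row is z = U_reduced^T x
    X_approx = Z @ U_reduced.T         # each row is x_approx = U_reduced z
    return Z, X_approx
```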

And so the intent of PCA is that, if the squared projection error is not too big, this x_{approx} will be close to whatever was the original value of x that you had used to derive z in the first place. To show a picture of what this looks like: what you get back from this procedure are points that lie on the green line. So, to take the earlier example, if we started off with this value of x^{(1)} and got this value of z^{(1)}, then plugging z^{(1)} through this formula gives the point x^{(1)}_{approx}\in \mathbb{R}^{2}. Similarly, if you do the same procedure, this will be x^{(2)}_{approx}\in \mathbb{R}^{2}. These will be pretty decent approximations to the original data. That's how you go back from your low-dimensional representation z to an uncompressed representation of the data. We call this process reconstruction of the original data, where we try to reconstruct the original value of x from the compressed representation.
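As a hypothetical usage of the sketch above (again, not from the lecture), one could project a few 2-D points onto their first principal direction and check that the average squared projection error between x and x_{approx} stays small:

```python
# Hypothetical usage: project some 2-D points onto their first principal
# direction and measure how far the reconstructions are from the originals.
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]])
X = X - X.mean(axis=0)                          # mean normalization
Z, X_approx = pca_compress_reconstruct(X, k=1)

# Average squared projection error; a small value means x_approx is close to x.
print(np.mean(np.sum((X - X_approx) ** 2, axis=1)))
```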

So, given an unlabeled data set, we now know how to apply PCA and take the high-dimensional features x and map them to a lower-dimensional representation z. From this video, we also know how to take this low-dimensional representation z and map it back to an approximation of the original high-dimensional data. Now that we know how to implement and apply PCA, what we'd like to do next is to talk about some of the mechanics of how to actually use PCA well. In particular, in the next video, I'd like to talk about how to choose k, which is, how to choose the dimension of this reduced representation vector z.

<end>
