Dimensionality Reduction - Reconstruction from compressed representation

Abstract: This article is the original video transcript of Lecture 120, "Reconstruction from Compressed Representation," from Chapter 15, "Dimensionality Reduction," of Andrew Ng's Machine Learning course. I recorded and revised it while studying the videos to make it more concise and easier to read, for my own future reference, and I am sharing it here. If there are any errors, corrections are warmly welcomed and sincerely appreciated. I hope it is also helpful to others in their learning.
————————————————

In some of the earlier videos, I was talking about PCA as a compression algorithm. You may have, say, thousand-dimensional data and compress it to a hundred-dimensional feature vector. Or have three-dimensional data and compress it to a two-dimensional representation. So, if this is a compression algorithm, there should be a way to go back from the compressed representation to an approximation of your original high-dimensional data. So, given z^{(i)}, which may be a hundred-dimensional, how do you go back to your original representation x^{(i)}, which was maybe a thousand-dimensional? In this video, I'd like to describe how to do that.


In the PCA algorithm, we may have an example like this. What we do is take these examples and project them onto this one-dimensional surface. Then we only need a single real number to specify the location of each point after it has been projected onto this one-dimensional surface. So, given a point z^{(1)}, how can we go back to the original two-dimensional space? In particular, given a point z\in \mathbb{R}, can we map it back to some approximate representation x\in \mathbb{R}^{2}? So, recall that

z=U_{reduced}^{T}x,

and if you want to go in the opposite direction, the equation for that is:

x_{approx}=U_{reduced}z

Here, U_{reduced} is an n\times k matrix, and z is a k \times 1 vector. So x_{approx} is going to be n \times 1.
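To make these two formulas concrete, here is a minimal NumPy sketch; it is not part of the lecture, the function and variable names (pca_compress_reconstruct, X, U_reduced, Z, X_approx) are my own, and it assumes X is an m x n matrix of mean-normalized examples stored as rows:

```python
import numpy as np

# Rough sketch (not from the lecture): compute U_reduced from the covariance
# matrix, then compress each example to k dimensions and reconstruct it.
# X is assumed to be m x n with mean-normalized examples as rows.
def pca_compress_reconstruct(X, k):
    m, n = X.shape
    Sigma = (X.T @ X) / m              # n x n covariance matrix
    U, S, _ = np.linalg.svd(Sigma)     # columns of U are the principal directions
    U_reduced = U[:, :k]               # n x k

    Z = X @ U_reduced                  # each row is z = U_reduced^T x
    X_approx = Z @ U_reduced.T         # each row is x_approx = U_reduced z
    return Z, X_approx
```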

And so the intent of PCA is that, if the squared projection error is not too big, this x_{approx} will be close to whatever was the original value of x that you had used to derive z in the first place. To show a picture of what this looks like: what you get back from this procedure are points that lie on the green line. So, to take the earlier example, if we started off with this value of x^{(1)} and got this value of z^{(1)}, then plugging z^{(1)} through this formula gives the point x^{(1)}_{approx}\in \mathbb{R}^{2}. Similarly, if you do the same procedure, this will be x^{(2)}_{approx}\in \mathbb{R}^{2}. These will be pretty decent approximations to the original data. That's how you go back from your low-dimensional representation z to an uncompressed representation of the data. We call this process reconstruction of the original data, where we try to reconstruct the original value of x from the compressed representation.
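As a hypothetical usage of the sketch above (again, not from the lecture), one could project a few 2-D points onto their first principal direction and check that the average squared projection error between x and x_{approx} stays small:

```python
# Hypothetical usage: project some 2-D points onto their first principal
# direction and measure how far the reconstructions are from the originals.
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]])
X = X - X.mean(axis=0)                          # mean normalization
Z, X_approx = pca_compress_reconstruct(X, k=1)

# Average squared projection error; a small value means x_approx is close to x.
print(np.mean(np.sum((X - X_approx) ** 2, axis=1)))
```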

So, given an unlabeled data set, we now know how to apply PCA and take the high-dimensional features x and map them to a lower-dimensional representation z. From this video, we also know how to take this low-dimensional representation z and map it back to an approximation of the original high-dimensional data. Now that we know how to implement and apply PCA, what we'd like to do next is to talk about some of the mechanics of how to actually use PCA well. In particular, in the next video, I'd like to talk about how to choose k, which is, how to choose the dimension of this reduced representation vector z.

<end>
