PCA and whitening

葛小妞

Categories: Machine Learning, deep learning

Article link: https://blog.csdn.net/geguojing/article/details/25074747

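Principal component analysis (PCA) is something I use all the time.

I implemented it today and found that my earlier understanding of a few points was not thorough enough.

The example data is 45 two-dimensional points. For PCA, we first compute the mean of all the samples and subtract it from every sample. Only the resulting zero-mean X can be used to compute the covariance matrix, and we then apply SVD to that covariance matrix. My understanding of U was lacking, though. The covariance matrix is square (2x2 here), so I took it for granted that each row of U was a vector. It is not: each column of U is the eigenvector paired with one eigenvalue from the SVD, which is why the later steps use the transpose of U.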
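The steps in the paragraph above can be sketched on a tiny made-up data set. The data values and the variable names `xc`, `sigma` and `covRot` are assumptions chosen for illustration; they are not taken from pcaData.txt.

```matlab
% Made-up data: 2 features (rows) x 5 samples (columns)
x = [1 2 3 4 5; 2 1 4 3 6];
n = size(x, 2);

% 1) Subtract the per-feature mean so every row has zero mean
xc = x - repmat(mean(x, 2), 1, n);

% 2) Covariance matrix (2 x 2, symmetric)
sigma = (1 / n) * (xc * xc');

% 3) SVD: the COLUMNS of U are the principal directions,
%    diag(S) holds the matching eigenvalues in descending order
[U, S, ~] = svd(sigma);

% 4) Rotate the data into the eigenbasis: use U', not U
xRot = U' * xc;

% The covariance of the rotated data should be diagonal and match S
covRot = (1 / n) * (xRot * xRot');
disp(covRot);
```

If the off-diagonal entries of `covRot` are not close to zero, the most likely cause is a missing mean subtraction or a missing transpose on `U`.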

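One question: why did nothing go wrong when I used U*X by mistake?

The reason is that the u computed from this covariance matrix happens to equal its own transpose, and because u is square, U*X is also dimensionally valid. So the displayed result was correct. In general u is not symmetric, so using U*X is still wrong.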
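Why can u equal its transpose at all? A 2x2 orthogonal matrix is either a rotation or a reflection, and a reflection is symmetric. The sketch below contrasts the two cases; the angle and the data are arbitrary values chosen for illustration, and whether `svd` returns a reflection for a given data set depends on its sign conventions.

```matlab
theta = pi / 6;   % arbitrary angle, for illustration only

% Reflection (det = -1): symmetric, so Urefl equals Urefl'
Urefl = [cos(theta)  sin(theta); sin(theta) -cos(theta)];

% Rotation (det = +1): not symmetric unless sin(theta) = 0
Urot  = [cos(theta) -sin(theta); sin(theta)  cos(theta)];

% Made-up 2 x 3 data; both U*x and U'*x are dimensionally valid
x = [1 2 3; 4 5 6];

diffRefl = norm(Urefl * x - Urefl' * x);   % zero: the mistake is invisible
diffRot  = norm(Urot  * x - Urot'  * x);   % nonzero: U*x gives a different result

fprintf('reflection: %g, rotation: %g\n', diffRefl, diffRot);
```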

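Whitened data must satisfy two conditions:

First, the correlation between different features is minimal, close to 0. (After PCA the dimensions are uncorrelated, so the first condition is met.)

Second, all features have the same variance (not necessarily 1). The common whitening operations are PCA whitening and ZCA whitening.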
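Both conditions can be read off the covariance matrix: the off-diagonal entries measure correlation and the diagonal entries are the variances. Here is a minimal sketch on made-up, already zero-mean data; the names `isDecorrelated` and `isEqualVar` and the 1e-10 tolerance are assumptions for illustration.

```matlab
% Made-up data: 2 features x 6 samples, each row already has zero mean
z = [ 1 -1  1 -1  1 -1;
      1  1 -1 -1  0  0];
n = size(z, 2);
C = (1 / n) * (z * z');                 % covariance matrix

offDiag = C - diag(diag(C));            % keep only the off-diagonal entries
isDecorrelated = max(abs(offDiag(:))) < 1e-10;            % condition 1
isEqualVar     = (max(diag(C)) - min(diag(C))) < 1e-10;   % condition 2

% Here condition 1 holds but condition 2 does not (variances 1 and 2/3).
% Dividing each row by its standard deviation fixes condition 2.
zw = diag(1 ./ sqrt(diag(C))) * z;
Cw = (1 / n) * (zw * zw');
disp(Cw);
```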

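PCA whitening projects the data x onto the PCA basis (optionally keeping only the top components) to get z. The dimensions of z are uncorrelated, which satisfies the first whitening condition. All that remains is to divide each dimension of z by its standard deviation, so that every dimension has variance 1 and the variances are therefore equal. The formula is:

$$x_{\mathrm{PCAwhite},i} = \frac{x_{\mathrm{rot},i}}{\sqrt{\lambda_i}}$$

where $x_{\mathrm{rot}} = U^{T}x$ and $\lambda_i$ is the i-th eigenvalue of the covariance matrix. The code below adds a small epsilon under the square root for numerical stability.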

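ZCA whitening first transforms x into z with PCA, but without reducing the dimension, since all components are kept. This again satisfies the first whitening condition: the features are uncorrelated. The variances are then scaled to 1 in the same way, and finally the result is left-multiplied by the eigenvector matrix U.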

The ZCA whitening formula is:

$$x_{\mathrm{ZCAwhite}} = U\,x_{\mathrm{PCAwhite}}$$
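As a compact illustration, both whitening variants can be sketched in a few lines. The data is made up, and the names `W`, `covPCA` and `covZCA` are assumptions; the value of `epsilon` follows the exercise code.

```matlab
% Made-up data: 2 features x 5 samples
x = [1 2 3 4 5; 2 1 4 3 6];
n = size(x, 2);
xc = x - repmat(mean(x, 2), 1, n);      % zero-mean data
sigma = (1 / n) * (xc * xc');           % covariance matrix
[U, S, ~] = svd(sigma);

epsilon = 1e-5;                         % small regulariser
W = diag(1 ./ sqrt(diag(S) + epsilon)); % per-component scaling

xPCAWhite = W * (U' * xc);              % rotate, then rescale
xZCAWhite = U * xPCAWhite;              % rotate back to the original axes

% Both covariances should be close to the identity matrix
covPCA = (1 / n) * (xPCAWhite * xPCAWhite');
covZCA = (1 / n) * (xZCAWhite * xZCAWhite');
disp(covPCA);
disp(covZCA);
```

Because of `epsilon`, the diagonal entries come out slightly below 1 rather than exactly 1.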

close all

%%================================================================
%% Step 0: Load data
%  We have provided the code to load data from pcaData.txt into x.
%  x is a 2 * 45 matrix, where the kth column x(:,k) corresponds to
%  the kth data point.
%  You do not need to change the code below.

x = load('pcaData.txt','-ascii');
figure(1);
scatter(x(1, :), x(2, :));
title('Raw data');


%%================================================================
%% Step 1a: Implement PCA to obtain U 
%  Implement PCA to obtain the rotation matrix U, which is the eigenbasis
%  of sigma (the covariance matrix of the data).

% -------------------- YOUR CODE HERE -------------------- 
u = zeros(size(x, 1)); % You need to compute this
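[row,column]=size(x);
x=x-repmat(mean(x,2),1,column);   % subtract the per-feature mean (semicolon suppresses printing)
t=(1/column)*x*x';                % 2 x 2 covariance matrix of the zero-mean data
[u,s,v]=svd(t);                   % columns of u: eigenvectors; diag(s): eigenvalues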

% -------------------------------------------------------- 
hold on
plot([0 u(1,1)], [0 u(2,1)]);
plot([0 u(1,2)], [0 u(2,2)]);
scatter(x(1, :), x(2, :));
hold off

%%================================================================
%% Step 1b: Compute xRot, the projection on to the eigenbasis
%  Now, compute xRot by projecting the data on to the basis defined
%  by U. Visualize the points by performing a scatter plot.

% -------------------- YOUR CODE HERE -------------------- 
xRot = zeros(size(x)); % You need to compute this
xRot = u'*x;    % why not u*x? Because each COLUMN of u is the eigenvector for one eigenvalue.

% -------------------------------------------------------- 

% Visualise the rotated data. The principal directions should now lie
% along the coordinate axes.
figure(2);
scatter(xRot(1, :), xRot(2, :));
title('xRot');

%%================================================================
%% Step 2: Reduce the number of dimensions from 2 to 1. 
%  Compute xRot again (this time projecting to 1 dimension).
%  Then, compute xHat by projecting the xRot back onto the original axes 
%  to see the effect of dimension reduction

% -------------------- YOUR CODE HERE -------------------- 
k = 1; % Use k = 1 and project the data onto the first eigenbasis
xHat = zeros(size(x)); % You need to compute this
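% Why multiply by u again? u(:,1:k)'*x gives the coordinates along the top
% k eigenvectors; multiplying by u(:,1:k) maps them back onto the original axes.
xHat=u(:,1:k)*(u(:,1:k)'*x);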


% -------------------------------------------------------- 
figure(3);
scatter(xHat(1, :), xHat(2, :));
title('xHat');


%%================================================================
%% Step 3: PCA Whitening
%  Compute xPCAWhite and plot the results.

epsilon = 1e-5;
% -------------------- YOUR CODE HERE -------------------- 
xPCAWhite = zeros(size(x)); % You need to compute this
for i=1:size(x,1)
    xPCAWhite(i,:)= xRot(i,:)/sqrt(s(i,i)+epsilon);
end
% xPCAWhite = diag(1./sqrt(diag(s)+epsilon))*u'*x;


% -------------------------------------------------------- 
figure(4);
scatter(xPCAWhite(1, :), xPCAWhite(2, :));
title('xPCAWhite');

%%================================================================
%% Step 4: ZCA Whitening
%  Compute xZCAWhite and plot the results.

% -------------------- YOUR CODE HERE -------------------- 
xZCAWhite = zeros(size(x)); % You need to compute this
xZCAWhite =u*xPCAWhite;

% -------------------------------------------------------- 
figure(5);
scatter(xZCAWhite(1, :), xZCAWhite(2, :));
title('xZCAWhite');

%% Congratulations! When you have reached this point, you are done!
%  You can now move onto the next PCA exercise. :)