A Brief Discussion of SVD and CUR Decomposition

This article takes a close look at SVD and CUR decomposition. Starting from the power iteration algorithm, it explains the principle and applications of PCA, then describes the definition of SVD, how it is computed, and worked examples of its use. PCA and SVD each have strengths and weaknesses for dimensionality reduction, while CUR decomposition addresses the loss-of-sparsity issue of SVD by building an approximating matrix from a selected subset of the matrix's rows and columns.

1. Power iteration

In mathematics, the power iteration (also known as the power method) is an eigenvalue algorithm: given a matrix $A$, the algorithm produces a number $\lambda$, which is the greatest (in absolute value) eigenvalue of $A$, and a nonzero vector $v$, the corresponding eigenvector of $\lambda$, such that $Av = \lambda v$. The algorithm is also known as the Von Mises iteration.

The power iteration is a very simple algorithm, but it may converge slowly. It does not compute a matrix decomposition, and hence it can be used when $A$ is a very large sparse matrix.

When $M$ is a stochastic matrix, the limiting vector is the principal eigenvector (the eigenvector with the largest eigenvalue), and its corresponding eigenvalue is 1. This method for finding the principal eigenvector, called power iteration, works quite generally, although if the principal eigenvalue (the eigenvalue associated with the principal eigenvector) is not 1, then as $i$ grows, the ratio of $M^{i+1}v$ to $M^{i}v$ approaches the principal eigenvalue while $M^{i}v$ approaches a vector (probably not a unit vector) with the same direction as the principal eigenvector.
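In symbols, this is the claim above (assuming $|\lambda_1| > |\lambda_2| \ge \dots$ and that the start vector $v$ has a nonzero component along the principal eigenvector $v_1$):

$$\frac{\lVert M^{i+1}v \rVert}{\lVert M^{i}v \rVert} \;\longrightarrow\; |\lambda_1|, \qquad \frac{M^{i}v}{\lVert M^{i}v \rVert} \;\longrightarrow\; \pm v_1 \quad \text{as } i \to \infty.$$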

1.2 The power iteration algorithm

(Figure: the power iteration algorithm; the original image is missing.)
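Since the original figure is not available, here is a minimal MATLAB/Octave sketch of the algorithm; the function name power_iter, the all-ones starting vector, and the tolerance argument tol are illustrative choices, not from the original post:

function [v, lambda] = power_iter(M, tol)
    v = ones(size(M, 1), 1);      % arbitrary nonzero starting vector
    v = v / norm(v);
    err = Inf;
    while err > tol
        w = M * v;                % multiply by M
        w = w / norm(w);          % renormalize to unit length
        err = norm(w - v);        % change between successive iterates
        v = w;
    end
    lambda = v' * M * v;          % Rayleigh quotient estimate of the eigenvalue
end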

1.3 Example

$$M = \begin{bmatrix} 3 & 2 \\ 2 & 6 \end{bmatrix}, \qquad \lambda_1 = 7,\; v_1 = \begin{bmatrix} 0.447 \\ 0.894 \end{bmatrix}, \qquad \lambda_2 = 2,\; v_2 = \begin{bmatrix} 0.894 \\ -0.447 \end{bmatrix}$$

M=[3 2; 2 6];
x0=[1 1]';
err=1;
% power iteration on M
while (err>0.001)
    x1=M*x0/norm(M*x0);
    err=norm(x1-x0);
    x0=x1;
end
x1                     % principal eigenvector
lamda1=x1'*M*x1        % principal eigenvalue (Rayleigh quotient)

M1=M-lamda1*(x1*x1');  % deflation: remove the principal component from M
% power iteration on M1
x0=[1 1]';
err=1;
while (err>0.001)
    x2=M1*x0/norm(M1*x0);
    err=norm(x2-x0);
    x0=x2;
end
x2                     % second eigenvector
lamda2=x2'*M1*x2       % second eigenvalue

It is easy to see that the eigenpairs obtained by this iterative method are very close to those computed directly from the definition.
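For comparison, the eigenpairs "from the definition" can be obtained directly with MATLAB/Octave's built-in eig (shown here only as a check):

[V, D] = eig(M)    % columns of V are the eigenvectors, diag(D) the eigenvalues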

2. Principal-Component Analysis

Principal-component analysis, or PCA, is a technique for taking a dataset consisting of a set of tuples representing points in a high-dimensional space and finding the directions along which the tuples line up best. The idea is to treat the set of tuples as a matrix $M$ and find the eigenvectors for $MM^T$ or $M^TM$. The matrix of these eigenvectors can be thought of as a rigid rotation in a high-dimensional space. When you apply this transformation to the original data, the axis corresponding to the principal eigenvector is the one along which the points are most “spread out.” More precisely, this axis is the one along which the variance of the data is maximized. Put another way, the points can best be viewed as lying along this axis, with small deviations from it. Likewise, the axis corresponding to the second eigenvector (the eigenvector corresponding to the second-largest eigenvalue) is the axis along which the variance of distances from the first axis is greatest, and so on.

We can view PCA as a data-mining technique. The high-dimensional data can be replaced by its projection onto the most important axes. These axes are the ones corresponding to the largest eigenvalues. Thus, the original data is approximated by data with many fewer dimensions, which summarizes well the original data.

Principal Component Analysis (PCA) is a statistical method. Through an orthogonal transformation, it converts a set of possibly correlated variables into a set of linearly uncorrelated variables; the transformed variables are called principal components. Principal component analysis was first introduced by Karl Pearson for non-random variables, and later H. Hotelling extended the method to the case of random vectors. The amount of information is usually measured by the sum of squared deviations or by the variance.
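As a concrete illustration of the procedure described above, here is a minimal MATLAB/Octave sketch; the small 4×2 data matrix is made up for this example, and the centering step is a common convention rather than something specified in the text:

M  = [1 2; 2 1; 3 4; 4 3];     % each row is a point in 2-D space
Mc = M - mean(M);              % center the data (implicit expansion, R2016b+/Octave)
[V, D] = eig(Mc' * Mc);        % eigenvectors of M^T M give the principal axes
[~, idx] = sort(diag(D), 'descend');
V = V(:, idx);                 % order axes by decreasing eigenvalue (variance)
P = Mc * V(:, 1)               % project onto the principal axis: a 1-D summary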

3. Common eigenvalues of $MM^T$ and $M^TM$

For an $n \times m$ matrix $M$ with $n > m$, the eigenvalues of $MM^T$ are the eigenvalues of $M^TM$ plus $n - m$ additional zeros. The converse also holds: if $n < m$, the eigenvalues of $M^TM$ are the eigenvalues of $MM^T$ plus $m - n$ additional zeros.

The proof is as follows:
For the $n \times m$ matrix $M$, assume $n > m$ (the case $n < m$ is symmetric).
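A sketch of the standard argument, filling in the truncated proof: suppose $M^TMv = \lambda v$ with $v \neq 0$ and $\lambda \neq 0$. Then

$$MM^T(Mv) = M(M^TMv) = \lambda\,(Mv),$$

and $Mv \neq 0$ because $M^TMv = \lambda v \neq 0$. Hence every nonzero eigenvalue of $M^TM$ is also an eigenvalue of $MM^T$, and the symmetric argument gives the converse. Since $MM^T$ is $n \times n$ while $\operatorname{rank}(MM^T) = \operatorname{rank}(M) \le m < n$, its remaining eigenvalues (at least $n - m$ of them) must be $0$.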
