Randomized SVD
Given a matrix $A \in R^{m \times n}$, compute the largest $p$ singular values and the corresponding left and right singular vectors.
1: Implement either of the following two algorithms (implementing both counts as extra credit).
∙ The LinearTimeSVD algorithm on page 166 of Petros Drineas, Ravi Kannan, and Michael W. Mahoney, Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix, SIAM J. Comput., 36(1), 158-183.
∙ The Prototype for Randomized SVD algorithm on page 227 of N. Halko, P. G. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Rev., 53(2), 217-288.
2: Compute the $r$ largest singular values and the corresponding left and right singular vectors of a random matrix $A$, where $r \in \{5,10,15,20\}$ and $A$ is set up as follows (a set-up sketch follows the list):
∙ m = 2048,
∙ n = 512,
∙ p = 20,
∙ A = randn(m,p)*randn(p,n).
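A minimal sketch of this experiment (assuming one of the two routines implemented below, here RSVD with an oversampling parameter of 10):

% Sketch of the step-2 experiment: build a rank-p test matrix and extract
% its top-r approximate singular triplets for several values of r.
m = 2048; n = 512; p = 20;
A = randn(m,p)*randn(p,n);
for r = [5 10 15 20]
    [U,s,V] = RSVD(A,r,10);              % RSVD.m is listed in the Code section
    fprintf('r = %d, sigma_1 = %.3e\n', r, s(1));
end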
3: The data set is taken from Chapter 7 of the following reference:
∙ N. Halko, P. G. Martinsson, and J. A. Tropp, Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Rev., 53(2), 217-288.
∙ The data set is available at \url{https://github.com/WenjianYu/rSVD-single-pass}.
LinearTimeSVD
We first describe the structure of the LinearTimeSVD algorithm (see the algorithm listing on page 166 of the reference above). One step of the iteration samples a column index at random, which deserves a brief explanation (a small code sketch follows the list):
∙ First generate $n$ probabilities $p_i$ with $p_i > 0$ and $\sum_{i=1}^n p_i = 1$, where $p_i$ is the probability of selecting the $i$-th row or column.
∙ Then define $q_0 = 0$ and $q_k = \sum_{i = 1}^k p_i$.
∙ Draw a random number $p$ uniformly from $[0,1]$. There must then exist an index $K$ with $p\in (q_{K -1},q_K)$, and we select the $K$-th column.
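A minimal sketch of this inverse-CDF sampling step (assuming the probabilities $p_i$ are stored in an n-by-1 vector Pro that sums to 1):

% Sample one column index K of A according to the probabilities Pro.
q = [0; cumsum(Pro)];          % q(1) = q_0 = 0, q(k+1) = q_k = p_1 + ... + p_k
p = rand(1);                   % uniform random number in (0,1)
K = find(p <= q, 1) - 1;       % smallest K with p <= q_K, i.e. p in (q_{K-1}, q_K]
col = A(:,K);                  % the sampled column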
We also point out that in this algorithm $C\in R^{m\times c}$ and $H_k \in R^{m\times k}$, where $A^{j}$ denotes the $j$-th column of $A$ and $A_i$ denotes the $i$-th row of $A$. A few more remarks on the algorithm: it outputs $\sigma_{t}(C)$, i.e. the $k$ largest singular values of $C$, which serve as approximations of the $k$ largest singular values of $A$. The algorithm produces $c$ intermediate vectors $y_t$, $t = 1,\ldots,c$, which are the eigenvectors of $C^{T}C$ (equivalently, the right singular vectors of $C$); $h_t = C y_t / \sigma_t(C)$ is an approximation of the $t$-th left singular vector of $A$, and the right singular vectors have to be recovered from $A = U S V^{*}$.
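A small sketch of that recovery step (assuming U holds the approximate left singular vectors $H_k$ and s the corresponding singular values):

% Recover approximate right singular vectors from A = U*S*V', i.e. V = A'*U*inv(S).
V = A'*U*diag(1./s);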
Prototype for Randomized SVD algorithm
Regarding the Prototype for Randomized SVD algorithm: the paper presents the algorithm exactly as in the schematic, but if we want only the $k$ largest singular values and the corresponding left and right singular vectors, the last two steps of Stage B need a truncation. This yields the left singular vectors $Q\widetilde{U_k}$, the right singular vectors $V_k$, and the singular values $\sigma$, of which we simply keep the first $k$.
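A minimal sketch of the prototype with this truncation (assuming a target rank k and a small oversampling parameter p; the power-iteration variant actually used in the experiments is RSVD.m in the Code section):

% Stage A: orthonormal basis for the range of a Gaussian sketch of A.
Omega = randn(n, k + p);
[Q,~] = qr(A*Omega, 0);
% Stage B: small SVD of the projected matrix, then truncate to rank k.
B = Q'*A;
[Ut,S,V] = svd(B, 'econ');
U = Q*Ut(:,1:k);                % approximate left singular vectors Q*U_k
s = diag(S); s = s(1:k);        % k largest singular values
V = V(:,1:k);                   % approximate right singular vectors V_k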
Low-rank matrix recovery
The matrix completion problem is formulated as
$$\min_{X \in R^{m \times n}} \sum_{(i,j)\in \Omega}(X_{i,j} - M_{i,j})^2 + \mu \| X\|_{*}.$$
By introducing an auxiliary variable $E$, this is rewritten as
$$\min_{X, E \in R^{m \times n}} \sum_{(i,j)\in \Omega}(E_{i,j} - M_{i,j})^2 + \mu \| X\|_{*}, \quad \mathrm{s.t.}\quad X - E = 0.$$
The augmented Lagrangian function is
$$L(X,E,Y) = \mu \|X\|_{*} + \langle Y, X - E\rangle + \frac{1}{2\tau}\|X - E\|_{F}^2 + \frac{1}{2} \sum_{(i,j)\in \Omega}(E_{i,j} - M_{i,j})^2.$$
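For reference, a sketch of the ADMM updates that the code below implements (note that the E-update in the code corresponds to weighting the data-fit term by $1$ rather than $\frac{1}{2}$; $\gamma$ is the dual step size opts.gama):
$$
\begin{aligned}
X^{t+1} &= \operatorname{prox}_{\mu\tau\|\cdot\|_{*}}\left(E^{t} - \tau Y^{t}\right),\\
E^{t+1}_{i,j} &= \begin{cases} \dfrac{2\tau M_{i,j} + X^{t+1}_{i,j} + \tau Y^{t}_{i,j}}{2\tau + 1}, & (i,j)\in\Omega,\\ X^{t+1}_{i,j} + \tau Y^{t}_{i,j}, & (i,j)\notin\Omega, \end{cases}\\
Y^{t+1} &= Y^{t} + \frac{\gamma}{\tau}\left(X^{t+1} - E^{t+1}\right).
\end{aligned}
$$
The $X$-update is singular value soft-thresholding, which prox_ker.m carries out with a truncated randomized SVD.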
From the table above it appears that the randomized algorithm provides no speed-up here; in fact, when I increased the problem size to $m=500,n=500$, this phenomenon became even more pronounced.
Here is some of my own thinking. In general, we use a randomized algorithm to obtain the $k$ largest singular values, and when $k$ is small the approximate SVD is cheaper to compute. However, inside this algorithm, if $k$ is too small the approximate SVD deviates too far from the true SVD, and the net effect is that the number of iterations needed for convergence increases accordingly. When producing the results above I chose $k=40$ for $m=40,n=40$ and $k=90$ for $m=100,n=100$; these are the values used when actually running the code.
Code
This time I only wrote MATLAB code; readers should mind the indentation and whitespace themselves.
fun.m
function value = fun(mu,X,M,omega)
% Objective value of the completion model:
%   sum_{(i,j) in omega} (X_ij - M_ij)^2 + mu*||X||_*
s = svd(X);                            % singular values of X (nuclear norm = sum(s))
Y = sample(1,omega,X - M);             % residual restricted to the observed entries
value = mu*sum(s) + norm(Y,'fro')^2;   % squared data-fit term, matching the model
end
prox_ker.m
function mat = prox_ker(mu,X)
% Proximal operator of mu*||.||_* (singular value soft-thresholding),
% computed with a truncated randomized SVD instead of a full SVD.
k = 90;                                % truncation rank (k = 40 was used for m = n = 40)
%k = 40;
[U,s,V] = RSVD(X,k,k);
%[U,s,V] = LTSVD(X,k,k + 2);
d = max(s - mu,0);                     % soft-threshold the singular values
mat = U*diag(d)*V';
end
sample.m
function M = sample(mu,omega,A)
% Return mu times the projection of A onto the index set omega:
% M(i,j) = mu*A(i,j) if the linear index of (i,j) is in omega, and 0 otherwise.
[m,n] = size(A);
M = zeros(m,n);
p = length(omega);
for k = 1:p
    i = mod(omega(k),m);               % row index recovered from the linear index
    if(i == 0)
        i = m;
    end
    j = ceil(omega(k)/m);              % column index
    M(i,j) = A(i,j)*mu;
end
end
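For reference, a quick usage check of sample (the linear indices follow MATLAB's column-major convention, as in main.m):

% With mu = 1, sample returns A on omega and zero elsewhere.
A = magic(4); omega = [1 6 11 16];     % linear indices of the diagonal entries
P = sample(1,omega,A);                 % P is diagonal, equal to diag(diag(A))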
RSVD.m
function [U,s,V] = RSVD(A,k,p)
% Randomized SVD with one step of power iteration: sketch Y = A*(A'*(A*Omega)).
% k is the target rank, p the oversampling parameter.
[m,n] = size(A);
omega = randn(n,k + p);
Y = A*(A'*(A*omega));                  % power iteration sharpens the captured subspace
[Q,~] = qr(Y,0);                       % economy QR so that Q is m-by-(k+p)
[U_,s_,V_] = svd(Q'*A,'econ');         % small SVD of the (k+p)-by-n projected matrix
U = Q*U_(:,1:k);                       % approximate left singular vectors
s = diag(s_);
s = s(1:k);                            % k largest approximate singular values
V = V_(:,1:k);                         % approximate right singular vectors
end
LTSVD.m
function [U,s,V] = LTSVD(A,k,c)
% LinearTimeSVD: sample c columns of A with probabilities Pro, rescale them to
% form C, and read the top-k singular triplets off the small matrix C'*C.
[m,n] = size(A);
pro = rand(n,1);                       % arbitrary positive sampling probabilities;
Pro = pro/sum(pro);                    % the paper's choice is Pro(j) ~ norm(A(:,j))^2
q = [0; cumsum(Pro)];                  % q(j+1) = p_1 + ... + p_j
C = zeros(m,c);
for i = 1:c
    pra = rand(1);
    j = find(pra <= q,1) - 1;          % column index with pra in (q(j), q(j+1)]
    C(:,i) = A(:,j)/sqrt(c*Pro(j));    % rescale by sqrt(c*p_j) as in the algorithm
end
[us,ss,~] = svd(C'*C);                 % eigenvectors/eigenvalues of C'*C (c-by-c)
sigma = sqrt(diag(abs(ss)));           % singular values of C
tol = 1e-8;                            % guards against division by tiny singular values
s = sigma(1:k);
U = C*us(:,1:k)*diag(1./max(s,tol));   % h_t = C*y_t/sigma_t: approximate left vectors
V = A'*U*diag(1./max(s,tol));          % right vectors recovered from A = U*S*V'
end
ADMM.m
function [X,out] = ADMM(mu,X0,M,opts)
% ADMM for  min_X sum_{(i,j) in Omega}(X_ij - M_ij)^2 + mu*||X||_*
% using the splitting X = E and the augmented Lagrangian above.
st = tic;
[m,n] = size(M);
X_old = X0;
E_old = X_old;
Y_old = X_old;
gama = opts.gama;
k = 0;
while k < opts.epoch
    % X-update: singular value thresholding of E - tau*Y (via prox_ker)
    X_new = prox_ker(mu*opts.tau,E_old - opts.tau*Y_old);
    % E-update: closed form, treated separately on and off the observed set
    E_new = sample(1/(2*opts.tau + 1),opts.omega,2*opts.tau*M + X_new + opts.tau*Y_old) + ...
        sample(1,opts.ome,X_new + opts.tau*Y_old);
    % dual update with step size gama/tau
    Y_new = Y_old + gama*(X_new - E_new)/opts.tau;
    rho = norm(X_new - X_old,'fro');
    if rho < opts.eps
        break;
    end
    if mod(k,50) == 0
        gama = gama*0.96;              % slowly shrink the dual step size
        err = norm(sample(1,opts.omega,X_new - M),'fro');
        fprintf('the epoch:%d,the norm:%.3e,the err:%.3e\n',k,rho,err);
    end
    X_old = X_new;
    Y_old = Y_new;
    E_old = E_new;
    k = k + 1;
end
X = X_new;
out.time = toc(st);
out.value = fun(mu,X,M,opts.omega);
out.error = norm(X_new - opts.A,'fro');
out.Me = norm(sample(1,opts.omega,X_new - M),'fro');
out.rank = rank(X);
out.epoch = k;
end
main.m
clc;
% Problem set-up: an m-by-n rank-r ground-truth matrix with a fraction sr of
% its entries observed.
m = 100; n = 100; sr = 0.3; p = round(m*n*sr); r = 3;
fr = r*(m+n-r)/p;                                 % ratio of degrees of freedom to samples
maxr = floor(((m+n)-sqrt((m+n)^2-4*p))/2);        % largest rank recoverable from p samples
rs = 2021; randn('state',rs); rand('state',rs);   % fix the random seed (rng(rs) is the modern form)
xl = randn(m,r); xr = randn(n,r); A = xl*xr';     % rank-r ground truth
Omega = randperm(m*n); omega = Omega(1:p);        % observed linear indices
ome = Omega(p + 1:m*n);                           % unobserved linear indices
M = sample(1,omega,A);                            % observed data
mu = 0.002;
X0 = randn(m,n);
%----------------------------
opts = struct();
opts.epoch = 2000;
opts.tau = 1e3;
opts.eps = 1e-8;
opts.A = A;
opts.gama = 1.618;
opts.omega = omega;
opts.ome = ome;
%-------------------
out = struct();
[x,out] = ADMM(mu,X0,M,opts);
fprintf('the time:%.2f,the error:%.3e,the X - M on omega:%.3e,the value:%.4e,the rank:%d,the epoch:%d\n',...
out.time,out.error,out.Me,out.value,out.rank,out.epoch);