An Intuitive Explanation of Reduced-Rank Regression and Its Relationship to Other Dimensionality-Reduction Methods

Latest recommended article published on 2022-08-25 13:55:26

billy145533

Article link: https://blog.csdn.net/billy145533/article/details/105650198

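Given two datasets $\mathbf{X} \in \mathbb{R}^{m\times p},\mathbf{Y} \in \mathbb{R}^{m\times q }$, the common dimensionality-reduction methods look for projection directions $\mathbf w$ and $\mathbf v$ that maximize the following quantities: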

$\begin{aligned}
\mathrm{PCA:}\quad & \operatorname{Var}(\mathbf{Xw}) \\
\mathrm{RRR:}\quad & \phantom{\operatorname{Var}(\mathbf{Xw})\cdot{}}\operatorname{Corr}^2(\mathbf{Xw},\mathbf{Yv})\cdot\operatorname{Var}(\mathbf{Yv}) \\
\mathrm{PLS:}\quad & \operatorname{Var}(\mathbf{Xw})\cdot\operatorname{Corr}^2(\mathbf{Xw},\mathbf{Yv})\cdot\operatorname{Var}(\mathbf{Yv}) = \operatorname{Cov}^2(\mathbf{Xw},\mathbf{Yv}) \\
\mathrm{CCA:}\quad & \phantom{\operatorname{Var}(\mathbf{Xw})\cdot{}}\operatorname{Corr}^2(\mathbf{Xw},\mathbf{Yv})
\end{aligned}$
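All of these methods reduce dimensionality by restricting rank, extracting a small number of components from $\mathbf X$ and $\mathbf Y$. They differ in which directions the extraction emphasizes: variance in $\mathbf X$, variance in $\mathbf Y$, the correlation between the two, or some combination. In theory, however, as long as enough components are extracted, the results of these algorithms are essentially no different from OLS.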
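To make the four criteria concrete, here is a minimal NumPy sketch that evaluates each of them for one pair of directions. The synthetic data, the random unit vectors `w` and `v`, and the helper name `objectives` are all assumptions made for illustration; none of them come from the original article. The sketch only evaluates the objectives and does not optimize them.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, q = 200, 5, 3

# Synthetic, column-centered data (an assumption for illustration only).
X = rng.standard_normal((m, p))
Y = X @ rng.standard_normal((p, q)) + 0.5 * rng.standard_normal((m, q))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)


def objectives(X, Y, w, v):
    """Evaluate the PCA / RRR / PLS / CCA criteria for directions w and v."""
    a, b = X @ w, Y @ v                      # projected scores Xw and Yv
    var_a, var_b = a.var(), b.var()          # population variances (ddof=0)
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    corr2 = cov**2 / (var_a * var_b)         # squared correlation
    return {
        "PCA": var_a,
        "RRR": corr2 * var_b,
        "PLS": var_a * corr2 * var_b,
        "CCA": corr2,
        "Cov2": cov**2,                      # should equal the PLS criterion
    }


# Arbitrary unit-norm directions (hypothetical, not optimized).
w = rng.standard_normal(p)
w /= np.linalg.norm(w)
v = rng.standard_normal(q)
v /= np.linalg.norm(v)

res = objectives(X, Y, w, v)
for name, value in res.items():
    print(f"{name}: {value:.4f}")
```

The `PLS` and `Cov2` entries agree up to floating-point error, which is the identity $\operatorname{Var}\cdot\operatorname{Corr}^2\cdot\operatorname{Var}=\operatorname{Cov}^2$ from the list above.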

RRR: Reduced-Rank Regression Explained
The OLS objective is
$L=\|\mathbf Y-\mathbf X\mathbf B\|^2$
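RRR places a constraint on the coefficient matrix $\mathbf B$: we want its rank to be as small as possible without sacrificing much fitting accuracy. Because the OLS residual $\mathbf Y-\mathbf X\hat{\mathbf B}_\mathrm{OLS}$ is orthogonal to the column space of $\mathbf X$, the cross term vanishes and the loss for any $\mathbf B$ splits into two parts: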
$L=\|\mathbf Y-\mathbf X\hat{\mathbf B}_\mathrm{OLS}\|^2+\|\mathbf X\hat{\mathbf B}_\mathrm{OLS}-\mathbf X\mathbf B\|^2$
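The decomposition above can be checked numerically. The following is a small sketch on randomly generated data (the data, the seed, and the arbitrary coefficient matrix `B` are assumptions for illustration); it uses `np.linalg.lstsq` for the OLS fit and the Frobenius norm for $\|\cdot\|$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, p, q = 100, 4, 3

# Random data, used only to illustrate the identity.
X = rng.standard_normal((m, p))
Y = rng.standard_normal((m, q))

# OLS solution: minimizes ||Y - X B||^2 over all B.
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Any other coefficient matrix of the same shape.
B = rng.standard_normal((p, q))

total = np.linalg.norm(Y - X @ B) ** 2               # L for this B
ols_part = np.linalg.norm(Y - X @ B_ols) ** 2        # does not depend on B
gap_part = np.linalg.norm(X @ B_ols - X @ B) ** 2    # distance to the OLS fit

# The cross term is zero because X^T (Y - X B_ols) = 0.
cross = X.T @ (Y - X @ B_ols)

print(np.isclose(total, ols_part + gap_part))
print(np.allclose(cross, 0.0, atol=1e-9))
```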

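In this decomposition, the first term does not depend on $\mathbf B$ and can be ignored. Minimizing the second term under a rank constraint is a classic low-rank approximation problem. Some of the definitions of RRR found online and in the literature are rather confusing, so they are skipped here. Rewriting the second term in terms of fitted values gives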
$\|\mathbf X\hat{\mathbf B}_\mathrm{OLS}-\mathbf X\mathbf B\|^2=\|\hat{\mathbf Y}_\mathrm{OLS}-\mathbf{\hat{Y}_{RRR}} \|^2$
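Let the singular value decomposition of the OLS fit be $\hat{\mathbf Y}_\mathrm{OLS}=\mathbf U\Sigma\mathbf V^\top$.
Suppose the rank is restricted to $r$, and let $\mathbf V_r$ contain the first $r$ columns of $\mathbf V$ (the right singular vectors for the $r$ largest singular values). By the Eckart-Young theorem, the best rank-$r$ approximation is $\mathbf{\hat{Y}_{RRR}}=\hat{\mathbf Y}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top=\mathbf X\hat{\mathbf B}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top$, and therefore
$\hat{\mathbf B}_\mathrm{RRR}=\hat{\mathbf B}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top$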
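The closed-form solution above can be sketched in a few lines of NumPy. The function name `reduced_rank_regression` and the synthetic data are hypothetical and chosen for illustration; the computation itself follows the formula $\hat{\mathbf B}_\mathrm{RRR}=\hat{\mathbf B}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top$ and assumes no intercept or regularization.

```python
import numpy as np


def reduced_rank_regression(X, Y, r):
    """Return (B_rrr, B_ols) where B_rrr = B_ols V_r V_r^T has rank at most r."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)       # OLS coefficients
    Y_hat = X @ B_ols                                   # OLS fitted values
    _, _, Vt = np.linalg.svd(Y_hat, full_matrices=False)
    V_r = Vt[:r].T                                      # top-r right singular vectors, shape (q, r)
    return B_ols @ V_r @ V_r.T, B_ols


rng = np.random.default_rng(2)
m, p, q = 200, 6, 4

# Synthetic data (an assumption for illustration only).
X = rng.standard_normal((m, p))
Y = X @ rng.standard_normal((p, q)) + 0.1 * rng.standard_normal((m, q))

B_rrr, B_ols = reduced_rank_regression(X, Y, r=2)
rank_rrr = np.linalg.matrix_rank(B_rrr)

# The RRR fit should coincide with the truncated SVD of the OLS fit.
U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
Y_trunc = (U[:, :2] * s[:2]) @ Vt[:2]

# With r = q the projection V_r V_r^T is the identity, so RRR reduces to OLS.
B_full, _ = reduced_rank_regression(X, Y, r=q)

print(rank_rrr)
print(np.allclose(X @ B_rrr, Y_trunc))
print(np.allclose(B_full, B_ols))
```

The last check illustrates the earlier remark that with enough components the result is no different from OLS.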