Reduced-Rank Regression: An Intuitive Explanation and Its Relationship to Other Dimensionality Reduction Methods

For two data matrices $\mathbf{X} \in \mathbb{R}^{m\times p}$ and $\mathbf{Y} \in \mathbb{R}^{m\times q}$, the common dimensionality reduction methods maximize the following objectives:

$$
\begin{aligned}
\mathrm{PCA}&:\ \operatorname{Var}(\mathbf{Xw}) \\
\mathrm{RRR}&:\ \operatorname{Corr}^2(\mathbf{Xw},\mathbf{Yv})\cdot\operatorname{Var}(\mathbf{Yv}) \\
\mathrm{PLS}&:\ \operatorname{Var}(\mathbf{Xw})\cdot\operatorname{Corr}^2(\mathbf{Xw},\mathbf{Yv})\cdot\operatorname{Var}(\mathbf{Yv}) = \operatorname{Cov}^2(\mathbf{Xw},\mathbf{Yv}) \\
\mathrm{CCA}&:\ \operatorname{Corr}^2(\mathbf{Xw},\mathbf{Yv})
\end{aligned}
$$
All of the above extract information from $\mathbf X$ and $\mathbf Y$ by reducing rank; they differ only in which directions the extraction emphasizes. In theory, as long as enough dimensions are extracted, these algorithms perform no differently from OLS. The sketch below makes the comparison concrete.
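Here is a minimal sketch (the synthetic data and the direction vectors $w, v$ are my own assumptions, not from the original post) that evaluates each of the four objectives for a single pair of directions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, q = 200, 6, 4
X = rng.standard_normal((m, p))
Y = X @ rng.standard_normal((p, q)) + 0.5 * rng.standard_normal((m, q))
w = rng.standard_normal(p)  # projection direction for X
v = rng.standard_normal(q)  # projection direction for Y

xw, yv = X @ w, Y @ v
var_xw, var_yv = xw.var(), yv.var()
corr2 = np.corrcoef(xw, yv)[0, 1] ** 2

print("PCA :", var_xw)                   # Var(Xw)
print("RRR :", corr2 * var_yv)           # Corr^2(Xw, Yv) * Var(Yv)
print("PLS :", var_xw * corr2 * var_yv)  # = Cov^2(Xw, Yv)
print("CCA :", corr2)                    # Corr^2(Xw, Yv)
```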

RRR: An Explanation of Reduced-Rank Regression
The OLS objective is
$$L=\|\mathbf Y-\mathbf X\mathbf B\|^2$$
We constrain the coefficient matrix $\mathbf B$, wanting its rank to be as small as possible without sacrificing fitting accuracy. The loss decomposes as
$$L=\|\mathbf Y-\mathbf X\hat{\mathbf B}_\mathrm{OLS}\|^2+\|\mathbf X\hat{\mathbf B}_\mathrm{OLS}-\mathbf X\mathbf B\|^2$$

The first term is a constant and can be ignored; the decomposition holds because the OLS residual $\mathbf Y-\mathbf X\hat{\mathbf B}_\mathrm{OLS}$ is orthogonal to the column space of $\mathbf X$, so the cross term vanishes. Minimizing the second term is a classic low-rank approximation problem. (Online resources and the literature contain some rather confusing definitions of RRR, which I skip here.) Rewriting the second term gives
$$\|\mathbf X\hat{\mathbf B}_\mathrm{OLS}-\mathbf X\mathbf B\|^2=\|\hat{\mathbf Y}_\mathrm{OLS}-\hat{\mathbf Y}_\mathrm{RRR}\|^2$$
Let the SVD of the OLS fit be
$$\hat{\mathbf Y}_\mathrm{OLS}=\mathbf U\boldsymbol\Sigma\mathbf V^\top$$
Assuming the rank constraint is $r$, the Eckart–Young theorem gives
$$\hat{\mathbf Y}_\mathrm{RRR}=\hat{\mathbf Y}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top=\mathbf X\hat{\mathbf B}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top
\;\Rightarrow\;
\hat{\mathbf B}_\mathrm{RRR}=\hat{\mathbf B}_\mathrm{OLS}\mathbf V_r\mathbf V_r^\top$$
where $\mathbf V_r$ collects the top $r$ right singular vectors.
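This closed form translates directly into a few lines of NumPy. A minimal sketch (the helper name `rrr_fit` is mine, not from the post):

```python
import numpy as np

def rrr_fit(X, Y, r):
    """Reduced-rank regression via OLS + SVD, following the derivation above."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)              # B_hat_OLS
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)   # SVD of Y_hat_OLS
    Vr = Vt[:r].T                                              # top-r right singular vectors
    return B_ols @ Vr @ Vr.T                                   # B_hat_RRR, rank <= r
```

Sanity check: `np.linalg.matrix_rank(rrr_fit(X, Y, 2))` should be at most 2, and with `r = q` (and a full-rank fit) the projector $\mathbf V_r\mathbf V_r^\top$ becomes the identity, recovering the OLS solution.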

Intuitively, RRR takes the SVD of the OLS fit $\mathbf X\hat{\mathbf B}_\mathrm{OLS}$, and the leading singular vectors represent the directions along which $\mathbf X$ and $\mathbf Y$ are most correlated. In this respect RRR and CCA have a similar flavor; whether they produce the same result, I don't know, not having verified it. A rough numerical check is sketched below.
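One way to test this numerically (a sketch under my own assumptions: synthetic data, scikit-learn's `CCA` as the reference implementation; I make no claim about what the printed value turns out to be) is to compare the leading Y-side direction each method recovers:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
m, p, q = 500, 6, 4
X = rng.standard_normal((m, p))
Y = X @ rng.standard_normal((p, q)) + rng.standard_normal((m, q))

# RRR side: leading right singular vector of the OLS fitted values
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
_, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
v_rrr = Vt[0]

# CCA side: leading Y-side weight vector from sklearn's CCA
v_cca = CCA(n_components=1).fit(X, Y).y_weights_[:, 0]

# |cosine| near 1 would mean the two directions coincide
cos = abs(v_rrr @ v_cca) / (np.linalg.norm(v_rrr) * np.linalg.norm(v_cca))
print("|cos| between leading Y-directions:", cos)
```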

PS: Written for 1024 (Programmer's Day), forcing out words just to have something to post.

References

https://stats.stackexchange.com/questions/152517/what-is-reduced-rank-regression-all-about
