Low-Rank Representation

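I first came across low-rank representation last year. After reading a few papers recently, I realized I still do not understand it well, so starting today I will keep notes here on what I learn about it.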

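At present, low-rank representation is mainly used for subspace segmentation: given a set of data drawn from several subspaces, low-rank representation clusters the data by subspace, that is, it identifies which subspace each data point comes from.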
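To make the setting concrete, here is a minimal sketch of data drawn from a union of subspaces. The helper name `sample_union_of_subspaces`, the dimensions, and the random seed are illustrative assumptions, not something taken from a particular paper.

```python
import numpy as np

def sample_union_of_subspaces(ambient_dim, subspace_dims, points_per_subspace, seed=0):
    """Draw points from randomly oriented linear subspaces (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    blocks, labels = [], []
    for k, d in enumerate(subspace_dims):
        # Orthonormal basis of a random d-dimensional subspace of R^ambient_dim.
        basis, _ = np.linalg.qr(rng.standard_normal((ambient_dim, d)))
        coeffs = rng.standard_normal((d, points_per_subspace))
        blocks.append(basis @ coeffs)            # each column is one data point
        labels.extend([k] * points_per_subspace)
    return np.hstack(blocks), np.array(labels)

X, labels = sample_union_of_subspaces(ambient_dim=30, subspace_dims=[3, 4],
                                      points_per_subspace=20)
print(X.shape)                                   # (30, 40)
print(np.linalg.matrix_rank(X[:, labels == 0]))  # 3
print(np.linalg.matrix_rank(X[:, labels == 1]))  # 4
print(np.linalg.matrix_rank(X))                  # 7
```

Subspace segmentation is the inverse problem: given only `X`, with its columns in arbitrary order, recover `labels`.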


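There are many approaches to subspace segmentation. The first is based on probabilistic models (a Gaussian distribution is the most natural model of a subspace, so these methods generally assume that the data follow Gaussian distributions).

The second is factorization-based methods, which are generally modifications of existing matrix factorization techniques and are carried out over multiple iterations.

The third, which is quite popular at the moment, is sparsity-based subspace segmentation (sparse subspace clustering, SSC), which imposes a sparsity constraint on the representation coefficient matrix (the full sparse matrix is obtained by constraining the coefficients of each column separately).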
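As a sketch of that column-wise constraint, in the notation usually used for SSC (the symbols $X$, $x_i$, $z_i$ and $Z$ are introduced here and are not in the notes above): with the data matrix $X = [x_1, \dots, x_n]$, each point is written as a sparse combination of the other points,

$$
\min_{z_i} \; \|z_i\|_1 \quad \text{s.t.} \quad x_i = X z_i, \;\; z_{ii} = 0, \qquad i = 1, \dots, n,
$$

and the coefficient matrix is assembled as $Z = [z_1, \dots, z_n]$. Each of the $n$ problems is solved independently, so no constraint ties the columns of $Z$ together.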


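The drawback shared by the three kinds of methods above is that they are sensitive to noise and outliers: once the data are noisy, the segmentation is no longer accurate.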

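This is why the low-rank representation (LRR) method was proposed. Because the low-rank constraint acts on the coefficient matrix as a whole, LRR represents the data from a global point of view. In addition, noise raises the rank of the data, so the low-rank constraint naturally suppresses it, which makes the method robust to noise.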
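For reference, the LRR program as it is usually stated in the literature is $\min_{Z,E} \|Z\|_* + \lambda \|E\|_{2,1}$ subject to $X = XZ + E$, where $\|\cdot\|_*$ is the nuclear norm and $E$ absorbs noise and outliers; the symbols are the standard ones and do not appear in the notes above. Solving the full problem needs an iterative solver, which is not shown here. The sketch below covers only the noise-free case $\min \|Z\|_*$ s.t. $X = XZ$, whose minimizer is known to be $Z = VV^\top$ from the skinny SVD $X = U \Sigma V^\top$, and then checks numerically that adding noise raises the rank. The function name `lrr_noiseless`, the sizes, and the noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random subspaces of R^30 with dimensions 3 and 4, 20 points each (illustrative sizes).
blocks = []
for d in (3, 4):
    basis, _ = np.linalg.qr(rng.standard_normal((30, d)))
    blocks.append(basis @ rng.standard_normal((d, 20)))
X = np.hstack(blocks)

def lrr_noiseless(X, tol=1e-10):
    """Minimizer of ||Z||_* subject to X = X Z, computed as Z = V V^T (noise-free case only)."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))   # numerical rank of X
    V = Vt[:r].T                      # right singular vectors of the skinny SVD
    return V @ V.T

Z = lrr_noiseless(X)
print(np.allclose(X @ Z, X))          # Z reproduces the data exactly
print(np.abs(Z[:20, 20:]).max())      # coefficients across the two subspaces, numerically zero

# Noise raises the rank: the clean data has rank 3 + 4, the noisy copy is full rank.
X_noisy = X + 0.01 * rng.standard_normal(X.shape)
print(np.linalg.matrix_rank(X), np.linalg.matrix_rank(X_noisy))
```

With independent subspaces, `Z` comes out block-diagonal, so the nonzero pattern of `|Z| + |Z|.T` already groups the points by subspace; in practice that matrix is typically passed to spectral clustering as an affinity.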
