【ML】Linear Regression: Lasso Regression and Least Angle Regression

I have recently been reading about sparse linear regression, where the most commonly used method is Lasso regression. The main idea is to add an L1-norm regularization term to the ordinary least-squares objective; with this penalty, some of the fitted regression coefficients are set exactly to zero, so the result is a sparse coefficient vector. There are plenty of references on this, so I will not go into the details here.
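As a minimal sketch of this idea (using scikit-learn and NumPy, which are my own choices and are not mentioned in the notes above), the following fits a Lasso, i.e. minimizes ||y - Xb||^2 / (2n) + lambda * ||b||_1, and shows that most coefficients come out exactly zero:

```python
# Illustrative only: the data, alpha and shapes are assumptions, not from the post.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
X = rng.randn(100, 10)                      # 100 samples, 10 features
true_coef = np.zeros(10)
true_coef[:2] = [3.0, -2.0]                 # only the first two features matter
y = X @ true_coef + 0.1 * rng.randn(100)

model = Lasso(alpha=0.1)                    # alpha plays the role of lambda
model.fit(X, y)
print(model.coef_)                          # most entries are exactly 0.0
```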

Here, the main thing I want to explain is the relationship and the difference between Least Angle Regression (LARS) and Lasso regression.

Many references say that Least Angle Regression is an efficient way to solve the Lasso. While studying LARS, however, I ran into a question: the LARS procedure never involves the Lasso regularization parameter lambda at all. Why, then, is it said to solve the Lasso problem, and how does it actually do so?

Below is a summary of what I found, which largely explains the relationship between the two.

First, a one-sentence summary: computing the Lasso solution is a quadratic programming problem and can be handled by standard numerical optimization algorithms. The Least Angle Regression procedure, however, is a better approach: it exploits the special structure of the Lasso problem and provides an efficient way to compute the solutions for all values of lambda simultaneously.

This statement is based on the Stanford material: https://statweb.stanford.edu/~tibs/lasso/simple.html

The details follow below.

1. Least Angle Regression is a method for fitting linear regression models. It is mainly used to guard against overfitting when the number of explanatory variables is large, and it can perform variable selection and dimensionality reduction, which makes the model easier to interpret.

 

Reference: https://www.quora.com/What-is-Least-Angle-Regression-and-when-should-it-be-used

 

2. The motivation behind Least Angle Regression:

(1)Forward selection

  • Forward selection starts with no variables in the model, and at each step it adds to the model the variable with the most explanatory power, stopping if the explanatory power falls below some threshold. This is a fast and simple method, but it can also be too greedy: we fully add variables at each step, so correlated predictors don't get much of a chance to be included in the model. (For example, suppose we want to build a model for the deliciousness of a PB&J sandwich, and two of our variables are the amount of peanut butter and the amount of jelly. We'd like both variables to appear in our model, but since amount of peanut butter is (let's assume) strongly correlated with the amount of jelly, once we fully add peanut butter to our model, jelly doesn't add much explanatory power anymore, and so it's unlikely to be added.)

(2)Forward stagewise regression

  • Forward stagewise regression tries to remedy the greediness of forward selection by only partially adding variables. Whereas forward selection finds the variable with the most explanatory power and goes all out in adding it to the model, forward stagewise finds the variable with the most explanatory power and updates its weight by only epsilon in the correct direction. (So we might first increase the weight of peanut butter a little bit, then increase the weight of peanut butter again, then increase the weight of jelly, then increase the weight of bread, and then increase the weight of peanut butter once more.) The problem now is that we have to make a ton of updates, so forward stagewise can be very inefficient.
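A rough sketch of the forward stagewise procedure just described, assuming standardized predictors and a centered response (the function name, step size eps and step count are illustrative choices of mine, not from the quoted text):

```python
import numpy as np

def forward_stagewise(X, y, eps=0.01, n_steps=5000):
    """Tiny epsilon updates toward the variable most correlated with the residual."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.astype(float).copy()
    for _ in range(n_steps):
        corr = X.T @ residual               # correlation of each variable with the residual
        j = np.argmax(np.abs(corr))         # the most explanatory variable right now
        step = eps * np.sign(corr[j])       # nudge its weight a tiny bit in the right direction
        beta[j] += step
        residual -= step * X[:, j]
    return beta
```

The huge number of tiny updates is exactly the inefficiency that LARS removes.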

 

Least Angle Regression combines the two approaches above.

LARS, then, is essentially forward stagewise made fast. Instead of making tiny hops in the direction of one variable at a time, LARS makes optimally-sized leaps in optimal directions. These directions are chosen to make equal angles (equal correlations) with each of the variables currently in our model. (We like peanut butter best, so we start eating it first; as we eat more, we get a little sick of it, so jelly starts looking equally appetizing, and we start eating peanut butter and jelly simultaneously; later, we add bread to the mix, etc.)

 

3. The basic steps of Least Angle Regression

  • Assume for simplicity that we've standardized our explanatory variables to have zero mean and unit variance, and that our response variable also has zero mean.

  • Start with no variables in your model.

  • Find the variable x1 most correlated with the residual. (Note that the variable most correlated with the residual is equivalently the one that makes the least angle with the residual, whence the name.)

  • Move in the direction of this variable until some other variable x2 is just as correlated.

  • At this point, start moving in a direction such that the residual stays equally correlated with x1 and x2 (i.e., so that the residual makes equal angles with both variables), and keep moving until some variable x3 becomes equally correlated with our residual.

  • And so on, stopping when we've decided our model is big enough.
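The steps above can be checked with scikit-learn's LARS path routine (a hedged sketch; the diabetes data set is just a convenient example and is not part of the original notes):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import lars_path

X, y = load_diabetes(return_X_y=True)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # zero mean, unit variance predictors
y = y - y.mean()                           # zero mean response

# method="lar" gives plain Least Angle Regression (no Lasso modification)
alphas, active, coefs = lars_path(X, y, method="lar")
print(active)        # indices of the variables in the order LARS brings them in
print(coefs.shape)   # (n_features, n_steps): the coefficient path, one column per step
```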

The LARS procedure itself is explained very clearly and intuitively in this article:

http://www.cnblogs.com/pinard/p/6018889.html

Note: a natural question is when to use LARS rather than another regularization method (such as the Lasso). In fact, the results of the Lasso, forward stagewise regression, and LARS are similar, and with slight modifications LARS yields the other two. (This is stated and proved in the paper that introduced LARS; I have not yet worked through the details.)

 

 

Reference: Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani, "Least Angle Regression", The Annals of Statistics, 2004.

4. The relationship and differences between LARS and the Lasso

LARS is a clean and efficient way to solve the Lasso: rather than solving the problem for one fixed regularization parameter lambda, the LARS procedure effectively sweeps lambda over its whole range, producing the entire Lasso solution path in a single run.

Running LARS costs somewhat more than solving the Lasso for a single given lambda, but solving the Lasso directly requires fixing lambda in advance; when the optimal lambda is not known, LARS is the method of first choice.

参考:https://www.quora.com/What-is-Least-Angle-Regression-and-when-should-it-be-used
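A minimal illustration of this point, again with scikit-learn (my own choice, not from the referenced answer): calling lars_path with method="lasso" applies the Lasso modification of LARS and returns the solution at every breakpoint of lambda in one pass.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.RandomState(0)
X = rng.randn(80, 8)
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(80)

alphas, _, coefs = lars_path(X, y, method="lasso")

# alphas decreases from a value where every coefficient is zero down to 0;
# column k of coefs is the Lasso solution at regularization level alphas[k].
print(alphas)
print(np.count_nonzero(coefs[:, 0]), "->", np.count_nonzero(coefs[:, -1]))
```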

 

5. Solving the Lasso directly (in the strict sense)

Cyclical coordinate descent

This solves the Lasso for one fixed value of lambda at a time, so the regularization parameter still has to be tuned.

参考:https://blog.csdn.net/mousever/article/details/50513409
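A rough sketch of cyclical coordinate descent for the Lasso, assuming the objective ||y - Xb||^2 / (2n) + lam * ||b||_1; the function names, lam and the iteration count are illustrative assumptions, not taken from the referenced post:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: solution of the one-dimensional Lasso subproblem."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):                           # cycle through the coordinates
            r_j = y - X @ beta + X[:, j] * beta[j]   # partial residual, excluding feature j
            z_j = X[:, j] @ r_j / n
            beta[j] = soft_threshold(z_j, lam) / (X[:, j] @ X[:, j] / n)
    return beta
```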

 

6. Choosing the optimal regularization parameter for the Lasso model

The method generally used is cross-validation.

A general definition of cross-validation: from the available samples, use the larger part to fit the model, keep a small part held out, predict the held-out samples with the fitted model, compute the prediction errors on this held-out part, and record their sum of squares.

Cross-validation for the Lasso model:

参考:https://blog.csdn.net/nya0731/article/details/79895307
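For illustration, scikit-learn's LassoCV chooses lambda by cross-validation over a path of candidate values (a hedged sketch; the 5-fold setting and the data set are my own assumptions):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV

X, y = load_diabetes(return_X_y=True)

model = LassoCV(cv=5).fit(X, y)   # evaluates a whole path of alphas on each fold
print(model.alpha_)               # the alpha (lambda) with the smallest cross-validation error
print(model.coef_)                # coefficients refit on all the data at that alpha
```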

 

The notes above are only a rough summary; I will organize them in more detail when I have time.

That said, the material available abroad really is more plentiful and more detailed; getting past the wall to search always opens up a new world...

I wish everyone smooth progress in study and work!

Questions and comments are welcome~
