LASSO Regression and the LAR Algorithm in R

Ethan_pika

Article link: https://blog.csdn.net/yitian_z/article/details/103097406

Introduction to LASSO

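Tibshirani (1996) proposed the LASSO (Least Absolute Shrinkage and Selection Operator).
It obtains a more parsimonious model by adding a first-order (L1) penalty to the fit. The penalty ends up setting the coefficients of some variables exactly to zero, which makes the model highly interpretable. (Ridge regression estimates are almost never exactly zero, which makes variable selection with ridge difficult.)
It handles multicollinear data well and, like ridge regression, is a biased estimator.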
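
To make the "exactly zero" point concrete, here is a minimal sketch for the special case of an orthonormal design (X'X = I), where both estimators have closed forms in terms of the OLS coefficients. The coefficient values and `lambda` are illustrative assumptions, not taken from the article, and the exact scaling of `lambda` depends on how the objective is written.

```r
# Illustrative OLS coefficients under an assumed orthonormal design (made-up values).
b_ols  <- c(3.0, -1.5, 0.4, -0.2)
lambda <- 0.5

# LASSO: soft-thresholding. Shrink each coefficient toward zero by lambda
# and set it to exactly zero once it would cross zero.
b_lasso <- sign(b_ols) * pmax(abs(b_ols) - lambda, 0)

# Ridge: proportional shrinkage. A nonzero coefficient stays nonzero.
b_ridge <- b_ols / (1 + lambda)

print(rbind(ols = b_ols, lasso = b_lasso, ridge = b_ridge))
```

In this sketch the two small coefficients are removed by the LASSO, while ridge only scales them down.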

The essential difference between ridge regression and LASSO

The difference in geometric terms (the former is LASSO regression, the latter is ridge regression)
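
The geometric picture is easiest to read from the constrained form of the two problems. The notation below (N observations, p predictors, bound t) is assumed here for illustration and follows the usual textbook convention.

```latex
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta}\ \sum_{i=1}^{N}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t,

\hat{\beta}^{\text{ridge}} = \arg\min_{\beta}\ \sum_{i=1}^{N}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Bigr)^{2}
\quad \text{subject to} \quad \sum_{j=1}^{p} \beta_j^{2} \le t.
```

With two predictors, the LASSO constraint region is a diamond and the ridge region is a disc. The elliptical contours of the residual sum of squares often first touch the diamond at a corner, where one coefficient is exactly zero; the disc has no corners, so an exact zero is unlikely.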

LAR (Least Angle Regression)

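A variable selection method proposed by Efron et al. in 2004, similar in form to forward stepwise regression.
It provides an efficient way to compute LASSO regression solutions.
It differs from forward stepwise regression in the following way: at each step, forward stepwise fully fits a linear model on the currently selected subset of variables, computes the RSS, and then uses a criterion (such as AIC) to penalize higher model complexity.
LAR instead first finds the variable most correlated with the response and then moves that predictor's coefficient little by little toward its least-squares estimate (LSE). As it moves, the correlation between this variable and the residual gradually shrinks. Once some other variable is just as correlated with the residual, that variable is selected as well, and the coefficients move again along the new least-squares direction. In the end all variables have been selected, and the solution coincides with the LSE.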
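
For comparison, the forward stepwise procedure described above can be sketched with base R's `step()`, which refits the model at each step and uses AIC by default. The simulated data frame `dat` and its column names are assumptions made up for this illustration.

```r
# Simulated data (hypothetical): only x1 and x3 truly affect y.
set.seed(42)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 2 * dat$x1 - 1.5 * dat$x3 + rnorm(n)

# Start from the intercept-only model and add one variable at a time.
null_fit <- lm(y ~ 1, data = dat)
fwd <- step(null_fit,
            scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4),
            direction = "forward",
            trace = 0)   # trace = 0 suppresses the per-step log

print(formula(fwd))
```

Each candidate model here is a complete least-squares fit, which is the contrast with LAR: LAR moves coefficients only partway before admitting the next variable.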

Algorithm 3.2 Least Angle Regression

Standardize the predictors to have mean zero and unit norm. Start with the residual r = y - ȳ, β1, β2, …, βp = 0.
Find the predictor Xj most correlated with r.
Move βj from 0 towards its least-squares coefficient <Xj, r>, until some other competitor Xk has as much correlation with the current residual as does Xj.
Move βj and βk in the direction defined by their joint least-squares coefficient of the current residual on (Xj, Xk), until some other competitor Xl has as much correlation with the current residual.
Continue in this way until all p predictors have been entered. After min(N-1, p) steps, we arrive at the full least-squares solution.
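
The steps above can be sketched in base R as follows. This is a minimal illustration, not the implementation of any package: the function name `lar_path`, the tolerance `tol`, and the simulated data are assumptions, and the sketch assumes more observations than predictors and a full-rank design.

```r
# Minimal LAR sketch (hypothetical helper, not a package function).
lar_path <- function(X, y, tol = 1e-10) {
  # Step 1: center each column and scale it to unit norm; center y.
  X <- scale(X, center = TRUE, scale = FALSE)
  X <- sweep(X, 2, sqrt(colSums(X^2)), "/")
  r <- y - mean(y)
  p <- ncol(X)
  beta <- numeric(p)
  active <- integer(0)
  steps <- min(nrow(X) - 1, p)
  path <- matrix(0, nrow = steps + 1, ncol = p)  # row 1: all coefficients zero
  for (k in seq_len(steps)) {
    cvec <- drop(crossprod(X, r))                # correlations <x_j, r>
    inactive <- setdiff(seq_len(p), active)
    # Admit the inactive predictor most correlated with the residual.
    active <- c(active, inactive[which.max(abs(cvec[inactive]))])
    C <- max(abs(cvec[active]))
    XA <- X[, active, drop = FALSE]
    # Joint least-squares coefficient of the current residual on the active set.
    delta <- drop(solve(crossprod(XA), crossprod(XA, r)))
    fit_dir <- drop(XA %*% delta)
    inactive <- setdiff(seq_len(p), active)
    alpha <- 1                                   # alpha = 1 is the full LS step
    if (length(inactive) > 0) {
      # Smallest step at which an inactive predictor ties with the active ones:
      # solve |c_k - alpha * a_k| = (1 - alpha) * C for each inactive k.
      a <- drop(crossprod(X[, inactive, drop = FALSE], fit_dir))
      ck <- cvec[inactive]
      cand <- c((C - ck) / (C - a), (C + ck) / (C + a))
      cand <- cand[is.finite(cand) & cand > tol & cand <= 1]
      if (length(cand) > 0) alpha <- min(cand)
    }
    beta[active] <- beta[active] + alpha * delta
    r <- r - alpha * fit_dir
    path[k + 1, ] <- beta
  }
  path
}

# Usage on simulated data (made-up coefficients).
set.seed(1)
X <- matrix(rnorm(50 * 4), nrow = 50)
y <- drop(X %*% c(3, 0, -2, 0)) + rnorm(50)
path <- lar_path(X, y)
print(round(path, 3))
```

Each row of `path` holds the coefficients (on the standardized scale) after one more predictor has entered; the last row should match ordinary least squares on the standardized predictors.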

Algorithm 3.2a Least Angle Regression: Lasso Modification