3.Lasso线性模型

最新推荐文章于 2024-06-07 23:48:24 发布

alex_陈

最新推荐文章于 2024-06-07 23:48:24 发布

阅读量1.6w

点赞数 2

分类专栏：机器学习算法文章标签：机器学习 python

本文链接：https://blog.csdn.net/qq_21904665/article/details/52312382

版权

机器学习算法专栏收录该内容

12 篇文章 0 订阅

订阅专栏

Lasso 是一种估计稀疏线性模型的方法.由于它倾向具有少量参数值的情况，对于给定解决方案是相关情况下，有效的减少了变量数量。因此，Lasso及其变种是压缩感知(压缩采样)的基础。在约束条件下，它可以回复一组非零精确的权重系数（参考下文中的 CompressIve sensing(压缩感知：重建医学图像通过lasso L₁）点击打开链接）。

用数学形式表达，Lasso 包含一个使用 $ell_1$ 先验作为正则化因子的线性模型。其目标函数是最小化:

lasso 解决带 $\alpha ||w||_1$ 惩罚项的最小平方和，其中 $\alpha$ 是一个常量， $||w||_1$ 是参数向量的 $\ell_1$ -norm

Lasso 类实现使用了坐标下降法(一种非梯度优化算法) 来拟合系数.参考另一种实现 Least Angle Regression最小角回归

from sklearn import linear_model
clf = linear_model.Lasso(alpha = 0.1)
clf.fit([[0, 0], [1, 1]], [0, 1])
Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,normalize=False, positive=False, precompute=False, random_state=None,selection='cyclic', tol=0.0001, warm_start=False)
clf.predict([[1, 1]])

函数 lasso_path 对于lower-level任务非常有用。它能够通过搜索所有可能的路径上的值来计算系数.

3.1 设置正则化参数

alpha 参数控制估计的系数的稀疏程度。

3.1.1 使用交叉验证

scikit-learn 暴露以下两个类 LassoCV 和 LassoLarsCV 可以设置 Lasso alpha 参数.
LassoCV 基于下面解释的算法 Least Angle Regression最小角回归
对于含有很多共线性的高维的数据集，LassoCV 是最合适不过了。
然而，LassoLarsCV 在寻找 alpha 参数更相关的值时更具有优势，并且如果样本相比于观测的数量时，
通常比 LassoCV 更快.

3.1.2 基于模型选择的信息约束

LassoLarsIC 建议使用Akaike information criterion (AIC) 和 Bayes Information criterion (BIC)。由于在计算:math:alpha 过程中，当使用k-折交叉验证的时候，正则化路径只计算1次而不是k+1次，所以在计算上代价非常小。然而，这种约束需要一个合适的对于解的自由度的估计（可参考矩阵的解的自由度）,这可以从大量的样本（渐进结果）导出并且假设模型是正确的。例如，数据实际上是有该模型产生的，但是当问题是病态条件时这种数据可能会有问题(参考病态矩阵，条件数等概念)，比如特征维数大于样本数.（小样本问题）

print(__doc__)

import time

import numpy as np
import matplotlib.pyplot as plt

from sklearn.linear_model import LassoCV, LassoLarsCV, LassoLarsIC
from sklearn import datasets

diabetes = datasets.load_diabetes()
X = diabetes.data
y = diabetes.target

rng = np.random.RandomState(42)
X = np.c_[X, rng.randn(X.shape[0], 14)]  # add some bad features

# normalize data as done by Lars to allow for comparison
X /= np.sqrt(np.sum(X ** 2, axis=0))

##############################################################################
# LassoLarsIC: least angle regression with BIC/AIC criterion

model_bic = LassoLarsIC(criterion='bic')
t1 = time.time()
model_bic.fit(X, y)
t_bic = time.time() - t1
alpha_bic_ = model_bic.alpha_

model_aic = LassoLarsIC(criterion='aic')
model_aic.fit(X, y)
alpha_aic_ = model_aic.alpha_


def plot_ic_criterion(model, name, color):
    alpha_ = model.alpha_
    alphas_ = model.alphas_
    criterion_ = model.criterion_
    plt.plot(-np.log10(alphas_), criterion_, '--', color=color,
             linewidth=3, label='%s criterion' % name)
    plt.axvline(-np.log10(alpha_), color=color, linewidth=3,
                label='alpha: %s estimate' % name)
    plt.xlabel('-log(alpha)')
    plt.ylabel('criterion')

plt.figure()
plot_ic_criterion(model_aic, 'AIC', 'b')
plot_ic_criterion(model_bic, 'BIC', 'r')
plt.legend()
plt.title('Information-criterion for model selection (training time %.3fs)'
          % t_bic)

##############################################################################
# LassoCV: coordinate descent

# Compute paths
print("Computing regularization path using the coordinate descent lasso...")
t1 = time.time()
model = LassoCV(cv=20).fit(X, y)
t_lasso_cv = time.time() - t1

# Display results
m_log_alphas = -np.log10(model.alphas_)

plt.figure()
ymin, ymax = 2300, 3800
plt.plot(m_log_alphas, model.mse_path_, ':')
plt.plot(m_log_alphas, model.mse_path_.mean(axis=-1), 'k',
         label='Average across the folds', linewidth=2)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
            label='alpha: CV estimate')

plt.legend()

plt.xlabel('-log(alpha)')
plt.ylabel('Mean square error')
plt.title('Mean square error on each fold: coordinate descent '
          '(train time: %.2fs)' % t_lasso_cv)
plt.axis('tight')
plt.ylim(ymin, ymax)

##############################################################################
# LassoLarsCV: least angle regression

# Compute paths
print("Computing regularization path using the Lars lasso...")
t1 = time.time()
model = LassoLarsCV(cv=20).fit(X, y)
t_lasso_lars_cv = time.time() - t1

# Display results
m_log_alphas = -np.log10(model.cv_alphas_)

plt.figure()
plt.plot(m_log_alphas, model.cv_mse_path_, ':')
plt.plot(m_log_alphas, model.cv_mse_path_.mean(axis=-1), 'k',
         label='Average across the folds', linewidth=2)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
            label='alpha CV')
plt.legend()

plt.xlabel('-log(alpha)')
plt.ylabel('Mean square error')
plt.title('Mean square error on each fold: Lars (train time: %.2fs)'
          % t_lasso_lars_cv)
plt.axis('tight')
plt.ylim(ymin, ymax)

plt.show()

alex_陈

关注

2
点赞
踩
30

收藏

觉得还不错? 一键收藏
3
评论
3.Lasso线性模型

Lasso 是一种估计稀疏线性模型的方法.由于它倾向具有少量参数值的情况，对于给定解决方案是相关情况下，有效的减少了变量数量。因此，Lasso及其变种是压缩感知(压缩采样)的基础。在约束条件下，它可以回复一组非零精确的权重系数（参考下文中的 CompressIve sensing(压缩感知：重建医学图像通过lasso L1））。用数学形式表达，Lasso 包含一个使用先验作为正则化因子的
复制链接

扫一扫