1.1.9 Orthogonal Matching Pursuit
OMP stands for Orthogonal Matching Pursuit. Matching Pursuit (MP) is a widely used algorithm in the field of sparse representation; OMP additionally orthogonalizes the full set of selected atoms at every step of the decomposition, which makes it converge faster than plain MP at the same level of accuracy.
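As a minimal sketch of how OMP is exposed in scikit-learn (the random dictionary, the 3-sparse coefficient vector, and the chosen atom indices below are illustrative assumptions, not from the original text):

import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.RandomState(0)
n_features, n_atoms = 50, 100
D = rng.randn(n_features, n_atoms)        # random over-complete dictionary
w_true = np.zeros(n_atoms)
w_true[[5, 23, 71]] = [1.5, -2.0, 0.8]    # 3-sparse coefficient vector (hypothetical)
y = D.dot(w_true)                         # noiseless observed signal

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3)  # select at most 3 atoms
omp.fit(D, y)
print(np.nonzero(omp.coef_)[0])           # should recover atoms 5, 23, 71

With a noiseless signal and a random Gaussian dictionary, OMP typically recovers the exact support of the sparse code.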
1.1.10 Bayesian Regression
1.1.10.1 Bayesian Ridge Regression
Example code for Bayesian Ridge Regression:
>>> from sklearn import linear_model
>>> X = [[0., 0.], [1., 1.], [2., 2.], [3., 3.]]
>>> Y = [0., 1., 2., 3.]
>>> clf = linear_model.BayesianRidge()
>>> clf.fit(X, Y)
BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True,
       fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300,
       normalize=False, tol=0.001, verbose=False)
>>> clf.predict([[1, 0.]])
array([ 0.50000013])
>>> clf.coef_
array([ 0.49999993,  0.49999993])
Note: besides Bayesian ridge regression, there are also Bayesian linear regression, Bayesian logistic regression, and others.
1.1.10.2 Automatic Relevance Determination
Commentary: ARD stands for Automatic Relevance Determination regression. It is closely related to Bayesian ridge regression, but instead of one shared precision for all weights it fits a separate precision per coefficient, which tends to prune away irrelevant features.
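A minimal sketch of ARD in scikit-learn, using the ARDRegression estimator on a small made-up dataset (the data and coefficients are illustrative assumptions):

import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
# only the first two features actually matter (hypothetical setup)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(100)

clf = ARDRegression()
clf.fit(X, y)
print(np.round(clf.coef_, 3))  # weights of the 3 irrelevant features shrink toward zero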
1.1.11 Logistic regression
Logistic regression is a linear model, used for classification rather than regression. With an L2 regularization term, the logistic regression cost function to minimize is

$$\min_{w, c} \frac{1}{2} w^T w + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i \left(X_i^T w + c\right)\right) + 1\right)$$

and with an L1 regularization term it becomes

$$\min_{w, c} \|w\|_1 + C \sum_{i=1}^{n} \log\left(\exp\left(-y_i \left(X_i^T w + c\right)\right) + 1\right)$$
Note: one-versus-rest means that during training, each class in turn is treated as the positive class while the samples of all remaining classes are grouped into a single negative class.
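As a small illustration of the one-versus-rest scheme (the iris data and wrapper below are an assumed setup, not from the original text), scikit-learn's OneVsRestClassifier fits one binary logistic regression per class:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
clf = OneVsRestClassifier(LogisticRegression(solver='liblinear'))
clf.fit(X, y)
print(len(clf.estimators_))  # 3: one binary classifier per class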
The solvers implemented in the LogisticRegression class include "liblinear", "newton-cg", "lbfgs", and "sag". An example follows:
Examples: L1 Penalty and Sparsity in Logistic Regression
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

digits = datasets.load_digits()
X, y = digits.data, digits.target
X = StandardScaler().fit_transform(X)

# classify small against large digits
y = (y > 4).astype(int)  # np.int is deprecated; use the built-in int

# Set regularization parameter
for i, C in enumerate((100, 1, 0.01)):
    # turn down tolerance for short training time;
    # the liblinear solver supports both the L1 and L2 penalties
    clf_l1_LR = LogisticRegression(C=C, penalty='l1', solver='liblinear', tol=0.01)
    clf_l2_LR = LogisticRegression(C=C, penalty='l2', solver='liblinear', tol=0.01)
    clf_l1_LR.fit(X, y)
    clf_l2_LR.fit(X, y)

    coef_l1_LR = clf_l1_LR.coef_.ravel()
    coef_l2_LR = clf_l2_LR.coef_.ravel()

    # coef_l1_LR contains zeros due to the
    # L1 sparsity inducing norm
    sparsity_l1_LR = np.mean(coef_l1_LR == 0) * 100
    sparsity_l2_LR = np.mean(coef_l2_LR == 0) * 100

    print("C=%.2f" % C)
    print("Sparsity with L1 penalty: %.2f%%" % sparsity_l1_LR)
    print("score with L1 penalty: %.4f" % clf_l1_LR.score(X, y))
    print("Sparsity with L2 penalty: %.2f%%" % sparsity_l2_LR)
    print("score with L2 penalty: %.4f" % clf_l2_LR.score(X, y))

    l1_plot = plt.subplot(3, 2, 2 * i + 1)
    l2_plot = plt.subplot(3, 2, 2 * (i + 1))
    if i == 0:
        l1_plot.set_title("L1 penalty")
        l2_plot.set_title("L2 penalty")

    l1_plot.imshow(np.abs(coef_l1_LR.reshape(8, 8)), interpolation='nearest',
                   cmap='binary', vmax=1, vmin=0)
    l2_plot.imshow(np.abs(coef_l2_LR.reshape(8, 8)), interpolation='nearest',
                   cmap='binary', vmax=1, vmin=0)
    plt.text(-8, 3, "C = %.2f" % C)

    l1_plot.set_xticks(())
    l1_plot.set_yticks(())
    l2_plot.set_xticks(())
    l2_plot.set_yticks(())

plt.show()
The output is as follows:
C=100.00
Sparsity with L1 penalty: 6.25%
score with L1 penalty: 0.9098
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.9098
C=1.00
Sparsity with L1 penalty: 9.38%
score with L1 penalty: 0.9104
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.9093
C=0.01
Sparsity with L1 penalty: 85.94%
score with L1 penalty: 0.8603
Sparsity with L2 penalty: 4.69%
score with L2 penalty: 0.8915
Commentary:
[1] fit_transform(X, y=None, **fit_params): fits the transformer to X (and optionally y) and returns the transformed version of X, combining fit and transform in a single call.
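For instance, with the StandardScaler used in the example above, fit_transform gives the same result as calling fit and then transform (a small illustrative check; the toy matrix is an assumption):

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[0., 10.], [1., 20.], [2., 30.]])
Xa = StandardScaler().fit_transform(X)      # fit and transform in one call
Xb = StandardScaler().fit(X).transform(X)   # same result in two steps
print(np.allclose(Xa, Xb))                  # True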
1.1.12 Stochastic Gradient Descent - SGD
Commentary: gradient descent methods include SGD, BGD, and MBGD, as listed below (see the sketch after this list):
[1] Stochastic gradient descent (SGD): performs one parameter update per training sample.
[2] Batch gradient descent (BGD): performs one parameter update per pass over the entire training set.
[3] Mini-batch gradient descent (MBGD): performs one parameter update per small subset (mini-batch) of samples.
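A minimal numpy sketch contrasting the three update schemes on least-squares linear regression (the step size, epoch count, and synthetic data are illustrative assumptions):

import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
y = X.dot(np.array([1.0, -2.0, 0.5])) + 0.01 * rng.randn(200)

def grad(w, Xb, yb):
    # gradient of the mean squared error on the batch (Xb, yb)
    return 2.0 * Xb.T.dot(Xb.dot(w) - yb) / len(yb)

def descend(batches, lr=0.1, epochs=50):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for idx in batches():
            w -= lr * grad(w, X[idx], y[idx])
    return w

n = len(y)
w_bgd = descend(lambda: [np.arange(n)])                           # BGD: one update per full pass
w_sgd = descend(lambda: [[i] for i in rng.permutation(n)])        # SGD: one update per sample
w_mbgd = descend(lambda: np.array_split(rng.permutation(n), 10))  # MBGD: one update per mini-batch
print(w_bgd, w_sgd, w_mbgd)                                       # all approach [1.0, -2.0, 0.5]

scikit-learn's SGDClassifier and SGDRegressor implement the per-sample variant.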
1.1.13 Perceptron
Commentary: in machine learning, the perceptron is a linear model for binary classification and a supervised learning algorithm. It is a discriminative model and a building block of both neural networks and support vector machines.
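A minimal sketch of scikit-learn's Perceptron class on a toy binary problem (the synthetic dataset is an illustrative assumption):

from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron

# toy binary classification problem (hypothetical data)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = Perceptron(tol=1e-3, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy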