Ridge Classifier: RidgeClassifier and RidgeCV (Code Walkthrough)

Latest recommended article published on 2025-04-01 17:17:44

管牛牛

Article link: https://blog.csdn.net/LOLUN9/article/details/106012418

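Because of length limits, last time I only gave a brief introduction to ridge regression. Today let's look at how to use ridge regression for classification: the ridge classifier.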

RidgeClassifier

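The ridge regressor has a classifier variant, RidgeClassifier, which is sometimes referred to as a least-squares support vector machine with a linear kernel. The classifier first converts binary targets to {-1, 1} and then treats the problem as a regression task, optimizing the same objective as ridge regression. The predicted class corresponds to the sign of the regressor's prediction. For multiclass classification, the problem is treated as multi-output regression, and the predicted class is the one whose output value is largest. The classifier fits the model with a (penalized) least-squares loss rather than the more traditional logistic or hinge loss (the max-margin loss). In practice, all of these models can yield similar cross-validation scores in terms of accuracy or precision/recall, while the penalized least-squares loss used by RidgeClassifier allows a very different choice of numerical solvers with distinct computational performance profiles.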
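The "convert to {-1, 1}, regress, take the sign" description can be checked by hand. The snippet below is a minimal sketch on synthetic data (the dataset and the variable names are my own choices for illustration, not from the original post): it fits a plain Ridge regressor on signed targets and compares the result with RidgeClassifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge, RidgeClassifier

# Synthetic binary problem (illustrative data, not from the article).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Step 1: map the binary labels {0, 1} to regression targets {-1, +1}.
y_signed = np.where(y == 1, 1.0, -1.0)

# Step 2: fit an ordinary ridge regressor on the signed targets.
reg = Ridge(alpha=1.0).fit(X, y_signed)

# Step 3: the predicted class is given by the sign of the regression output.
manual_pred = np.where(reg.predict(X) > 0, 1, 0)

# The built-in classifier should agree with the manual pipeline.
clf = RidgeClassifier(alpha=1.0).fit(X, y)
print("same predictions:", np.array_equal(manual_pred, clf.predict(X)))
print("max coef difference:", np.abs(reg.coef_ - clf.coef_.ravel()).max())
```

With the same alpha, both models solve the same penalized least-squares problem, so the coefficients should match up to floating-point error.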

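RidgeClassifier can be significantly faster than, for example, LogisticRegression, especially when there are many classes, because it only needs to compute the projection matrix once. Let's see how to call it in sklearn.

#Function signature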

sklearn.linear_model.RidgeClassifier(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, class_weight=None, solver='auto', random_state=None)

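#Parameters

##alpha: float, default=1.0. Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC (more precisely, recent scikit-learn documentation gives the correspondence as 1 / (2C)).

##fit_intercept: bool, default=True. Whether to calculate the intercept for this model. If set to False, no intercept is used in the calculations.

##normalize: bool, default=False. Ignored when fit_intercept is set to False. If True, the regressors X are normalized before regression by subtracting the mean and dividing by the l2-norm. Note that this parameter was deprecated in scikit-learn 1.0 and removed in 1.2; with newer versions, scale the data beforehand instead, for example with StandardScaler.

##copy_X: bool, default=True. If True, X is copied; otherwise it may be overwritten.

##max_iter: int, default=None. Maximum number of iterations for the conjugate gradient solver.

##tol: float, default=1e-3. Precision of the solution.

##class_weight: dict or 'balanced', default=None. Weights associated with classes, in the form {class_label: weight}. If not given, all classes are assumed to have weight one. The 'balanced' mode uses the values of y to automatically adjust weights inversely proportional to the class frequencies in the input data.

##solver: {'auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga'}, default='auto'. The solver used in the computational routines:

'auto': chooses the solver automatically based on the type of data.

'svd': uses a singular value decomposition of X to compute the ridge coefficients. More stable for singular matrices than 'cholesky'.

'cholesky': uses the standard scipy.linalg.solve function to obtain a closed-form solution.

'sparse_cg': uses the conjugate gradient solver found in scipy.sparse.linalg.cg. As an iterative algorithm, this solver is more appropriate than 'cholesky' for large-scale data (tol and max_iter can be set).

'lsqr': uses the dedicated regularized least-squares routine scipy.sparse.linalg.lsqr. It is the fastest and uses an iterative procedure.

'sag' and 'saga': 'sag' uses Stochastic Average Gradient descent, and 'saga' uses its unbiased and more flexible variant, SAGA. Both methods use an iterative procedure and are often faster than the other solvers when both n_samples and n_features are large.

##random_state: int, RandomState instance, default=None. The seed of the pseudo-random number generator used when shuffling the data (relevant for the 'sag' and 'saga' solvers). If int, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.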
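To get a feel for the solver parameter, here is a small sketch that fits the same problem with several solvers and compares the results. The synthetic dataset, the tighter tol=1e-6, and the dictionary names are assumptions made for this illustration only; timings are deliberately not reported, since they depend heavily on data size and hardware.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifier
from sklearn.preprocessing import StandardScaler

# Illustrative synthetic data, standardized so all solvers behave well.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)
X = StandardScaler().fit_transform(X)

coefs = {}
accuracies = {}
for solver in ["svd", "cholesky", "lsqr", "sparse_cg"]:
    # tol only matters for the iterative solvers (lsqr, sparse_cg).
    clf = RidgeClassifier(alpha=1.0, solver=solver, tol=1e-6).fit(X, y)
    coefs[solver] = clf.coef_.ravel()
    accuracies[solver] = clf.score(X, y)
    print(f"{solver:>10}: train accuracy = {accuracies[solver]:.3f}")

# The two direct solvers compute the same closed-form solution.
print("svd vs cholesky max diff:",
      np.abs(coefs["svd"] - coefs["cholesky"]).max())
```

All solvers minimize the same objective, so on a well-conditioned problem like this one the choice mostly affects speed and memory use rather than the fitted model.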

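#Attributes

##coef_: ndarray of shape (1, n_features) or (n_classes, n_features)

Coefficients of the features in the decision function. coef_ has shape (1, n_features) when the given problem is binary.

##intercept_: float or ndarray of shape (n_targets,)

Independent term (intercept) in the decision function. Set to 0.0 if fit_intercept = False.

##n_iter_: None or ndarray of shape (n_targets,)

Actual number of iterations for each target. Only available for the sag and lsqr solvers; other solvers return None.

##classes_: ndarray of shape (n_classes,)

The class labels.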
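The shapes listed above are easy to confirm on two datasets that ship with scikit-learn. This is a quick sketch; the variable names are my own.

```python
from sklearn.datasets import load_breast_cancer, load_iris
from sklearn.linear_model import RidgeClassifier

# Binary problem: 569 samples, 30 features, classes 0 and 1.
X_bin, y_bin = load_breast_cancer(return_X_y=True)
clf_bin = RidgeClassifier().fit(X_bin, y_bin)
print(clf_bin.classes_)            # the two class labels
print(clf_bin.coef_.shape)         # (1, n_features) in the binary case

# Multiclass problem: 150 samples, 4 features, 3 classes.
X_multi, y_multi = load_iris(return_X_y=True)
clf_multi = RidgeClassifier().fit(X_multi, y_multi)
print(clf_multi.coef_.shape)       # (n_classes, n_features)
print(clf_multi.intercept_.shape)  # one intercept per class
```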

#Example

>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.linear_model import RidgeClassifier
>>> X, y = load_breast_cancer(return_X_y=True)
>>> clf = RidgeClassifier().fit(X, y)
>>> clf.score(X, y)
0.9595...

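#Methods

##decision_function(self, X): predict confidence scores for samples

Parameters

X: array_like or sparse matrix, shape (n_samples, n_features). The samples.

Returns

array, shape=(n_samples,) if n_classes == 2 else (n_samples, n_classes). Confidence scores for each (sample, class) combination. In the binary case, this is the confidence score for self.classes_[1], where a value > 0 means that this class would be predicted.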
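The relationship between decision_function and predict can be reproduced manually. The following is a sketch using two bundled datasets (variable names are illustrative): in the binary case a positive score selects classes_[1], and in the multiclass case the class with the largest score wins.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer, load_iris
from sklearn.linear_model import RidgeClassifier

# Binary case: one score per sample.
X_bin, y_bin = load_breast_cancer(return_X_y=True)
clf_bin = RidgeClassifier().fit(X_bin, y_bin)
scores_bin = clf_bin.decision_function(X_bin)
# A score > 0 means classes_[1] is predicted, otherwise classes_[0].
pred_bin = clf_bin.classes_[(scores_bin > 0).astype(int)]

# Multiclass case: one score per (sample, class) pair.
X_multi, y_multi = load_iris(return_X_y=True)
clf_multi = RidgeClassifier().fit(X_multi, y_multi)
scores_multi = clf_multi.decision_function(X_multi)
# The predicted class is the one with the largest score.
pred_multi = clf_multi.classes_[scores_multi.argmax(axis=1)]

print(scores_bin.shape, scores_multi.shape)
print(np.array_equal(pred_bin, clf_bin.predict(X_bin)))
print(np.array_equal(pred_multi, clf_multi.predict(X_multi)))
```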

##fit(self, X, y, sample_weight=None): fit the ridge classifier model

Parameters

X: {ndarray, sparse matrix} of shape (n_samples, n_features). Training data.

y: ndarray of shape (n_samples,). Target labels.

sample_weight: float or ndarray of shape (n_samples,), default=None. Individual weight for each sample. If a float is given, every sample has the same weight.

Returns

self: object. The fitted estimator instance.

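##get_params(self, deep=True): get the parameters of this estimator

Parameters

deep: bool, default=True. If True, returns the parameters of this estimator and of any contained sub-objects that are estimators.

Returns

params: a mapping from parameter names to their values.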

##predict(self, X): predict class labels for the samples in X

Parameters

X: array_like or sparse matrix, shape (n_samples, n_features). The samples to predict.

Returns

C: array, shape [n_samples]. The predicted class label for each sample.

##score(self, X, y, sample_weight=None): return the mean accuracy on the given test data and labels

Parameters

X: array-like of shape (n_samples, n_features). Test samples.

y: array-like of shape (n_samples,) or (n_samples, n_outputs). True labels for the test samples.

sample_weight: array-like of shape (n_samples,), default=None. Sample weights.

Returns

score: float. Mean accuracy.

##set_params(self, **params): set the parameters of this estimator

Parameters

**params: dict. Estimator parameters (as a dict).

Returns

self: the estimator instance.
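Here is a short sketch of how get_params, set_params, and score fit together. The chosen values (alpha=10.0, class_weight="balanced") are arbitrary and only serve to show the calls.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)

clf = RidgeClassifier()
params = clf.get_params()           # dict of constructor parameters
print(params["alpha"], params["solver"])

# set_params updates the estimator in place and returns the estimator itself.
returned = clf.set_params(alpha=10.0, class_weight="balanced")
clf.fit(X, y)

# score() is the mean accuracy, i.e. the same as accuracy_score on predict().
acc = clf.score(X, y)
manual = accuracy_score(y, clf.predict(X))
print(acc, manual)
```

Because set_params returns the estimator, the calls can also be chained, for example clf.set_params(alpha=10.0).fit(X, y).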

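Finally, the ridge classifier is used to classify text documents and is compared with other classifiers; the results are shown in the figure below.

As the figure shows, the ridge classifier achieves a relatively high score with a short training time.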

RidgeCV

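RidgeCV implements ridge regression with built-in cross-validation of the alpha parameter. The object works in the same way as GridSearchCV, except that it defaults to Generalized Cross-Validation (GCV), an efficient form of leave-one-out cross-validation.

#Function signature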
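As a minimal sketch of the built-in alpha search, the snippet below passes a grid of candidate values and reads back the one that was selected. The grid (np.logspace(-3, 1, 9)) is my own choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RidgeCV

X, y = load_diabetes(return_X_y=True)

# Candidate regularization strengths (an arbitrary illustrative grid).
alphas = np.logspace(-3, 1, 9)

# With cv=None (the default), the efficient leave-one-out scheme is used.
model = RidgeCV(alphas=alphas).fit(X, y)

print("selected alpha:", model.alpha_)
print("R^2 on training data:", model.score(X, y))
```

After fitting, the model has been refit on the full data with the selected alpha, so it can be used directly for prediction.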

sklearn.linear_model.RidgeCV(alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, gcv_mode=None, store_cv_values=False)

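#Parameters

##alphas: ndarray of shape (n_alphas,), default=(0.1, 1.0, 10.0). Array of alpha values to try. Each is a regularization strength and must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization. alpha corresponds to C^-1 in other linear models such as LogisticRegression or LinearSVC (more precisely, recent scikit-learn documentation gives the correspondence as 1 / (2C)).

##fit_intercept: bool, default=True. Whether to calculate the intercept for this model. If set to False, no intercept is used in the calculations.

##normalize: bool, default=False. Ignored when fit_intercept is set to False. If True, the regressors X are normalized before regression by subtracting the mean and dividing by the l2-norm. Note that this parameter was deprecated in scikit-learn 1.0 and removed in 1.2; with newer versions, scale the data beforehand instead, for example with StandardScaler.

##scoring: string, callable, default=None. A string or a scorer callable object/function with signature scorer(estimator, X, y). If None, the negative mean squared error is used when cv is 'auto' or None (i.e. when using generalized cross-validation), and the r2 score otherwise.

##cv: int, cross-validation generator or an iterable, default=None. Determines the cross-validation splitting strategy. With None (the default), the efficient leave-one-out cross-validation (also known as generalized cross-validation) is used. It can also be an integer specifying the number of folds, or an iterable yielding (train, test) splits as arrays of indices.

##gcv_mode: {'auto', 'svd', 'eigen'}, default='auto'. Flag indicating which strategy to use when performing generalized cross-validation. In the signature above the default is written as None, which is treated like 'auto'. Options are:

'auto': use 'svd' if n_samples > n_features, otherwise use 'eigen';

'svd': force use of the singular value decomposition of X when X is dense, and of the eigenvalue decomposition of X^T.X when X is sparse;

'eigen': force computation via the eigendecomposition of X.X^T.

##store_cv_values: bool, default=False. Flag indicating whether the cross-validation values corresponding to each alpha should be stored in the cv_values_ attribute. This flag is only compatible with cv=None (i.e. using generalized cross-validation).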
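The cv and scoring parameters can be combined to replace the default leave-one-out scheme. Below is a sketch that contrasts the default with an explicit 5-fold split scored by mean absolute error; the KFold settings and the scoring string are choices made for this example, not recommendations from the original post.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

X, y = load_diabetes(return_X_y=True)
alphas = [1e-3, 1e-2, 1e-1, 1.0]

# Default: cv=None, i.e. efficient leave-one-out (generalized) cross-validation.
gcv_model = RidgeCV(alphas=alphas).fit(X, y)

# Explicit 5-fold cross-validation with a different scoring function.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
kfold_model = RidgeCV(alphas=alphas, cv=kfold,
                      scoring="neg_mean_absolute_error").fit(X, y)

print("alpha chosen by GCV:   ", gcv_model.alpha_)
print("alpha chosen by 5-fold:", kfold_model.alpha_)
```

The two strategies may or may not agree on the best alpha, since they use different splits and different scoring functions.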

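#Attributes

##cv_values_: ndarray of shape (n_samples, n_alphas) or shape (n_samples, n_targets, n_alphas), optional

Cross-validation values for each alpha (only available if store_cv_values=True and cv=None). After fit() has been called, this attribute contains the mean squared errors (by default) or the values of the {loss,score}_func function.

##coef_: ndarray of shape (n_features) or (n_targets, n_features)

Weight vector(s), i.e. the coefficients of the features.

##intercept_: float or ndarray of shape (n_targets,)

Independent term (intercept) in the decision function. Set to 0.0 if fit_intercept = False.

##alpha_: float

The estimated regularization parameter, i.e. the alpha selected by cross-validation.

#Example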

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import RidgeCV
>>> X, y = load_diabetes(return_X_y=True)
>>> clf = RidgeCV(alphas=[1e-3, 1e-2, 1e-1, 1]).fit(X, y)
>>> clf.score(X, y)
0.5166...

#Methods

##fit(self, X, y, sample_weight=None): fit the ridge regression model with cross-validation

Parameters

X: {ndarray, sparse matrix} of shape (n_samples, n_features). Training data.

y: ndarray of shape (n_samples,). Target values.

sample_weight: float or ndarray of shape (n_samples,), default=None. Individual weight for each sample. If a float is given, every sample has the same weight.

Returns

self: object. The fitted estimator instance.

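##get_params(self, deep=True): get the parameters of this estimator

Parameters

deep: bool, default=True. If True, returns the parameters of this estimator and of any contained sub-objects that are estimators.

Returns

params: a mapping from parameter names to their values.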

##predict(self, X): predict target values for the samples in X

Parameters

X: array_like or sparse matrix, shape (n_samples, n_features). The samples to predict.

Returns

C: array, shape [n_samples]. The predicted value for each sample.

##score(self, X, y, sample_weight=None): return the coefficient of determination R^2 of the prediction

Parameters

X: array-like of shape (n_samples, n_features). Test samples.

y: array-like of shape (n_samples,) or (n_samples, n_outputs). True target values for the test samples.

sample_weight: array-like of shape (n_samples,), default=None. Sample weights.

Returns

score: float. The R^2 of predicting y from X.

##set_params(self, **params): set the parameters of this estimator

Parameters

**params: dict. Estimator parameters (as a dict).

Returns

self: the estimator instance.
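Unlike RidgeClassifier, RidgeCV is a regressor, so predict returns continuous values and score returns R^2 rather than accuracy. The sketch below checks this on a held-out split; the train/test split and its random_state are assumptions added for illustration.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RidgeCV(alphas=[1e-3, 1e-2, 1e-1, 1]).fit(X_train, y_train)

# predict() returns one continuous value per test sample.
y_pred = model.predict(X_test)

# score() is the coefficient of determination R^2 of those predictions.
r2_method = model.score(X_test, y_test)
r2_manual = r2_score(y_test, y_pred)
print(r2_method, r2_manual)
```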

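Below, extremely randomized trees regression, K-nearest-neighbors regression, linear regression, and RidgeCV regression are each used to predict the lower half of a face from its upper half: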

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces
from sklearn.utils.validation import check_random_state
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import RidgeCV
# Load the faces datasets
data = fetch_olivetti_faces()
targets = data.target
data = data.images.reshape((len(data.images), -1))
train = data[targets < 30]
test = data[targets >= 30]  # Test on independent people
# Test on a subset of people
n_faces = 5
rng = check_random_state(4)
face_ids = rng.randint(test.shape[0], size=(n_faces, ))
test = test[face_ids, :]
n_pixels = data.shape[1]
# Upper half of the faces
X_train = train[:, :(n_pixels + 1) // 2]
# Lower half of the faces
y_train = train[:, n_pixels // 2:]
X_test = test[:, :(n_pixels + 1) // 2]
y_test = test[:, n_pixels // 2:]
# Fit estimators
ESTIMATORS = {
    "Extra trees": ExtraTreesRegressor(n_estimators=10, max_features=32,
                                       random_state=0),
    "K-nn": KNeighborsRegressor(),
    "Linear regression": LinearRegression(),
    "Ridge": RidgeCV(),
}
y_test_predict = dict()
for name, estimator in ESTIMATORS.items():
    estimator.fit(X_train, y_train)
    y_test_predict[name] = estimator.predict(X_test)
# Plot the completed faces
image_shape = (64, 64)
n_cols = 1 + len(ESTIMATORS)
plt.figure(figsize=(2. * n_cols, 2.26 * n_faces))
plt.suptitle("Face completion with multi-output estimators", size=16)
for i in range(n_faces):
    true_face = np.hstack((X_test[i], y_test[i]))
    if i:
        sub = plt.subplot(n_faces, n_cols, i * n_cols + 1)
    else:
        sub = plt.subplot(n_faces, n_cols, i * n_cols + 1,
                          title="true faces")
    sub.axis("off")
    sub.imshow(true_face.reshape(image_shape),
               cmap=plt.cm.gray,
               interpolation="nearest")
    for j, est in enumerate(sorted(ESTIMATORS)):
        completed_face = np.hstack((X_test[i], y_test_predict[est][i]))
        if i:
            sub = plt.subplot(n_faces, n_cols, i * n_cols + 2 + j)
        else:
            sub = plt.subplot(n_faces, n_cols, i * n_cols + 2 + j,
                              title=est)
        sub.axis("off")
        sub.imshow(completed_face.reshape(image_shape),
                   cmap=plt.cm.gray,
                   interpolation="nearest")
plt.show()

The prediction results are shown below: