Scikit-learn: Classification

-柚子皮-

Last modified 2023-05-30 14:45:15

First published 2016-11-04 14:38:13

Article link: https://blog.csdn.net/pipisorry/article/details/53034340

Training and deploying an LR model

[sklearn.linear_model.LogisticRegression]

from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)
y = (y > 4).astype(int)

# Train an LR model
clf_l2_LR = LogisticRegression(C=0.1, penalty="l2", tol=0.01, solver="saga")
clf_l2_LR.fit(X, y)
print(clf_l2_LR.predict(X[0:5]))

# Overwrite a new, unfitted LR model with the parameters of the trained one
clf_l2_LR1 = LogisticRegression(C=0.1, penalty="l2", tol=0.01, solver="saga")
clf_l2_LR1.coef_ = clf_l2_LR.coef_
clf_l2_LR1.intercept_ = clf_l2_LR.intercept_
clf_l2_LR1.classes_ = clf_l2_LR.classes_
print(clf_l2_LR1.predict(X[0:5]))

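As shown above, coef_, intercept_ and classes_ are all that is needed to reuse a previously trained model.
So for deployment we only need to save these parameters of the trained model:
import json
# coef_ is stored transposed here, so transpose it back after loading
model_params = {"intercept_": clf_l2_LR.intercept_.tolist(),
                "coef_": clf_l2_LR.coef_.T.tolist(),
                "classes_": clf_l2_LR.classes_.tolist()}
with open("lr_binary_parameter.json", "w") as f:
    json.dump(model_params, f)
At prediction time, load these parameters and call predict.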
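The load step is only described in words, so here is a minimal sketch of the full round trip. It assumes coef_ was saved transposed (as in the snippet above) and relies on the behaviour demonstrated earlier, that predict only needs coef_, intercept_ and classes_. The names trained, restored and path are made up for illustration, and the temporary directory is just a convenient place to write the file.

```python
import json
import os
import tempfile

import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Same binary task as above: digits 0-4 vs. 5-9
X, y = datasets.load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)
y = (y > 4).astype(int)

trained = LogisticRegression(C=0.1, tol=0.01, solver="saga")
trained.fit(X, y)

# Save: plain lists are JSON-serializable; coef_ is written transposed
params = {
    "intercept_": trained.intercept_.tolist(),
    "coef_": trained.coef_.T.tolist(),
    "classes_": trained.classes_.tolist(),
}
path = os.path.join(tempfile.mkdtemp(), "lr_binary_parameter.json")
with open(path, "w") as f:
    json.dump(params, f)

# Load: rebuild the arrays and undo the transpose on coef_
with open(path) as f:
    loaded = json.load(f)

restored = LogisticRegression()
restored.coef_ = np.array(loaded["coef_"]).T
restored.intercept_ = np.array(loaded["intercept_"])
restored.classes_ = np.array(loaded["classes_"])

print(restored.predict(X[0:5]))
print(trained.predict(X[0:5]))
```

Both print calls should show the same labels. Any preprocessing (here the StandardScaler) has to be reproduced at prediction time as well, so its mean_ and scale_ would need to be saved in the same way.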

[coef_ has shape (3, 2)]

[L1 Penalty and Sparsity in Logistic Regression — scikit-learn 1.2.2 documentation]

Support vector machine (SVM) classification

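There are several different algorithms for SVM classification. SVM is a very popular machine learning algorithm used mainly for classification problems; like logistic regression, it can handle multi-class classification with a one-vs-rest approach.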

SVC

Implementation of Support Vector Machine classifier using libsvm: the kernel can be non-linear but its SMO algorithm does not scale to large number of samples as LinearSVC does. Furthermore SVC multi-class mode is implemented using one vs one scheme while LinearSVC uses one vs the rest. It is possible to implement one vs the rest with SVC by using the sklearn.multiclass.OneVsRestClassifier wrapper. Finally SVC can fit dense data without memory copy if the input is C-contiguous. Sparse data will still incur memory copy though.
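To make the one-vs-one / one-vs-rest distinction concrete, here is a small sketch on the digits data (10 classes). The 500-sample subset and the linear kernel are arbitrary choices to keep it fast, and ovo / ovr are illustrative names.

```python
from sklearn import datasets
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = datasets.load_digits(return_X_y=True)
X, y = X[:500], y[:500]  # small subset, still contains all 10 digits

# SVC trains one binary classifier per pair of classes (one-vs-one)
ovo = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
ovo_shape = ovo.decision_function(X[:2]).shape

# Wrapping SVC in OneVsRestClassifier trains one classifier per class
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)
n_ovr_estimators = len(ovr.estimators_)

print(ovo_shape)         # one column per class pair: 10 * 9 / 2 = 45
print(n_ovr_estimators)  # one estimator per class: 10
```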

class sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape=None, random_state=None)

Common parameters

probability : boolean, optional (default=False)

Whether to enable probability estimates. This must be enabled prior to calling fit, and will slow down that method.

Common attributes

coef_ : array, shape = [n_class * (n_class-1) / 2, n_features]. Weights assigned to the features; only available with a linear kernel.
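A quick check of coef_ on the iris data (3 classes, 4 features) as a sketch. The variable names are illustrative only.

```python
from sklearn import datasets
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)  # 3 classes, 4 features

# With a linear kernel, coef_ has one row per pair of classes
linear_clf = SVC(kernel="linear").fit(X, y)
coef_shape = linear_clf.coef_.shape

# With a non-linear kernel there is no weight vector in input space,
# so accessing coef_ raises AttributeError and hasattr returns False
rbf_clf = SVC(kernel="rbf").fit(X, y)
rbf_has_coef = hasattr(rbf_clf, "coef_")

print(coef_shape)    # (3, 4): 3 * 2 / 2 = 3 class pairs, 4 features
print(rbf_has_coef)  # False
```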

Common methods

decision_function(X)	Distance of the samples X to the separating hyperplane.
fit(X, y[, sample_weight])	Fit the SVM model according to the given training data.
get_params([deep])	Get parameters for this estimator.
predict(X)	Perform classification on samples in X.
score(X, y[, sample_weight])	Returns the mean accuracy on the given test data and labels.
set_params(**params)	Set the parameters of this estimator.

If probability=True was set before fitting, the probability output method can be used:

predict_proba

Compute probabilities of possible outcomes for samples in X.

The model needs to have probability information computed at training time: fit with attribute probability set to True.

Parameters:

X : array-like, shape (n_samples, n_features)

For kernel="precomputed", the expected shape of X is [n_samples_test, n_samples_train]

Returns:

T : array-like, shape (n_samples, n_classes)

Returns the probability of the sample for each class in the model. The columns correspond to the classes in sorted order, as they appear in the attribute classes_.

Notes: The probability model is created using cross validation, so the results can be slightly different than those obtained by predict. Also, it will produce meaningless results on very small datasets.
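A minimal sketch of predict_proba on the iris data. random_state=0 is an arbitrary choice, added only because the internal cross validation shuffles the data; recent scikit-learn releases may emit a warning about the probability parameter.

```python
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# probability=True must be set before fit; it makes fitting slower
clf = SVC(probability=True, random_state=0)
clf.fit(X, y)

proba = clf.predict_proba(X[:3])
print(clf.classes_)    # column order of proba
print(proba.round(3))  # one row per sample, one column per class
```

Each row sums to 1. As the note above says, the class with the highest probability can occasionally differ from what predict returns.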

Usage example

>>> import numpy as np
>>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>> y = np.array([1, 1, 2, 2])
>>> from sklearn.svm import SVC
>>> clf = SVC()
>>> clf.fit(X, y) 
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)
>>> print(clf.predict([[-0.8, -1]]))
[1]

[sklearn.svm.SVC]

LinearSVC

Scalable linear Support Vector Machine for classification, implemented using liblinear. Check the "See also" section of the LinearSVC documentation for more points of comparison.
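For completeness, a minimal LinearSVC sketch on the scaled digits data. C=0.1 and max_iter=5000 are arbitrary values chosen so liblinear converges comfortably; they are not recommendations.

```python
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = datasets.load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)

# LinearSVC uses one-vs-rest: one weight vector per class
clf = LinearSVC(C=0.1, max_iter=5000)
clf.fit(X, y)

train_accuracy = clf.score(X, y)
print(clf.coef_.shape)  # (10, 64): 10 classes, 64 pixel features
print(train_accuracy)
```

Unlike SVC, LinearSVC has no kernel parameter and no predict_proba; decision_function returns one score per class.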
