sklearn之Multiclass 估计器

sklearn.multiclass 可以处理多类别 (multi-class) 的多标签 (multi-label) 的分类问题。

多类别分类

手写数字有 0-9 十类,但手头上只有两分类估计器 (比如像支撑向量机) 怎么用呢?我们可以采取下面三种常见策略:

  • 一对一 (One vs One, OvO):一个分类器用来处理数字 0 和数字 1,一个用来处理数字 0 和数字 2,一个用来处理数字 1 和 2,以此类推。N 个类需要 N(N-1)/2 个分类器。

  • 一对其他 (One vs All, OvA):训练 10 个二分类器,每一个对应一个数字,第一个分类 1 和「非1」,第二个分类 2 和「非2」,以此类推。N 个类需要 N 个分类器。

OneVsOneClassifier:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.multiclass import OneVsOneClassifier
from sklearn.linear_model import LogisticRegression
from sklearn import metrics#metrics 来计算各种性能指标

digits = load_digits()

X_train, X_test, y_train, y_test= train_test_split( digits['data'], digits['target'], test_size=0.2 )
print( 'The size of X_train is ', X_train.shape )
print( 'The size of y_train is ', y_train.shape )
print( 'The size of X_test is ', X_test.shape )
print( 'The size of y_test is ', y_test.shape )

fig,axes = plt.subplots(10,10,figsize=(8,8))
fig.subplots_adjust(hspace=0.1,wspace=0.1)

for i,ax in enumerate(axes.flat):
    ax.imshow(X_train[i,:].reshape(8,8),cmap="binary",interpolation="nearest")
    ax.text(0.05,0.05,str(y_train[i]),transform=ax.transAxes,color="blue")
    ax.set_xticks([])
    ax.set_yticks([])

plt.show()



ovo_lr=OneVsOneClassifier(LogisticRegression(solver="lbfgs",max_iter=200))#创建一个一对一的多分类器
ovo_lr.fit(X_train,y_train) #开始分类

print(len(ovo_lr.estimators_))#查看分类器的数量
print("train_OVO LR",metrics.accuracy_score(y_train,ovo_lr.predict(X_train)))
print("test_ovo LR",metrics.accuracy_score(y_test,ovo_lr.predict(X_test)))

查看结果:

F:\开发工具\pythonProject\tools\venv\Scripts\python.exe F:/开发工具/pythonProject/tools/python的sklear学习/sklearn01.py
The size of X_train is  (1437, 64)
The size of y_train is  (1437,)
The size of X_test is  (360, 64)
The size of y_test is  (360,)
45
train_OVO LR 1.0
test_ovo LR 0.9833333333333333

Process finished with exit code 0

OneVsRestClassifier:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn import metrics#metrics 来计算各种性能指标
from sklearn.linear_model import LogisticRegression

digits = load_digits()
X_train, X_test, y_train, y_test= train_test_split( digits['data'], digits['target'], test_size=0.2 )

ova_lr=OneVsRestClassifier(LogisticRegression(solver="lbfgs",max_iter=800))

ova_lr.fit(X_train,y_train)



print( len(ova_lr.estimators_) )	#查看分类器的数量
print("train_ova_lr",metrics.accuracy_score(y_train,ova_lr.predict(X_train)))
print("test_ova_lr",metrics.accuracy_score(y_test,ova_lr.predict(X_test)))

测试结果:

F:\开发工具\pythonProject\tools\venv\Scripts\python.exe F:/开发工具/pythonProject/tools/python的sklear学习/sklearn02.py
10
train_ova_lr 0.9972164231036882
test_ova_lr 0.9527777777777777

Process finished with exit code 0

多标签分类:

我们特意为每个数字设计了多标签:

  • 标签 1 - 奇数、偶数

  • 标签 2 - 小于等于 4,大于 4

 


from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn import metrics# metrics 来计算各种性能指标
from sklearn.linear_model import LogisticRegression
import numpy as np

digits = load_digits()
X_train, X_test, y_train, y_test= train_test_split( digits['data'], digits['target'], test_size=0.2 )
y_train_multilabel 	= np.c_[ y_train%2==0, y_train<=4 ]
ova_lr =OneVsRestClassifier(LogisticRegression(solver="lbfgs" ,max_iter=800))

ova_lr.fit(X_train ,y_train_multilabel)


print(y_train_multilabel)
print( len(ova_lr.estimators_) )  # 查看分类器的数量
print( y_test[:1] )
print( ova_lr.predict(X_test[:1,:]) )

 测试结果:

F:\开发工具\pythonProject\tools\venv\Scripts\python.exe F:/开发工具/pythonProject/tools/python的sklear学习/sklearn03.py
[[False False]
 [False False]
 [ True False]
 ...
 [ True False]
 [False  True]
 [False  True]]
2
[6]
[[1 0]]

Process finished with exit code 0

 

 

  • 1
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值