Ensemble Learning Notes 06: Evaluating and Tuning Classification Models

Evaluation and hyperparameter tuning of classification models

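As with regression models, the hyperparameters of a classification model can be tuned with grid search.
Here we try two approaches to hyperparameter tuning: an exhaustive grid search and a randomized search.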

# Load the Iris dataset
import pandas as pd
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
feature = iris.feature_names
data = pd.DataFrame(X, columns=feature)
  1. Grid search (GridSearchCV)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
import time

start_time = time.time()
# Standardize the features, then fit an SVM; make_pipeline names the steps 'standardscaler' and 'svc'
pipe_svc = make_pipeline(StandardScaler(), SVC(random_state=1))
param_range = [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
# Two sub-grids: the linear kernel tunes only C, the RBF kernel tunes both C and gamma
param_grid = [{'svc__C': param_range, 'svc__kernel': ['linear']},
              {'svc__C': param_range, 'svc__gamma': param_range, 'svc__kernel': ['rbf']}]
gs = GridSearchCV(estimator=pipe_svc, param_grid=param_grid, scoring='accuracy', cv=10, n_jobs=-1)
gs = gs.fit(X, y)
end_time = time.time()
print("Grid search elapsed time: %.3f s" % (end_time - start_time))
print(gs.best_score_)
print(gs.best_params_)

Output:
Grid search elapsed time: 0.879 s
0.9800000000000001
{'svc__C': 1.0, 'svc__gamma': 0.1, 'svc__kernel': 'rbf'}
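
To get a feel for how much work the exhaustive search above does, the grid can be enumerated without fitting anything. The sketch below rebuilds the same `param_grid` and counts its candidates with `ParameterGrid`; the variable names `candidates`, `n_candidates`, and `n_fits` are introduced here for illustration only.

```python
from sklearn.model_selection import ParameterGrid

param_range = [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
param_grid = [
    {'svc__C': param_range, 'svc__kernel': ['linear']},
    {'svc__C': param_range, 'svc__gamma': param_range, 'svc__kernel': ['rbf']},
]

# ParameterGrid expands every sub-grid into explicit parameter dicts
candidates = list(ParameterGrid(param_grid))
n_candidates = len(candidates)   # 8 linear settings + 8 * 8 RBF settings
n_fits = n_candidates * 10       # each candidate is fitted once per fold (cv=10)
print(n_candidates, n_fits)      # 72 720
```

The count grows multiplicatively with every extra parameter in a sub-grid, which is the main reason to consider a randomized search on larger problems.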
  2. Randomized search (RandomizedSearchCV)
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
import time

start_time = time.time()
pipe_svc = make_pipeline(StandardScaler(), SVC(random_state=1))
param_range = [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
param_grid = [{'svc__C': param_range, 'svc__kernel': ['linear']},
              {'svc__C': param_range, 'svc__gamma': param_range, 'svc__kernel': ['rbf']}]
# param_grid = [{'svc__C':param_range,'svc__kernel':['linear','rbf'],'svc__gamma':param_range}]
# n_iter defaults to 10, so only 10 candidates are sampled; no random_state is set, so results vary between runs
gs = RandomizedSearchCV(estimator=pipe_svc, param_distributions=param_grid, scoring='accuracy', cv=10, n_jobs=-1)
gs = gs.fit(X, y)
end_time = time.time()
print("Randomized search elapsed time: %.3f s" % (end_time - start_time))
print(gs.best_score_)
print(gs.best_params_)

Output:
Randomized search elapsed time: 0.165 s
0.9733333333333334
{'svc__kernel': 'linear', 'svc__C': 1000.0}
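
The randomized search above still draws from fixed lists of values. `RandomizedSearchCV` also accepts continuous distributions, which lets it try values between the grid points. The following is a minimal sketch under assumptions that are not in the original run: it uses `scipy.stats.loguniform`, restricts the search to the RBF kernel, and picks `n_iter=20` and `random_state=1` arbitrarily so that the result is reproducible.

```python
from scipy.stats import loguniform
from sklearn import datasets
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
pipe_svc = make_pipeline(StandardScaler(), SVC(random_state=1))

# Assumption: sample C and gamma log-uniformly over the same range as param_range
param_distributions = {
    'svc__C': loguniform(1e-4, 1e3),
    'svc__gamma': loguniform(1e-4, 1e3),
    'svc__kernel': ['rbf'],
}

rs = RandomizedSearchCV(
    estimator=pipe_svc,
    param_distributions=param_distributions,
    n_iter=20,          # number of sampled candidates (the default is 10)
    scoring='accuracy',
    cv=10,
    random_state=1,     # fixes the sampled candidates across runs
)
rs.fit(X, y)
print(rs.best_score_)
print(rs.best_params_)
```

Raising `n_iter` trades run time for a better chance of landing near the best region, so it is the main knob to adjust when the randomized score falls short of the grid search score.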
Exercise: machine learning for face recognition

We use the LFW dataset as a face recognition example.
LFW stands for Labeled Faces in the Wild, a database of face images used for face recognition problems.

import time
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
# Fetch the LFW face dataset (downloaded on first use); keep only people with at least 70 images
lfw_people = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
# Split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(lfw_people.data, lfw_people.target, test_size=0.2, random_state=2)
# Each row is a flattened 50 x 37 image, hence 1850 features
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)

Output:
(1030, 1850) (1030,) (258, 1850) (258,)

Let's take a look at the face images in the dataset.

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can render CJK glyphs (only needed if labels contain Chinese text)
plt.rcParams['figure.figsize'] = (10.0, 8.0)  # set the figure size
fig, ax = plt.subplots(3, 4)
fig.subplots_adjust(left=0.1, top=1, wspace=0.6)
# Show the first 12 faces, each labeled with the person's name
for i, axi in enumerate(ax.flat):
    axi.imshow(lfw_people.images[i], cmap="bone")
    axi.set(xticks=[], yticks=[], xlabel=lfw_people.target_names[lfw_people.target[i]])

[Figure: 3 x 4 grid of LFW face images, each labeled with the person's name]
Tuning with grid search

start_time = time.time()
# Reduce the 1850 pixel features to 150 whitened principal components before the SVM
pca = PCA(n_components=150, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)
param_grid = {'svc__C': [1, 5, 10, 50], 'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
# cv is not set, so GridSearchCV falls back to its default (5-fold in scikit-learn 0.22 and later)
gs = GridSearchCV(estimator=model, param_grid=param_grid)
gs = gs.fit(x_train, y_train)
end_time = time.time()
print("Grid search elapsed time: %.3f s" % (end_time - start_time))
print(gs.best_score_)
print(gs.best_params_)

Output:
Grid search elapsed time: 30.850 s
0.8475728155339806
{'svc__C': 5, 'svc__gamma': 0.001}
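
The keys in `param_grid`, such as `svc__C`, follow the `<step name>__<parameter>` convention: `make_pipeline` names each step after its lowercased class name, and the double underscore routes the value to that step. A small sketch that rebuilds the same pipeline (without fitting it) and lists the names; `step_names` and `tunable` are illustrative variable names.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

model = make_pipeline(
    PCA(n_components=150, whiten=True, random_state=42),
    SVC(kernel='rbf', class_weight='balanced'),
)

# Step names are derived from the class names, lowercased
step_names = [name for name, _ in model.steps]
print(step_names)  # ['pca', 'svc']

# Every nested parameter of the SVC step is exposed under the 'svc__' prefix
tunable = sorted(k for k in model.get_params() if k.startswith('svc__'))
print('svc__C' in tunable, 'svc__gamma' in tunable)  # True True
```

The same mechanism would allow tuning the PCA step, for example through a `pca__n_components` key, although the search above keeps it fixed at 150.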

Pick the best model from the grid search and display some of its predictions

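best_model = gs.best_estimator_
y_pred = best_model.predict(x_test)
# Show the first 20 test faces with the predicted and true surnames
# figsize is passed to subplots; a separate plt.figure() call would only create an extra empty figure
fig, ax = plt.subplots(4, 5, figsize=(20, 20))
fig.subplots_adjust(wspace=0.3, hspace=0.3)
for i, axi in enumerate(ax.flat):
    axi.imshow(x_test[i].reshape(50, 37), cmap='bone')
    axi.set(xticks=[], yticks=[])
    # The label is black for a correct prediction and red for a wrong one
    axi.set_xlabel("Pred: " + lfw_people.target_names[y_pred[i]].split()[-1] + "\nTrue: " + lfw_people.target_names[y_test[i]].split()[-1],
                  color='black' if y_pred[i] == y_test[i] else 'red')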

[Figure: 4 x 5 grid of test faces with predicted and true surnames; misclassified faces are labeled in red]
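
A picture of 20 faces gives only a partial view of model quality, so a numeric summary on the whole test set is a useful complement. The sketch below is not part of the original notebook: to stay runnable without downloading LFW it uses the Iris data and an SVC with the parameters found by the first grid search, and it adds `stratify` to the split as an assumption so that every class appears in the test set.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=2, stratify=iris.target)

clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma=0.1))
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)   # fraction of correct predictions
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
print(acc)
print(cm)
# Per-class precision, recall and F1
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```

For the face model, the same three calls could be applied to the `y_test` and `y_pred` arrays from the LFW code, with `target_names=lfw_people.target_names`.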
