Grid Search and Random Search

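This article introduces two common methods for tuning the hyperparameters of a machine-learning model: grid search (GridSearchCV) and random search (RandomizedSearchCV). Examples on the iris dataset show how to apply both to an SVM, covering parameter setup, model training, and result evaluation, and how to read off the best parameter combination found by cross-validation.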


Tools for hyperparameter tuning

Grid search

Import the required libraries

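from sklearn import svm
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
import pandas as pd

# Load the iris dataset (150 samples, 4 features, 3 classes)
iris = datasets.load_iris()
# Parameter grid: every combination of kernel and C is tried (2 x 2 = 4 candidates)
param_grid = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svc = svm.SVC(probability=True)
# cv is left unset, so the default 5-fold cross-validation is used
clf = GridSearchCV(svc, param_grid)
clf.fit(iris.data, iris.target)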
GridSearchCV(cv=None, error_score=nan,
             estimator=SVC(C=1.0, break_ties=False, cache_size=200,
                           class_weight=None, coef0=0.0,
                           decision_function_shape='ovr', degree=3,
                           gamma='scale', kernel='rbf', max_iter=-1,
                           probability=True, random_state=None, shrinking=True,
                           tol=0.001, verbose=False),
             iid='deprecated', n_jobs=None,
             param_grid={'C': [1, 10], 'kernel': ('linear', 'rbf')},
             pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
             scoring=None, verbose=0)
print(clf.cv_results_)
{'mean_fit_time': array([0.00120158, 0.00120668, 0.00139761, 0.00099792]), 'std_fit_time': array([3.76263297e-04, 3.77709886e-04, 1.01544355e-03, 2.90693166e-05]), 'mean_score_time': array([0.        , 0.00039043, 0.0001986 , 0.        ]), 'std_score_time': array([0.        , 0.00047848, 0.00039721, 0.        ]), 'param_C': masked_array(data=[1, 1, 10, 10],
             mask=[False, False, False, False],
       fill_value='?',
            dtype=object), 'param_kernel': masked_array(data=['linear', 'rbf', 'linear', 'rbf'],
             mask=[False, False, False, False],
       fill_value='?',
            dtype=object), 'params': [{'C': 1, 'kernel': 'linear'}, {'C': 1, 'kernel': 'rbf'}, {'C': 10, 'kernel': 'linear'}, {'C': 10, 'kernel': 'rbf'}], 'split0_test_score': array([0.96666667, 0.96666667, 1.        , 0.96666667]), 'split1_test_score': array([1.        , 0.96666667, 1.        , 1.        ]), 'split2_test_score': array([0.96666667, 0.96666667, 0.9       , 0.96666667]), 'split3_test_score': array([0.96666667, 0.93333333, 0.96666667, 0.96666667]), 'split4_test_score': array([1., 1., 1., 1.]), 'mean_test_score': array([0.98      , 0.96666667, 0.97333333, 0.98      ]), 'std_test_score': array([0.01632993, 0.02108185, 0.03887301, 0.01632993]), 'rank_test_score': array([1, 4, 3, 1])}
# Show the parameter settings and cross-validation scores of every fitted candidate
pd.DataFrame(clf.cv_results_)
| | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_C | param_kernel | params | split0_test_score | split1_test_score | split2_test_score | split3_test_score | split4_test_score | mean_test_score | std_test_score | rank_test_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.001202 | 0.000376 | 0.000000 | 0.000000 | 1 | linear | {'C': 1, 'kernel': 'linear'} | 0.966667 | 1.000000 | 0.966667 | 0.966667 | 1.0 | 0.980000 | 0.016330 | 1 |
| 1 | 0.001207 | 0.000378 | 0.000390 | 0.000478 | 1 | rbf | {'C': 1, 'kernel': 'rbf'} | 0.966667 | 0.966667 | 0.966667 | 0.933333 | 1.0 | 0.966667 | 0.021082 | 4 |
| 2 | 0.001398 | 0.001015 | 0.000199 | 0.000397 | 10 | linear | {'C': 10, 'kernel': 'linear'} | 1.000000 | 1.000000 | 0.900000 | 0.966667 | 1.0 | 0.973333 | 0.038873 | 3 |
| 3 | 0.000998 | 0.000029 | 0.000000 | 0.000000 | 10 | rbf | {'C': 10, 'kernel': 'rbf'} | 0.966667 | 1.000000 | 0.966667 | 0.966667 | 1.0 | 0.980000 | 0.016330 | 1 |
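To make explicit what the grid enumerates, here is a minimal sketch using scikit-learn's `ParameterGrid`, which expands the same dictionary into its individual candidates. The variable names `candidates`, `n_folds`, and `n_fits` are illustrative, and the fit count assumes the 5 folds seen in the `split0` to `split4` score columns.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the search above
param_grid = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}

# ParameterGrid yields one dict per combination of the listed values
candidates = list(ParameterGrid(param_grid))
for params in candidates:
    print(params)

# Each candidate is fitted once per fold (assumed: 5 folds)
n_folds = 5
n_fits = len(candidates) * n_folds
print(n_fits)
```

The cost of grid search grows multiplicatively with each added parameter, which is the main reason to consider random search for larger search spaces.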

Best model

clf.best_params_  # parameter combination with the best mean cross-validation score
{'C': 1, 'kernel': 'linear'}
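In the results above, two candidates share rank 1 with a mean score of 0.98, and `best_params_` reports the first of them. The sketch below, which refits its own search under the illustrative name `search`, shows how the winner can be traced back through `cv_results_` using the `best_index_`, `best_score_`, and `best_estimator_` attributes. Exact scores may vary with the scikit-learn version.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
search = GridSearchCV(svm.SVC(), {'kernel': ('linear', 'rbf'), 'C': [1, 10]})
search.fit(iris.data, iris.target)

results = search.cv_results_
# The first candidate with the lowest rank is the one reported as best
best_by_rank = int(np.argmin(results['rank_test_score']))
print(search.best_index_, best_by_rank)

# Mean cross-validated score and parameters of the winning candidate
print(search.best_score_)
print(results['params'][search.best_index_])

# With refit=True (the default), this estimator is retrained on all the data
print(search.best_estimator_)
```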

Decision scores for the samples

print(clf.decision_function(iris.data))  # per-class decision scores for each sample; clf.predict returns the class labels
[[ 2.24627744  1.2980152  -0.30616012]
 [ 2.23781119  1.29663601 -0.30453043]
 [ 2.24548583  1.2968967  -0.30542241]
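The output above contains one score per class for each sample rather than a class label. As a small sketch of the difference, the following refits a search (named `search` here for illustration) and compares the shapes returned by `decision_function` and `predict`. It assumes the default `decision_function_shape='ovr'` shown in the estimator printout.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
search = GridSearchCV(svm.SVC(), {'kernel': ('linear', 'rbf'), 'C': [1, 10]})
search.fit(iris.data, iris.target)

# Both calls are delegated to the refitted best estimator
scores = search.decision_function(iris.data)  # one column of scores per class
labels = search.predict(iris.data)            # one predicted label per sample

print(scores.shape)
print(labels.shape)
print(labels[:3])
```

Because the predictions here are made on the same data used for fitting, they indicate how the API behaves, not how well the model generalizes.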

Random search


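import scipy.stats as stats
from sklearn import datasets
from sklearn.model_selection import RandomizedSearchCV
import pandas as pd
from sklearn.svm import SVC

iris = datasets.load_iris()
# Lists are sampled uniformly; C and gamma are drawn from exponential distributions
param_distributions = {'kernel': ('linear', 'rbf'),
                       'C': stats.expon(scale=100),
                       'gamma': stats.expon(scale=.1),
                       'class_weight': ('balanced', None)}
svc = SVC()
# n_iter defaults to 10, so 10 random candidates are sampled and cross-validated.
# No random_state is set, so the sampled candidates differ from run to run.
clf = RandomizedSearchCV(svc, param_distributions)
clf.fit(iris.data, iris.target)
pd.DataFrame(clf.cv_results_)
clf.best_params_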
{'C': 333.0779879298101,
 'class_weight': None,
 'gamma': 0.004641512813065941,
 'kernel': 'rbf'}
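The sampled values above will change on every run unless the sampling is seeded. The sketch below shows one way to control the search budget and make the result repeatable, using the `n_iter` and `random_state` parameters of `RandomizedSearchCV`. The helper `run_search`, the seed value, and the choice of 20 iterations are illustrative assumptions, not part of the original example.

```python
import scipy.stats as stats
from sklearn import datasets
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

iris = datasets.load_iris()
param_distributions = {'kernel': ('linear', 'rbf'),
                       'C': stats.expon(scale=100),
                       'gamma': stats.expon(scale=.1),
                       'class_weight': ('balanced', None)}

def run_search(seed, n_iter=20):
    """Hypothetical helper: run a seeded random search over the SVC parameters."""
    search = RandomizedSearchCV(SVC(), param_distributions,
                                n_iter=n_iter, random_state=seed)
    search.fit(iris.data, iris.target)
    return search

first = run_search(seed=0)
second = run_search(seed=0)

# n_iter sets how many candidates are sampled and evaluated
print(len(first.cv_results_['params']))
# The same seed draws the same candidates, so the winner is the same
print(first.best_params_ == second.best_params_)
```

Unlike grid search, the budget here is fixed by `n_iter` regardless of how many parameters are searched, so adding a parameter does not multiply the number of fits.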