A well-tuned SVM can classify very well, but it also has quite a few parameters to tune, as described here:
https://blog.csdn.net/xiaodongxiexie/article/details/70667101
Others have explained the tuning process in more detail:
https://blog.csdn.net/baidu_15113429/article/details/72673466
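According to these write-ups, the main parameters to tune for an SVM are kernel, C, and gamma. In scikit-learn, GridSearchCV can search over them automatically, and this post describes how to use it in detail: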
https://blog.csdn.net/cherdw/article/details/54970366
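The official documentation is here: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
The blog posts and tutorials above are all a bit involved, though, so here is the simplest example I can give: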
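from sklearn import svm
from sklearn.model_selection import GridSearchCV
# X, y: your own training features and binary labels (scoring='f1' expects binary targets)
svc = svm.SVC()
# Candidate values; every combination is evaluated with cross-validation
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 2, 4], 'gamma': [0.125, 0.25, 0.5, 1, 2, 4]}
clf = GridSearchCV(svc, parameters, scoring='f1')
clf.fit(X, y)
print('The parameters of the best model are: ')
print(clf.best_params_)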
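
For readers who want something that runs as-is, here is a minimal sketch of the same search. The synthetic dataset from `make_classification` and the names `param_grid` and `search` are my own assumptions for illustration, not part of the example above. Since gamma has no effect on the linear kernel, the grid is split into two parts so that gamma is only searched for rbf:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic binary data standing in for your own X and y (an assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A list of grids: gamma is searched only for the rbf kernel,
# because the linear kernel does not use it.
param_grid = [
    {'kernel': ['linear'], 'C': [1, 2, 4]},
    {'kernel': ['rbf'], 'C': [1, 2, 4], 'gamma': [0.125, 0.25, 0.5, 1, 2, 4]},
]

search = GridSearchCV(SVC(), param_grid, scoring='f1', cv=5)
search.fit(X, y)

print(search.best_params_)  # best parameter combination found
print(search.best_score_)   # mean cross-validated f1 of that combination

# Number of parameter combinations that were evaluated.
n_candidates = len(search.cv_results_['params'])
print(n_candidates)
```

With a single dictionary, the linear kernel would be refit once per gamma value for each C even though the result cannot change. Splitting the grid avoids that repeated work.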
There is more about the scoring parameter here:
http://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter
I use f1 here. The other possible values are listed in this table:
| Scoring | Function | Comment |
|---|---|---|
| **Classification** | | |
| `accuracy` | metrics.accuracy_score | |
| `average_precision` | metrics.average_precision_score | |
| `f1` | metrics.f1_score | for binary targets |
| `f1_micro` | metrics.f1_score | micro-averaged |
| `f1_macro` | metrics.f1_score | macro-averaged |
| `f1_weighted` | metrics.f1_score | weighted average |
| `f1_samples` | metrics.f1_score | by multilabel sample |
| `neg_log_loss` | metrics.log_loss | requires predict_proba support |
| `precision` etc. | metrics.precision_score | suffixes apply as with `f1` |
| `recall` etc. | metrics.recall_score | suffixes apply as with `f1` |
| `roc_auc` | metrics.roc_auc_score | |
| **Clustering** | | |
| `adjusted_mutual_info_score` | metrics.adjusted_mutual_info_score | |
| `adjusted_rand_score` | metrics.adjusted_rand_score | |
| `completeness_score` | metrics.completeness_score | |
| `fowlkes_mallows_score` | metrics.fowlkes_mallows_score | |
| `homogeneity_score` | metrics.homogeneity_score | |
| `mutual_info_score` | metrics.mutual_info_score | |
| `normalized_mutual_info_score` | metrics.normalized_mutual_info_score | |
| `v_measure_score` | metrics.v_measure_score | |
| **Regression** | | |
| `explained_variance` | metrics.explained_variance_score | |
| `neg_mean_absolute_error` | metrics.mean_absolute_error | |
| `neg_mean_squared_error` | metrics.mean_squared_error | |
| `neg_mean_squared_log_error` | metrics.mean_squared_log_error | |
| `neg_median_absolute_error` | metrics.median_absolute_error | |
| `r2` | metrics.r2_score | |
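
To show how an entry from the table is used, here is a rough sketch for a multiclass problem, where plain f1 (binary targets only) does not apply. The iris dataset, the smaller grid, and the names `by_name`, `by_scorer`, and `macro_f1` are assumptions of mine. The sketch passes the metric once as the string `'f1_macro'` and once as a scorer built with `make_scorer` from the function in the Function column:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Iris has three classes, so a macro-averaged f1 is used instead of 'f1'.
X, y = load_iris(return_X_y=True)
param_grid = {'C': [1, 2, 4], 'gamma': [0.125, 0.25, 0.5]}

# Option 1: pass the scoring name from the table as a string.
by_name = GridSearchCV(SVC(kernel='rbf'), param_grid, scoring='f1_macro', cv=5)
by_name.fit(X, y)

# Option 2: build the scorer explicitly from metrics.f1_score.
macro_f1 = make_scorer(f1_score, average='macro')
by_scorer = GridSearchCV(SVC(kernel='rbf'), param_grid, scoring=macro_f1, cv=5)
by_scorer.fit(X, y)

print(by_name.best_params_, by_name.best_score_)
print(by_scorer.best_params_, by_scorer.best_score_)
```

The string form is shorter. The `make_scorer` form is useful when a metric needs extra arguments that no predefined name covers.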