Code
GridSearchCV
sklearn.model_selection.GridSearchCV
class sklearn.model_selection.GridSearchCV(estimator, param_grid, scoring=None, n_jobs=None, iid='warn', refit=True, cv='warn', verbose=0, pre_dispatch='2*n_jobs', error_score='raise-deprecating', return_train_score=False)
Exhaustive search over specified parameter values for an estimator.
Important members are fit, predict.
GridSearchCV implements a “fit” and a “score” method. It also implements “predict”, “predict_proba”, “decision_function”, “transform” and “inverse_transform” if they are implemented in the estimator used.
The parameters of the estimator used to apply these methods are optimized by cross-validated grid-search over a parameter grid.
The constructor returns an unfitted search object (a model waiting to be trained). For example:
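# Finding the best parameters for our SVC model
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

svc = SVC()  # assumption: a plain SVC, in case svc was not created earlier
# Candidate SVM hyperparameter values to search over
param = {
    'C': [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4],
    'kernel': ['linear', 'rbf'],
    'gamma': [0.1, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4]  # has no effect on the linear kernel
}
# 10-fold cross-validation; this returns the (still unfitted) search model
grid_svc = GridSearchCV(svc, param_grid=param, scoring='accuracy', cv=10)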
Train the model:
from sklearn.metrics import classification_report, confusion_matrix

grid_svc.fit(X_train, y_train)
# Best parameters for our SVC model
print(grid_svc.best_params_)  # the best parameter combination found by the search
print(grid_svc.best_score_)  # mean cross-validated score of best_estimator_
# After finding the best parameter combination, retrain the model with it
svc2 = SVC(C=1.2, gamma=0.9, kernel='rbf')
svc2.fit(X_train, y_train)
pred_svc2 = svc2.predict(X_test)
# Build a text report showing the main classification metrics
print(classification_report(y_test, pred_svc2))
print(confusion_matrix(y_test, pred_svc2))
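Because `refit=True` is the default, GridSearchCV refits the best parameter combination on the whole training set once the search finishes, so the fitted search object can predict directly and retraining by hand is optional. The following is a minimal sketch of that behaviour, assuming synthetic data from `make_classification` and a deliberately small grid (`small_grid`, `search` and the other names are illustrative stand-ins, not the data or grid used above):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: the original data set is not shown here)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small illustrative grid so the search runs quickly
small_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
search = GridSearchCV(SVC(), param_grid=small_grid, scoring='accuracy', cv=5)
search.fit(X_train, y_train)

# With refit=True (the default), best_estimator_ is already trained on all of X_train
best_svc = search.best_estimator_
pred_direct = search.predict(X_test)    # delegates to best_estimator_
pred_manual = best_svc.predict(X_test)  # same model, called explicitly

print(search.best_params_)
print((pred_direct == pred_manual).all())
```

Retraining a separate estimator, as done above with `svc2`, should give an equivalent model as long as the same parameters and training data are used.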
Theory
Choosing a loss function for regression and classification
In general, regression => Square Error
, classification => Cross-Entropy
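To make the two losses concrete, here is a small NumPy sketch: squared error compares real-valued predictions with real-valued targets (the regression setting), while cross-entropy compares predicted class probabilities with one-hot labels (the classification setting). The helper names `squared_error` and `cross_entropy` and the toy numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def squared_error(y_true, y_pred):
    """Mean squared error between real-valued targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_onehot, p_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    y_onehot = np.asarray(y_onehot, dtype=float)
    # Clip to avoid log(0) when a predicted probability is exactly zero
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0)
    return -np.mean(np.sum(y_onehot * np.log(p), axis=1))

# Regression: two targets, two predictions, each off by 0.5
mse = squared_error([3.0, 5.0], [2.5, 5.5])

# Classification: two samples, two classes, fairly confident correct predictions
ce = cross_entropy([[1, 0], [0, 1]], [[0.9, 0.1], [0.2, 0.8]])

print(mse, ce)
```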
Choosing the regularization parameter $\alpha$
For a regularization term of the form $\alpha\|W\|_p^p$, GridSearchCV is commonly used to search for the best value of $\alpha$.
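As an illustration of such a search, the sketch below tunes `alpha` for scikit-learn's `Ridge`, whose penalty is the squared L2 norm (the case $p = 2$). The synthetic data from `make_regression`, the log-spaced candidate grid and the variable names are assumptions made for the example, not values from the original notes.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (assumption: stands in for a real data set)
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Candidate alpha values on a log scale: 1e-3, 1e-2, ..., 1e3
alpha_grid = {'alpha': np.logspace(-3, 3, 7)}

# scikit-learn maximizes scores, so MSE is used in its negated form
ridge_search = GridSearchCV(
    Ridge(), param_grid=alpha_grid, scoring='neg_mean_squared_error', cv=5
)
ridge_search.fit(X, y)

best_alpha = ridge_search.best_params_['alpha']
print(best_alpha, ridge_search.best_score_)
```

A log-spaced grid is a common starting point because useful values of `alpha` can span several orders of magnitude; the search can then be repeated on a finer grid around the best value.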
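Common evaluation metrics in machine learning
Evaluation metrics encountered for classification tasks include accuracy, FPR, FNR, recall, precision, F-score, MAP, the ROC curve and AUC; metrics for regression tasks include (R)MSE, MAE and CC/PCC.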
- $R^2$
- MSE / RMSE / MAE
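The regression metrics listed above can be computed with `sklearn.metrics`. The sketch below uses four made-up target and prediction values, and also recomputes $R^2$ by hand as $1 - SS_{res}/SS_{tot}$ to show what the library call measures; the variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy targets and predictions (made-up values for illustration)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mse = mean_squared_error(y_true, y_pred)   # mean of squared errors
rmse = np.sqrt(mse)                        # same units as the target
mae = mean_absolute_error(y_true, y_pred)  # mean of absolute errors
r2 = r2_score(y_true, y_pred)              # coefficient of determination

# R^2 by hand: 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(mse, rmse, mae, r2, r2_manual)
```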