Machine Learning: Finding the Optimal Hyperparameters

```python
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import classification_report
from loadFile import load_data

input_file = 'data_multivar_imbalance.txt'
X, y = load_data(input_file)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=33)
# Parameter grid to search with cross-validation
parameter_grid = [  {'kernel': ['linear'], 'C': [1, 10, 50, 600]},
                    {'kernel': ['poly'], 'degree': [2, 3]},
                    {'kernel': ['rbf'], 'gamma': [0.01, 0.001], 'C': [1, 10, 50, 600]},
                 ]
# Metrics to optimize for
metrics = ['precision', 'recall_weighted']
# Search for the optimal hyperparameters for each metric
for metric in metrics:
    print('Searching optimal hyperparameters for', metric)
    classifier = GridSearchCV(SVC(C=1), parameter_grid, cv=5, scoring=metric)
    classifier.fit(X_train, y_train)
    print("\nScores across the parameter grid:")
    # grid_scores_ was removed in scikit-learn 0.20; read cv_results_ instead
    for params, avg_score in zip(classifier.cv_results_['params'],
                                 classifier.cv_results_['mean_test_score']):
        print(params, '-->', round(avg_score, 3))

    print("\nHighest scoring parameter set:", classifier.best_params_)

    y_true, y_pred = y_test, classifier.predict(X_test)
    print("\nFull performance report:\n")
    print(classification_report(y_true, y_pred))
```
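Since `load_data` and the input file are not shown here, the same workflow can be reproduced end to end on synthetic data. The sketch below substitutes `make_classification` (an assumption, standing in for `data_multivar_imbalance.txt`) and uses a single scoring metric, but is otherwise the same pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Synthetic, imbalanced stand-in for the original data file
X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, weights=[0.8, 0.2], random_state=33)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=33)

parameter_grid = [
    {'kernel': ['linear'], 'C': [1, 10, 50, 600]},
    {'kernel': ['rbf'], 'gamma': [0.01, 0.001], 'C': [1, 10, 50, 600]},
]

classifier = GridSearchCV(SVC(), parameter_grid, cv=5, scoring='recall_weighted')
classifier.fit(X_train, y_train)

# Mean cross-validated score for every parameter combination
for params, avg_score in zip(classifier.cv_results_['params'],
                             classifier.cv_results_['mean_test_score']):
    print(params, '-->', round(avg_score, 3))

print("Best parameters:", classifier.best_params_)
print(classification_report(y_test, classifier.predict(X_test)))
```

After fitting, `classifier` behaves like the best estimator refit on the full training set, so `predict` can be called on it directly.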

Sure, we can use Python's sklearn library to run a grid search. The steps are as follows:

1. Import the required libraries

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
```

2. Prepare the data

```python
# X_train and y_train are the training data and labels
X_train, y_train = ...
# X_test and y_test are the test data and labels
X_test, y_test = ...
```

3. Define the GBDT classifier

```python
gbdt = GradientBoostingClassifier()
```

4. Define the hyperparameter ranges to search

```python
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.1, 0.05, 0.01]
}
```

5. Run the grid search

```python
grid_search = GridSearchCV(gbdt, param_grid, cv=3, n_jobs=-1)
grid_search.fit(X_train, y_train)
```

Here `cv` is the number of cross-validation folds and `n_jobs` is the number of parallel workers; `-1` means use all available CPUs.

6. Print the best hyperparameters and the accuracy on the test set

```python
print("Best parameters: ", grid_search.best_params_)
print("Test accuracy: {:.4f}".format(grid_search.score(X_test, y_test)))
```

The complete code:

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Prepare the data
X_train, y_train = ...
X_test, y_test = ...

# Define the GBDT classifier
gbdt = GradientBoostingClassifier()

# Define the hyperparameter ranges to search
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.1, 0.05, 0.01]
}

# Run the grid search
grid_search = GridSearchCV(gbdt, param_grid, cv=3, n_jobs=-1)
grid_search.fit(X_train, y_train)

# Print the best hyperparameters and the test-set accuracy
print("Best parameters: ", grid_search.best_params_)
print("Test accuracy: {:.4f}".format(grid_search.score(X_test, y_test)))
```
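The `...` placeholders above need real data before the snippet will run. A minimal runnable sketch, assuming synthetic data from `make_classification` (an assumption for illustration) and a reduced grid to keep the run fast:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the user's own dataset
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

gbdt = GradientBoostingClassifier(random_state=0)

# Smaller grid than above so the search finishes quickly
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [3, 5],
    'learning_rate': [0.1, 0.05]
}

grid_search = GridSearchCV(gbdt, param_grid, cv=3, n_jobs=-1)
grid_search.fit(X_train, y_train)

print("Best parameters: ", grid_search.best_params_)
print("Test accuracy: {:.4f}".format(grid_search.score(X_test, y_test)))
```

By default `GridSearchCV` scores with the estimator's own `score` method (accuracy for classifiers), so `grid_search.score(X_test, y_test)` reports test-set accuracy of the best refit model.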
