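Choosing a suitable machine learning algorithm for a given dataset is a lengthy process, and even once an algorithm is chosen, tuning its hyperparameters until the model performs well takes considerable time and effort. Hyperopt-sklearn helps with both problems.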
Homepage: http://hyperopt.github.io/hyperopt-sklearn/
Installation:
git clone https://github.com/hyperopt/hyperopt-sklearn.git
cd hyperopt-sklearn
pip install -e .
Basic example:
from hpsklearn import HyperoptEstimator
# Load Data
# ...
# Create the estimator object
estim = HyperoptEstimator()
# Search the space of classifiers and preprocessing steps and their
# respective hyperparameters in sklearn to fit a model to the data
estim.fit(train_data, train_label)
# Make a prediction using the optimized model
prediction = estim.predict(unknown_data)
# Report the accuracy of the classifier on a given set of data
score = estim.score(test_data, test_label)
# Return instances of the classifier and preprocessing steps
model = estim.best_model()
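The basic example leaves data loading as a placeholder. A minimal sketch of one way to fill it in, assuming scikit-learn's bundled iris dataset as a stand-in for real data; the helper name load_example_data and the 80/20 split are illustrative choices, not part of hyperopt-sklearn:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


def load_example_data(test_size=0.2, seed=0):
    """Hypothetical helper: return (train_data, test_data, train_label, test_label)."""
    # iris has 150 samples with 4 features each and 3 classes
    features, labels = load_iris(return_X_y=True)
    # Stratify so each class keeps the same proportion in both splits
    return train_test_split(
        features, labels, test_size=test_size, random_state=seed, stratify=labels
    )


# Names match the variables used in the basic example
train_data, test_data, train_label, test_label = load_example_data()
```

With these variables defined, estim.fit(train_data, train_label) and estim.score(test_data, test_label) can be called as shown; unknown_data would be any array with the same four feature columns.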
For classification problems, HyperoptEstimator can be configured as follows:
from hyperopt import tpe
from hpsklearn import HyperoptEstimator, any_classifier
estim = HyperoptEstimator(classifier=any_classifier('clf'), algo=tpe.suggest)
estim.fit(X_train, y_train)
Here any_classifier is a collection of commonly used classifiers. According to the source code:
def any_classifier(name):
    return hp.choice('%s' % name, [
        svc(name + '.svc'),
        knn(name + '.knn'),