HyperOpt Parameter Optimization

Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. Please include a link to the original and this notice when reposting.
Original: https://blog.csdn.net/u012735708/article/details/84820101

Once we have built a model, we still have to tune its parameters to get the best fit; even a decent model cannot produce good results if its parameters are badly set. The common tuning approaches are grid search and random search. Grid search sweeps the entire space and is therefore slow; random search is fast but may miss important points in the space, so its precision suffers. Hyperopt is a tool that tunes parameters via Bayesian optimization, which is fast and gives good results. In addition, Hyperopt combined with MongoDB supports distributed tuning, so reasonably good parameters can be found quickly. Note that the dev version must be installed to use simulated-annealing tuning; brute-force and random tuning strategies are also supported.

(Bayesian optimization, also known as Sequential Model-Based Optimization (SMBO), is one of the most efficient methods of function optimization. Compared with standard optimization strategies such as conjugate gradient descent, SMBO exploits smoothness without requiring gradients, handles real-valued, discrete, and conditional variables, and can optimize a large number of variables in parallel.)
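To give a flavor of the distributed mode mentioned above, here is a minimal sketch using MongoTrials. It assumes a running MongoDB server; the host, database, and exp_key names are placeholders, and in a real setup the objective must live in a module the workers can import.

from hyperopt import fmin, tpe, hp, STATUS_OK
from hyperopt.mongoexp import MongoTrials

# In a real deployment this function must be importable by the workers
# (defining it inline in a script will not unpickle on their side);
# it is shown inline here only to keep the sketch short.
def objective(x):
    return {'loss': x ** 2, 'status': STATUS_OK}

# Placeholder connection string; workers are started separately with e.g.
#   hyperopt-mongo-worker --mongo=localhost:27017/foo_db
trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp1')
best = fmin(fn=objective, space=hp.uniform('x', -2, 2),
            algo=tpe.suggest, max_evals=100, trials=trials)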

Let's go!!!

1 Installation

pip install hyperopt
Installing hyperopt also installs networkx. If you get a TypeError: 'generator' object is not subscriptable error when calling hyperopt, downgrade networkx to version 1.11:

pip uninstall networkx
pip install networkx==1.11

2 Key Concepts

2.1 fmin

from hyperopt import fmin, tpe, hp

best = fmin(
    fn=lambda x: x,
    space=hp.uniform('x', 0, 1),
    algo=tpe.suggest,
    max_evals=100)
print(best)

The output is: {'x': 0.0006154621520631152}

The fmin function first takes a function to minimize, denoted fn, which we specify here with the function lambda x: x. It can be any valid value-returning function, such as the mean absolute error in a regression.
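For instance, here is a minimal sketch of such an objective: the cross-validated mean absolute error of a ridge regression. The diabetes dataset and the Ridge model are illustrative choices, not part of the original example.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

# An objective fmin could minimize directly: the cross-validated mean
# absolute error of a ridge regression with penalty alpha.
# cross_val_score negates the MAE, so flip the sign back.
def mae_objective(alpha):
    scores = cross_val_score(Ridge(alpha=alpha), X, y,
                             scoring='neg_mean_absolute_error')
    return -scores.mean()

fmin could then be called with fn=mae_objective and a space such as hp.loguniform('alpha', -5, 5).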

The next argument specifies the search space; in this example it is the continuous range of numbers between 0 and 1, specified by hp.uniform('x', 0, 1). hp.uniform is a built-in hyperopt function that takes three arguments: the name x, and the lower and upper bounds of the range, 0 and 1.

The algo argument specifies the search algorithm; here tpe stands for Tree of Parzen Estimators. That topic is beyond the scope of this article, but readers with a mathematical background can read up on it in the TPE paper. The algo argument can also be set to hyperopt.rand.suggest, which we do not cover here since it is the well-known random search strategy.

Finally, we specify the maximum number of evaluations, max_evals, that fmin will perform. fmin returns a Python dictionary.

If we raise max_evals to 1000, the output becomes {'x': 3.7023587264309516e-06}, which is even closer to 0.

For a better feel for how this works, consider the following slightly more complex example.

best = fmin(
    fn=lambda x: (x - 1) ** 2,
    space=hp.uniform('x', -2, 2),
    algo=tpe.suggest,
    max_evals=100)
print(best)

The output is: {'x': 1.007633842139922}, close to the true minimizer x = 1.

2.2 space

Hyperopt provides several stochastic expressions for describing a variable's range and sampling distribution. The most common ones are:

  • hp.choice(label, options) - returns one of the options, which should be a list or tuple
  • hp.randint(label, upper) - returns a random integer in the range [0, upper)
  • hp.uniform(label, low, high) - returns a value uniformly distributed between low and high
  • hp.quniform(label, low, high, q) - returns a value like round(uniform(low, high) / q) * q
  • hp.normal(label, mu, sigma) - returns a value normally distributed with mean mu and standard deviation sigma
  • hp.lognormal(label, mu, sigma) - returns a value whose logarithm is normally distributed

Here is an example:

from hyperopt import hp
import hyperopt.pyll.stochastic

space = {
    'x': hp.uniform('x', 0, 1),
    'y': hp.normal('y', 0, 1),
    'name': hp.choice('name', ['alice', 'bob']),
}
print(hyperopt.pyll.stochastic.sample(space))

The output is: {'y': -1.3901709472842074, 'x': 0.4335747017293238, 'name': 'bob'}
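Spaces can also be nested. With hp.choice, which sub-parameters exist can depend on the branch that is chosen; here is a small sketch with two hypothetical model families (the names and ranges are made up for illustration):

from hyperopt import hp
import hyperopt.pyll.stochastic

# Each sample is a dict from one branch or the other, carrying only
# the parameters that belong to that branch.
space = hp.choice('model', [
    {'name': 'svm', 'C': hp.lognormal('C', 0, 1)},
    {'name': 'knn', 'k': hp.quniform('k', 1, 50, 1)},
])
print(hyperopt.pyll.stochastic.sample(space))

Section 4 below uses exactly this pattern to tune several classifiers at once.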

2.3 Capturing information with Trials

A Trials object records, for each evaluation, exactly which parameters were used and what was returned. With Trials, fn returns a dict that contains a status field in addition to loss. The Trials object stores its data as BSON objects, which is what makes distributed computation with MongoDB possible.

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from matplotlib import pyplot as plt

fspace = {
    'x': hp.uniform('x', -5, 5)
}

def f(params):
    x = params['x']
    val = x ** 2
    return {'loss': val, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=f, space=fspace, algo=tpe.suggest, max_evals=50, trials=trials)

print('best:', best)
print('trials:')
for trial in trials.trials[:2]:
    print(trial)

Returns with status STATUS_OK have their loss counted in the statistics, while returns with STATUS_FAIL are ignored; below is a sketch of an objective that fails on part of its domain.
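This hypothetical objective (not from the original article) reports STATUS_FAIL whenever x is non-positive, so those trials are recorded but excluded from the loss statistics:

import math
from hyperopt import STATUS_OK, STATUS_FAIL

# Hypothetical objective: log(x)**2 is undefined for x <= 0,
# so mark those evaluations as failed instead of raising.
def f(params):
    x = params['x']
    if x <= 0:
        return {'status': STATUS_FAIL}
    return {'loss': math.log(x) ** 2, 'status': STATUS_OK}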

The output of the x**2 run above looks like this:

best: {'x': -0.0025882455372094326}
trials:
{'refresh_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 152000), 'book_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 152000), 'misc': {'tid': 0, 'idxs': {'x': [0]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [-2.511797855178682]}, 'workdir': None}, 'state': 2, 'tid': 0, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 6.309128465280228}, 'owner': None, 'spec': None}
{'refresh_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 153000), 'book_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 153000), 'misc': {'tid': 1, 'idxs': {'x': [1]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [3.43836093884876]}, 'workdir': None}, 'state': 2, 'tid': 1, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 11.822325945800927}, 'owner': None, 'spec': None}

From these values you can plot each variable against the loss to see how they relate, or plot tid against a variable to see where the search settles down (convergence in an informal, not mathematical, sense).
A Trials object exposes the following:

  • trials.trials - a list of dictionaries representing everything about the search
  • trials.results - a list of dictionaries returned by 'objective' during the search
  • trials.losses() - a list of losses (float for each 'ok' trial)
  • trials.statuses() - a list of status strings
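For instance, continuing from the trials object above, a quick look at these accessors (the values in the comments are illustrative):

print(trials.losses()[:3])    # first few losses, e.g. [6.309..., 11.822..., ...]
print(trials.statuses()[:3])  # e.g. ['ok', 'ok', 'ok']
print(min(trials.losses()))   # the smallest loss found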
We can visualize these trials in two ways: the sampled value vs. time, and the loss vs. the sampled value.
f, ax = plt.subplots(1)
xs = [t['tid'] for t in trials.trials]
ys = [t['misc']['vals']['x'] for t in trials.trials]
ax.set_xlim(xs[0] - 10, xs[-1] + 10)
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.75)
ax.set_title('$x$ $vs$ $t$', fontsize=18)
ax.set_xlabel('$t$', fontsize=16)
ax.set_ylabel('$x$', fontsize=16)

[Figure: scatter plot of the sampled x values against trial number t]

f, ax = plt.subplots(1)
xs = [t['misc']['vals']['x'] for t in trials.trials]
ys = [t['result']['loss'] for t in trials.trials]
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.75)
ax.set_title('$val$ $vs$ $x$', fontsize=18)
ax.set_xlabel('$x$', fontsize=16)
ax.set_ylabel('$val$', fontsize=16)

3 Applying Hyperopt

3.1 K-nearest neighbors

Note that because we are trying to maximize the cross-validation accuracy while hyperopt only knows how to minimize, we have to negate the accuracy: minimizing a function f is equivalent to maximizing -f.

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
from hyperopt import hp, STATUS_OK, Trials, fmin, tpe
from matplotlib import pyplot as plt

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train(params):
    clf = KNeighborsClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space_knn = {'n_neighbors': hp.choice('n_neighbors', range(1, 100))}

def f(params):
    acc = hyperopt_train(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_knn, algo=tpe.suggest, max_evals=100, trials=trials)
print('best', best)

The output is: best {'n_neighbors': 4}
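Keep in mind that for hp.choice parameters, fmin reports the index into the options list rather than the value itself; hyperopt's space_eval maps the result back to actual values. A small sketch continuing from the run above:

from hyperopt import space_eval

# best is {'n_neighbors': 4}, i.e. index 4 into range(1, 100)
print(space_eval(space_knn, best))  # -> {'n_neighbors': 5}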

f, ax = plt.subplots(1)  # figsize=(10, 10)
xs = [t['misc']['vals']['n_neighbors'] for t in trials.trials]
ys = [-t['result']['loss'] for t in trials.trials]
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.5)
ax.set_title('Iris Dataset - KNN', fontsize=18)
ax.set_xlabel('n_neighbors', fontsize=12)
ax.set_ylabel('cross validation accuracy', fontsize=12)

The accuracy drops sharply once k rises above 63. This is due to the number of instances of each class in the dataset: there are only 50 instances of each of the three classes. So let's restrict 'n_neighbors' to smaller values and explore further:

'n_neighbors': hp.choice('n_neighbors', range(1, 50))
Rerunning with the restricted space gives a much clearer picture: the best value of k is at 4.

3.2 Support vector machines (SVM)

Since this is a classification task, we will use sklearn's SVC class. The code is as follows.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp, STATUS_OK, Trials, fmin, tpe
from matplotlib import pyplot as plt
from sklearn.svm import SVC
import numpy as np

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    clf = SVC(**params)
    return cross_val_score(clf, X, y).mean()

space_svm = {
    'C': hp.uniform('C', 0, 20),
    'kernel': hp.choice('kernel', ['linear', 'sigmoid', 'poly', 'rbf']),
    'gamma': hp.uniform('gamma', 0, 20),
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_svm, algo=tpe.suggest, max_evals=100, trials=trials)
print('best:', best)

parameters = ['C', 'kernel', 'gamma']
cols = len(parameters)
f, axes = plt.subplots(nrows=1, ncols=cols, figsize=(20, 5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i) / len(parameters)))
    axes[i].set_title(val)
    axes[i].set_ylim([0.9, 1.0])

The output is: best: {'kernel': 3, 'C': 3.6332677642526985, 'gamma': 2.0192849151350796}. For the hp.choice parameter kernel, 3 is the index into the options list, i.e. 'rbf'.
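To go from this result to a usable model, one approach (a sketch, assuming space_svm, X, and y from the code above) is to decode the parameters with space_eval and refit:

from hyperopt import space_eval
from sklearn.svm import SVC

params = space_eval(space_svm, best)  # maps kernel index 3 back to 'rbf'
clf = SVC(**params).fit(X, y)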

3.3 Decision trees

We will try to optimize just a few of the parameters of a decision tree. The code is as follows.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp, STATUS_OK, Trials, fmin, tpe
from matplotlib import pyplot as plt
from sklearn.tree import DecisionTreeClassifier
import numpy as np

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    clf = DecisionTreeClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space_dt = {
    'max_depth': hp.choice('max_depth', range(1, 20)),
    'max_features': hp.choice('max_features', range(1, 5)),
    'criterion': hp.choice('criterion', ["gini", "entropy"]),
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_dt, algo=tpe.suggest, max_evals=300, trials=trials)
print('best:', best)

parameters = ['max_depth', 'max_features', 'criterion']
cols = len(parameters)
f, axes = plt.subplots(nrows=1, ncols=cols, figsize=(20, 5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i) / len(parameters)))
    axes[i].set_title(val)
    axes[i].set_ylim([0.9, 1.0])

The output is: best: {'max_features': 1, 'criterion': 1, 'max_depth': 13}. Decoded through the space, these indices correspond to max_features=2, criterion='entropy', and max_depth=14.

3.4 Random forests

Let's see what happens with an ensemble classifier, the random forest, which is just a collection of decision trees trained on different partitions of the data, each of which votes on the output class, with the majority class chosen as the prediction.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp, STATUS_OK, Trials, fmin, tpe
from matplotlib import pyplot as plt
from sklearn.ensemble import RandomForestClassifier
import numpy as np

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    clf = RandomForestClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space4rf = {
    'max_depth': hp.choice('max_depth', range(1, 20)),
    'max_features': hp.choice('max_features', range(1, 5)),
    'n_estimators': hp.choice('n_estimators', range(1, 20)),
    'criterion': hp.choice('criterion', ["gini", "entropy"]),
}

best = 0

def f(params):
    global best
    acc = hyperopt_train_test(params)
    if acc > best:
        best = acc
        print('new best:', best, params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space4rf, algo=tpe.suggest, max_evals=300, trials=trials)
print('best:', best)

parameters = ['n_estimators', 'max_depth', 'max_features', 'criterion']
f, axes = plt.subplots(nrows=1, ncols=4, figsize=(20, 5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    print(i, val)
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    ys = np.array(ys)
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i) / len(parameters)))
    axes[i].set_title(val)

The output is: best: {'max_features': 3, 'n_estimators': 11, 'criterion': 1, 'max_depth': 2}, i.e. max_features=4, n_estimators=12, criterion='entropy', and max_depth=3 once the indices are decoded.

4 Multi-model tuning

Here we search for the best model and its parameters from several model families at once. The space is a hp.choice over per-model subspaces, following the nested-space pattern from section 2.2; the 'scale' entry in the random-forest branch is treated as a feature-scaling flag rather than an estimator parameter.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import scale
from hyperopt import hp, STATUS_OK, Trials, fmin, tpe
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC

iris = load_iris()
X = iris.data
y = iris.target

def hyperopt_train_test(params):
    t = params['type']
    del params['type']
    X_ = X
    if t == 'naive_bayes':
        clf = BernoulliNB(**params)
    elif t == 'svm':
        clf = SVC(**params)
    elif t == 'dtree':
        clf = DecisionTreeClassifier(**params)
    elif t == 'knn':
        clf = KNeighborsClassifier(**params)
    elif t == 'randomforest':
        # 'scale' is a preprocessing flag, not an estimator parameter:
        # pop it and optionally standardize the features.
        if params.pop('scale', 0):
            X_ = scale(X_)
        clf = RandomForestClassifier(**params)
    else:
        return 0
    return cross_val_score(clf, X_, y).mean()

space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',
        'alpha': hp.uniform('alpha', 0.0, 2.0)
    },
    {
        'type': 'svm',
        'C': hp.uniform('C', 0, 10.0),
        'kernel': hp.choice('kernel', ['linear', 'rbf']),
        'gamma': hp.uniform('gamma', 0, 20.0)
    },
    {
        'type': 'randomforest',
        'max_depth': hp.choice('max_depth', range(1, 20)),
        'max_features': hp.choice('max_features', range(1, 5)),
        'n_estimators': hp.choice('n_estimators', range(1, 20)),
        'criterion': hp.choice('criterion', ["gini", "entropy"]),
        'scale': hp.choice('scale', [0, 1])
    },
    {
        'type': 'knn',
        'n_neighbors': hp.choice('knn_n_neighbors', range(1, 50))
    }
])

count = 0
best = 0

def f(params):
    global best, count
    count += 1
    acc = hyperopt_train_test(params.copy())
    if acc > best:
        print('new best:', acc, 'using', params['type'])
        best = acc
    if count % 50 == 0:
        print('iters:', count, ', acc:', acc, 'using', params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space, algo=tpe.suggest, max_evals=1500, trials=trials)
print('best:', best)

The output is: best: {'kernel': 0, 'C': 1.4211568317201784, 'classifier_type': 1, 'gamma': 8.74017707300719}. Here classifier_type index 1 selects the SVM branch of the space, and kernel index 0 is 'linear'.
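As before, space_eval can decode the winning configuration in one step (a sketch assuming space and best from the code above; the printed dict is illustrative):

from hyperopt import space_eval

print(space_eval(space, best))
# e.g. {'type': 'svm', 'C': 1.4211..., 'kernel': 'linear', 'gamma': 8.7401...}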
