Getting Started with Hyperopt

Once a model is built, we still have to tune its parameters before we get a good fit; even a decent model will produce poor results if its parameters are set badly. The two most common tuning approaches are grid search and random search. Grid search sweeps the full parameter space, so it is slow; random search is fast, but it may miss important points in the space and lack precision. Hyperopt is a tool that tunes parameters via Bayesian optimization, which is comparatively fast and works well. Combined with MongoDB, Hyperopt can also tune parameters in a distributed fashion and find reasonably good values quickly. Besides TPE it supports brute-force and random search strategies; at the time of writing, the simulated-annealing strategy required installing the dev version.

(Bayesian optimization, also called sequential model-based optimization (SMBO), is one of the most efficient methods of function optimization. Compared with standard optimization strategies such as conjugate gradient descent, SMBO exploits smoothness without needing gradients, handles real-valued, discrete, and conditional variables, and can optimize large numbers of variables in parallel.)

Let's go!!!

1 Installation

pip install hyperopt

Installing hyperopt also pulls in networkx. If you hit the error TypeError: 'generator' object is not subscriptable when calling hyperopt, downgrade networkx to version 1.11:

pip uninstall networkx
pip install networkx==1.11

2 Key Concepts

2.1 fmin

from hyperopt import fmin, tpe, hp
best = fmin(
    fn=lambda x: x,
    space=hp.uniform('x', 0, 1),
    algo=tpe.suggest,
    max_evals=100)
print(best)

Output: {'x': 0.0006154621520631152}

fmin first takes a function to minimize, fn, which here we specify as lambda x: x. It can be any valid value-returning function, such as the mean absolute error in a regression.

The next argument specifies the search space; in this example it is the continuous range of numbers between 0 and 1, given by hp.uniform('x', 0, 1). hp.uniform is a built-in hyperopt function taking three arguments: the name, x, and the lower and upper bounds of the range, 0 and 1.

The algo argument specifies the search algorithm; here tpe stands for Tree of Parzen Estimators. The topic is beyond the scope of this post, but readers with a mathematical background can dig into the original paper. algo can also be set to hyperopt.rand.suggest for plain random search, which we don't cover here since it is a well-known strategy.
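Switching strategies is just a matter of passing a different suggest function. A minimal sketch (hyperopt.rand ships with hyperopt; hyperopt.anneal is the simulated-annealing variant mentioned in the introduction and may require a newer or dev install depending on your version):

from hyperopt import fmin, hp, rand, anneal

# same objective and space as above, but with random search
best_rand = fmin(fn=lambda x: x, space=hp.uniform('x', 0, 1),
                 algo=rand.suggest, max_evals=100)

# and with simulated annealing
best_anneal = fmin(fn=lambda x: x, space=hp.uniform('x', 0, 1),
                   algo=anneal.suggest, max_evals=100)
print(best_rand, best_anneal)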

Finally, we specify the maximum number of evaluations fmin will perform, max_evals. fmin returns a python dictionary.

If we raise max_evals to 1000, the output becomes {'x': 3.7023587264309516e-06}, even closer to 0.

For a better feel, here is a slightly more complex example.

best = fmin(
    fn=lambda x: (x-1)**2,
    space=hp.uniform('x', -2, 2),
    algo=tpe.suggest,
    max_evals=100)
print(best)

Output: {'x': 1.007633842139922}

2.2 space

Hyperopt provides several parameter expressions covering different ranges and sampling distributions:

  • hp.choice(label, options) - one of the listed options (fmin reports the chosen index)
  • hp.randint(label, upper) - a random integer in [0, upper)
  • hp.uniform(label, low, high) - uniformly distributed between low and high
  • hp.quniform(label, low, high, q) - like uniform, but rounded to a multiple of q
  • hp.loguniform(label, low, high) - a value whose logarithm is uniform between low and high
  • hp.normal(label, mu, sigma) - normally distributed with mean mu and standard deviation sigma

Here is an example:

from hyperopt import hp
import hyperopt.pyll.stochastic

space = {
    'x': hp.uniform('x', 0, 1),
    'y': hp.normal('y', 0, 1),
    'name': hp.choice('name', ['alice', 'bob']),
}
print(hyperopt.pyll.stochastic.sample(space))

Output: {'y': -1.3901709472842074, 'x': 0.4335747017293238, 'name': 'bob'}
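Sampling works the same way for the quantized and log-scale expressions. A small sketch (reusing the hp and hyperopt.pyll.stochastic imports from above; the parameter names are made up for illustration):

space2 = {
    'units': hp.quniform('units', 32, 256, 32),  # multiples of 32
    'lr': hp.loguniform('lr', -7, 0),            # between exp(-7) and exp(0) = 1
}
print(hyperopt.pyll.stochastic.sample(space2))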

2.3 Capturing information with Trials

Trials records, for every evaluation, which parameters were used and what was returned. To use it, fn returns a dict containing not only a loss but also a status. The Trials object stores its data as BSON documents, which is what lets hyperopt distribute the computation over MongoDB (a sketch of that setup closes this section).

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from matplotlib import pyplot as plt

fspace = {
    'x': hp.uniform('x', -5, 5)
}

def f(params):
    x = params['x']
    val = x**2
    return {'loss': val, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=f, space=fspace, algo=tpe.suggest, max_evals=50, trials=trials)

print('best:', best)
print('trials:')
for trial in trials.trials[:2]:
    print(trial)

Returns with status STATUS_OK have their loss counted; returns with status STATUS_FAIL are ignored.
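This is useful when some parameter combinations are invalid or an evaluation fails: instead of crashing the search, mark the trial as failed. A minimal sketch (STATUS_FAIL is part of hyperopt; the validity check is a made-up example):

from hyperopt import STATUS_OK, STATUS_FAIL

def f_safe(params):
    x = params['x']
    if x < 0:  # pretend negative values are invalid for our problem
        return {'status': STATUS_FAIL}
    return {'loss': x**2, 'status': STATUS_OK}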

Running the Trials example above produces:

best: {'x': -0.0025882455372094326}
trials:
{'refresh_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 152000), 'book_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 152000), 'misc': {'tid': 0, 'idxs': {'x': [0]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [-2.511797855178682]}, 'workdir': None}, 'state': 2, 'tid': 0, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 6.309128465280228}, 'owner': None, 'spec': None}
{'refresh_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 153000), 'book_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 153000), 'misc': {'tid': 1, 'idxs': {'x': [1]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [3.43836093884876]}, 'workdir': None}, 'state': 2, 'tid': 1, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 11.822325945800927}, 'owner': None, 'spec': None}

Using these records, you can plot a parameter against the loss to see how good the fit is, or plot tid against a parameter to watch where the search concentrates over time (convergence in a loose, non-mathematical sense).
A Trials object exposes several views (a short usage sketch follows the list):

  • trials.trials - a list of dictionaries representing everything about the search
  • trials.results - a list of dictionaries returned by ‘objective’ during the search
  • trials.losses() - a list of losses (float for each ‘ok’ trial)
  • trials.statuses() - a list of status strings
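For example, after the run above (best_trial is another built-in attribute of Trials):

print(trials.losses()[:5])          # first five losses
print(trials.statuses()[:5])        # their statuses
print(trials.best_trial['result'])  # full record of the best trial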
We can visualize the trials above: parameter value vs. time, and loss vs. parameter value.

f, ax = plt.subplots(1)
xs = [t['tid'] for t in trials.trials]
ys = [t['misc']['vals']['x'] for t in trials.trials]
ax.set_xlim(xs[0]-10, xs[-1]+10)
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.75)
ax.set_title('$x$ $vs$ $t$ ', fontsize=18)
ax.set_xlabel('$t$', fontsize=16)
ax.set_ylabel('$x$', fontsize=16)

f, ax = plt.subplots(1)
xs = [t['misc']['vals']['x'] for t in trials.trials]
ys = [t['result']['loss'] for t in trials.trials]
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.75)
ax.set_title('$val$ $vs$ $x$ ', fontsize=18)
ax.set_xlabel('$x$', fontsize=16)
ax.set_ylabel('$val$', fontsize=16)
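As mentioned above, because Trials serializes everything to BSON, the same search can be distributed across machines with MongoDB. A minimal sketch (it assumes a MongoDB server reachable at localhost:27017; foo_db and exp1 are placeholder names, worker processes must be started separately with the hyperopt-mongo-worker script that ships with hyperopt, and the objective must be importable by the workers, hence math.sin here):

import math
from hyperopt import fmin, tpe, hp
from hyperopt.mongoexp import MongoTrials

trials = MongoTrials('mongo://localhost:27017/foo_db/jobs', exp_key='exp1')
best = fmin(fn=math.sin, space=hp.uniform('x', -2, 2),
            algo=tpe.suggest, max_evals=50, trials=trials)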

3 Applying Hyperopt

3.1 K-nearest neighbors

One thing to note: we are trying to maximize cross-validation accuracy, but hyperopt only knows how to minimize a function, so we must negate the accuracy. Minimizing the negative of a function is equivalent to maximizing the function itself.

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation was removed in newer scikit-learn
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train(params):
    clf=KNeighborsClassifier(**params)
    return cross_val_score(clf,X,y).mean()
    
space_knn={'n_neighbors':hp.choice('n_neighbors',range(1,100))}

def f(params):
    acc=hyperopt_train(params)
    return {'loss':-acc,'status':STATUS_OK}

trials=Trials()
best=fmin(f,space_knn,algo=tpe.suggest,max_evals=100,trials=trials)
print('best',best)

Output: best {'n_neighbors': 4}

f, ax = plt.subplots(1)#, figsize=(10,10))
xs = [t['misc']['vals']['n_neighbors'] for t in trials.trials]
ys = [-t['result']['loss'] for t in trials.trials]
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.5)
ax.set_title('Iris Dataset - KNN', fontsize=18)
ax.set_xlabel('n_neighbors', fontsize=12)
ax.set_ylabel('cross validation accuracy', fontsize=12)

Accuracy drops off sharply once k exceeds 63. This is due to the class sizes in the dataset: each of the three classes has only 50 instances. So let's restrict 'n_neighbors' to smaller values and explore further.

'n_neighbors': hp.choice('n_neighbors', range(1,50))

Rerunning with this narrower range produces a cleaner plot.

Now we can clearly see that the best value of k is 4.

3.2 Support vector machines (SVM)

Since this is a classification task, we'll use sklearn's SVC class. The code is as follows.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt
from sklearn.svm import SVC 
import numpy as np

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    clf =SVC(**params)
    return cross_val_score(clf, X, y).mean()

space_svm = {
    'C': hp.uniform('C', 0, 20),
    'kernel': hp.choice('kernel', ['linear', 'sigmoid', 'poly', 'rbf']),
    'gamma': hp.uniform('gamma', 0, 20),
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_svm, algo=tpe.suggest, max_evals=100, trials=trials)
print('best:',best)

parameters = ['C', 'kernel', 'gamma']
cols = len(parameters)
f, axes = plt.subplots(nrows=1, ncols=cols, figsize=(20,5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i)/len(parameters)))
    axes[i].set_title(val)
    axes[i].set_ylim([0.9, 1.0])

Output: best: {'kernel': 3, 'C': 3.6332677642526985, 'gamma': 2.0192849151350796}. Note that for hp.choice parameters, fmin reports the index of the chosen option: kernel 3 here is the fourth entry in the list, i.e. 'rbf'.
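To map such indices back to actual values, hyperopt provides space_eval. Applied to the run above (a short sketch), it resolves the kernel index back to the string:

from hyperopt import space_eval
print(space_eval(space_svm, best))
# e.g. {'C': 3.633..., 'gamma': 2.019..., 'kernel': 'rbf'}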

3.3 Decision trees

Here we'll optimize just a few of the decision tree's parameters. The code is as follows.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt
from sklearn.tree import DecisionTreeClassifier
import numpy as np

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    clf = DecisionTreeClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space_dt = {
    'max_depth': hp.choice('max_depth', range(1,20)),
    'max_features': hp.choice('max_features', range(1,5)),
    'criterion': hp.choice('criterion', ["gini", "entropy"]),
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_dt, algo=tpe.suggest, max_evals=300, trials=trials)
print('best:',best)

parameters = ['max_depth', 'max_features', 'criterion'] # decision tree
cols = len(parameters)
f, axes = plt.subplots(nrows=1, ncols=cols, figsize=(20,5))
cmap = plt.cm.jet

for i, val in enumerate(parameters):
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i)/len(parameters)))
    axes[i].set_title(val)
    axes[i].set_ylim([0.9, 1.0])

Output: best: {'max_features': 1, 'criterion': 1, 'max_depth': 13} (again these are hp.choice indices; criterion 1, for instance, means 'entropy').

3.4 Random forest

Let's see what happens with an ensemble classifier, the random forest, which is just a collection of decision trees trained on different partitions of the data; each tree votes on the output class, and the majority class is taken as the prediction.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt
from sklearn.ensemble import RandomForestClassifier
import numpy as np

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    clf = RandomForestClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space4rf = {
    'max_depth': hp.choice('max_depth', range(1,20)),
    'max_features': hp.choice('max_features', range(1,5)),
    'n_estimators': hp.choice('n_estimators', range(1,20)),
    'criterion': hp.choice('criterion', ["gini", "entropy"]),
}

best = 0
def f(params):
    global best
    acc = hyperopt_train_test(params)
    if acc > best:
        best = acc
        print('new best:', best, params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space4rf, algo=tpe.suggest, max_evals=300, trials=trials)
print('best:',best)

parameters = ['n_estimators', 'max_depth', 'max_features', 'criterion']
f, axes = plt.subplots(nrows=1,ncols=4, figsize=(20,5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    print(i, val)
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    ys = np.array(ys)
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i)/len(parameters)))
    axes[i].set_title(val)

Output: best: {'max_features': 3, 'n_estimators': 11, 'criterion': 1, 'max_depth': 2}

4 Tuning multiple models

Here we search for the best model and its parameters across several model families at once.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC
from sklearn import preprocessing

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    t = params['type']
    del params['type']
    X_ = X
    # 'scale' is a preprocessing flag in the search space, not a model parameter
    if params.pop('scale', 0):
        X_ = preprocessing.scale(X)
    if t == 'naive_bayes':
        clf = BernoulliNB(**params)
    elif t == 'svm':
        clf = SVC(**params)
    elif t == 'dtree':
        clf = DecisionTreeClassifier(**params)
    elif t == 'randomforest':  # the search space below includes this type
        clf = RandomForestClassifier(**params)
    elif t == 'knn':
        clf = KNeighborsClassifier(**params)
    else:
        return 0
    return cross_val_score(clf, X_, y).mean()

space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',
        'alpha': hp.uniform('alpha', 0.0, 2.0)
    },
    {
        'type': 'svm',
        'C': hp.uniform('C', 0, 10.0),
        'kernel': hp.choice('kernel', ['linear', 'rbf']),
        'gamma': hp.uniform('gamma', 0, 20.0)
    },
    {
        'type': 'randomforest',
        'max_depth': hp.choice('max_depth', range(1,20)),
        'max_features': hp.choice('max_features', range(1,5)),
        'n_estimators': hp.choice('n_estimators', range(1,20)),
        'criterion': hp.choice('criterion', ["gini", "entropy"]),
        'scale': hp.choice('scale', [0, 1])
    },
    {
        'type': 'knn',
        'n_neighbors': hp.choice('knn_n_neighbors', range(1,50))
    }
])

count = 0
best = 0
def f(params):
    global best, count
    count += 1
    acc = hyperopt_train_test(params.copy())
    if acc > best:
        print('new best:', acc, 'using', params['type'])
        best = acc
    if count % 50 == 0:
        print('iters:', count, ', acc:', acc, 'using', params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space, algo=tpe.suggest, max_evals=1500, trials=trials)
print('best:',best)

Output: best: {'kernel': 0, 'C': 1.4211568317201784, 'classifier_type': 1, 'gamma': 8.74017707300719}. Decoding the hp.choice indices (classifier_type 1 is the svm entry, kernel 0 is 'linear'), the winner is a linear-kernel SVM.
