Getting Started with Hyperopt

浅笑古今

Article link: https://blog.csdn.net/u012735708/article/details/84820101

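After building a model, we still have to tune its parameters to get the best fit. Even a reasonable model will produce poor results if its parameters are set badly. The common tuning methods are grid search and random search. Grid search scans the entire space, so it is slow. Random search is faster but can miss important points in the space, so it is less precise. Hyperopt is a tool that tunes parameters through Bayesian optimization; it is relatively fast and gives good results. Hyperopt can also be combined with MongoDB for distributed tuning, which helps find relatively good parameters quickly. At the time of writing, simulated-annealing tuning required installing the dev version; brute-force and random tuning strategies are also supported.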
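To make the trade-off between grid search and random search concrete, here is a minimal pure-Python sketch on a toy one-dimensional objective. The objective `toy_loss`, the bounds, and the budget of 11 evaluations are illustrative assumptions and are not part of Hyperopt.

```python
import random

def toy_loss(x):
    # Hypothetical objective with its minimum at x = 0.3
    return (x - 0.3) ** 2

def grid_search(fn, low, high, n_points):
    # Evaluate fn on an evenly spaced grid that covers the whole interval
    step = (high - low) / (n_points - 1)
    candidates = [low + i * step for i in range(n_points)]
    return min(candidates, key=fn)

def random_search(fn, low, high, n_points, seed=0):
    # Evaluate fn at uniformly random points; cheap, but it may miss good regions
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(n_points)]
    return min(candidates, key=fn)

grid_best = grid_search(toy_loss, 0.0, 1.0, 11)
random_best = random_search(toy_loss, 0.0, 1.0, 11)
print('grid best x:', grid_best)
print('random best x:', random_best)
```

In one dimension the grid is cheap, but its cost grows exponentially with the number of parameters, which is why smarter samplers become attractive.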

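(Bayesian optimization, also called sequential model-based optimization (SMBO), is one of the most effective function optimization methods. Compared with standard optimization strategies such as conjugate gradient descent, SMBO has these advantages: it exploits smoothness without computing gradients; it handles real-valued, discrete, and conditional variables; and it can optimize a large number of variables in parallel.)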

Let's go!

1 Installation

pip install hyperopt

Installing hyperopt also installs networkx. If a call fails with TypeError: 'generator' object is not subscriptable, you can switch networkx to version 1.11.

pip uninstall networkx
pip install networkx==1.11

2 Key concepts

2.1 fmin

from hyperopt import fmin, tpe, hp
best = fmin(
    fn=lambda x: x,
    space=hp.uniform('x', 0, 1),
    algo=tpe.suggest,
    max_evals=100)
print(best)

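Output: {'x': 0.0006154621520631152}

fmin first takes the function to minimize, named fn, which here is the lambda function lambda x: x. It can be any valid function that returns a value, for example the mean absolute error of a regression.

The next parameter specifies the search space. In this example it is the continuous range between 0 and 1, given by hp.uniform('x', 0, 1). hp.uniform is a built-in hyperopt function that takes three arguments: the name x, and the lower and upper bounds of the range, 0 and 1.

The algo parameter specifies the search algorithm; here tpe stands for tree of Parzen estimators. That topic is beyond the scope of this article, but readers with a mathematical background can consult the paper on it. algo can also be set to random search (hyperopt.rand.suggest), but we do not cover it here because it is a well-known search strategy.

Finally, we specify the maximum number of evaluations that fmin will perform, max_evals. fmin returns a Python dictionary.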

With max_evals=1000, the output is {'x': 3.7023587264309516e-06}, which is closer to 0.
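The effect of max_evals can be illustrated without Hyperopt at all. The sketch below is a hypothetical stand-in for fmin that uses plain uniform random search (not TPE); `random_fmin` and its arguments are assumptions made for illustration only. With a fixed seed, the 1000-evaluation run sees every point the 100-evaluation run saw, so its best value can only be equal or lower.

```python
import random

def random_fmin(fn, low, high, max_evals, seed=42):
    # Hypothetical stand-in for fmin: uniform random search, not TPE
    rng = random.Random(seed)
    best_x, best_loss = None, float('inf')
    for _ in range(max_evals):
        x = rng.uniform(low, high)
        loss = fn(x)
        if loss < best_loss:
            best_x, best_loss = x, loss
    # Mirror the shape of fmin's return value: a dict keyed by parameter name
    return {'x': best_x}

best_100 = random_fmin(lambda x: x, 0, 1, max_evals=100)
best_1000 = random_fmin(lambda x: x, 0, 1, max_evals=1000)
print(best_100, best_1000)
```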

For a better understanding, consider this slightly more complex example.

best = fmin(
    fn=lambda x: (x-1)**2,
    space=hp.uniform('x', -2, 2),
    algo=tpe.suggest,
    max_evals=100)
print(best)

Output: {'x': 1.007633842139922}

2.2 space

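Hyperopt offers several kinds of expressions for describing a variable's range and sampling distribution, such as the hp.uniform, hp.normal, and hp.choice used in the example below.

Here is an example: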

from hyperopt import hp
import hyperopt.pyll.stochastic

space = {
    'x': hp.uniform('x', 0, 1),
    'y': hp.normal('y', 0, 1),
    'name': hp.choice('name', ['alice', 'bob']),
}
print(hyperopt.pyll.stochastic.sample(space))

Output: {'y': -1.3901709472842074, 'x': 0.4335747017293238, 'name': 'bob'}
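What sampling from such a space means can be sketched with the standard library alone. This is a rough analogue, not Hyperopt's implementation: `sample_space` is a hypothetical helper, and random.uniform, random.gauss, and random.choice stand in for hp.uniform, hp.normal, and hp.choice.

```python
import random

def sample_space(rng):
    # Each entry mirrors one expression from the space defined above
    return {
        'x': rng.uniform(0, 1),                # stands in for hp.uniform('x', 0, 1)
        'y': rng.gauss(0, 1),                  # stands in for hp.normal('y', 0, 1)
        'name': rng.choice(['alice', 'bob']),  # stands in for hp.choice('name', [...])
    }

rng = random.Random(0)
sample = sample_space(rng)
print(sample)
```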

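2.3 Capturing information with Trials

Trials records, for each evaluation, exactly which parameters were used and what was returned. For this, fn must return a dict that contains a status in addition to the loss. A Trials object stores its data as BSON objects, which makes it possible to use MongoDB for distributed runs.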

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from matplotlib import pyplot as plt

fspace = {
    'x': hp.uniform('x', -5, 5)
}

def f(params):
    x = params['x']
    val = x**2
    return {'loss': val, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=f, space=fspace, algo=tpe.suggest, max_evals=50, trials=trials)

print('best:', best)
print('trials:')
for trial in trials.trials[:2]:
    print(trial)

For a result returned with STATUS_OK, the loss is taken into account; a result returned with STATUS_FAIL is ignored.
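A common use of the status field is to mark an evaluation as failed instead of letting an exception abort the search. The sketch below uses plain strings in place of Hyperopt's STATUS_OK and STATUS_FAIL constants so that it runs without the library; the objective `safe_objective` and its failure condition are hypothetical.

```python
STATUS_OK = 'ok'      # plain-string stand-in for hyperopt.STATUS_OK
STATUS_FAIL = 'fail'  # plain-string stand-in for hyperopt.STATUS_FAIL

def safe_objective(params):
    x = params['x']
    try:
        # Hypothetical loss that is undefined at x == 0
        val = 1.0 / x
    except ZeroDivisionError:
        # Report the failure; no loss is attached to a failed evaluation
        return {'status': STATUS_FAIL}
    return {'loss': abs(val), 'status': STATUS_OK}

results = [safe_objective({'x': x}) for x in (2.0, 0.0, -4.0)]
ok_losses = [r['loss'] for r in results if r['status'] == STATUS_OK]
print(ok_losses)
```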

The output is as follows:

best: {'x': -0.0025882455372094326}
trials:
{'refresh_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 152000), 'book_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 152000), 'misc': {'tid': 0, 'idxs': {'x': [0]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [-2.511797855178682]}, 'workdir': None}, 'state': 2, 'tid': 0, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 6.309128465280228}, 'owner': None, 'spec': None}
{'refresh_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 153000), 'book_time': datetime.datetime(2018, 12, 5, 3, 5, 43, 153000), 'misc': {'tid': 1, 'idxs': {'x': [1]}, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'vals': {'x': [3.43836093884876]}, 'workdir': None}, 'state': 2, 'tid': 1, 'exp_key': None, 'version': 0, 'result': {'status': 'ok', 'loss': 11.822325945800927}, 'owner': None, 'spec': None}

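These values can be used to plot a variable against the loss to see how the two relate, or to plot a variable against tid to see how the search locations converge (not convergence in the mathematical sense).
The Trials object exposes the following:

trials.trials - a list of dictionaries representing everything about the search
trials.results - a list of dictionaries returned by 'objective' during the search
trials.losses() - a list of losses (float for each 'ok' trial)
trials.statuses() - a list of status strings

We can visualize the trials above as value vs. time and as loss vs. value.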

f, ax = plt.subplots(1)
xs = [t['tid'] for t in trials.trials]
ys = [t['misc']['vals']['x'] for t in trials.trials]
ax.set_xlim(xs[0]-10, xs[-1]+10)
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.75)
ax.set_title('$x$ $vs$ $t$ ', fontsize=18)
ax.set_xlabel('$t$', fontsize=16)
ax.set_ylabel('$x$', fontsize=16)

f, ax = plt.subplots(1)
xs = [t['misc']['vals']['x'] for t in trials.trials]
ys = [t['result']['loss'] for t in trials.trials]
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.75)
ax.set_title('$val$ $vs$ $x$ ', fontsize=18)
ax.set_xlabel('$x$', fontsize=16)
ax.set_ylabel('$val$', fontsize=16)
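The same trial dictionaries can also be inspected without plotting. As a small sketch, the snippet below takes two trimmed-down records shaped like the output shown earlier (only the tid, misc.vals, and result fields are kept) and picks the one with the lowest loss. `trial_records` is hand-copied sample data, not a live Trials object.

```python
# Two records copied from the printed trials, reduced to the fields used here
trial_records = [
    {'tid': 0, 'misc': {'vals': {'x': [-2.511797855178682]}},
     'result': {'status': 'ok', 'loss': 6.309128465280228}},
    {'tid': 1, 'misc': {'vals': {'x': [3.43836093884876]}},
     'result': {'status': 'ok', 'loss': 11.822325945800927}},
]

# Keep successful trials only, then take the one with the smallest loss
ok_trials = [t for t in trial_records if t['result']['status'] == 'ok']
best_trial = min(ok_trials, key=lambda t: t['result']['loss'])

# Each parameter value is stored as a one-element list under misc.vals
best_x = best_trial['misc']['vals']['x'][0]
print('best tid:', best_trial['tid'], 'x:', best_x)
```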

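3 Applying Hyperopt

3.1 K-nearest neighbors

Note that we are trying to maximize cross-validation accuracy, while hyperopt only knows how to minimize a function, so we must negate the accuracy. Minimizing a function f is equivalent to maximizing the negative of f.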
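The sign flip can be checked on a few made-up numbers. In this sketch the accuracies are invented for illustration; the point is only that the k with the smallest negated accuracy is the k with the largest accuracy.

```python
# Hypothetical cross-validation accuracies for a few values of k
accuracy_by_k = {1: 0.95, 5: 0.98, 20: 0.96, 80: 0.70}

# Maximizing accuracy directly
best_by_max = max(accuracy_by_k, key=lambda k: accuracy_by_k[k])

# Minimizing the negated accuracy, which is what the loss passed to fmin does
best_by_min_loss = min(accuracy_by_k, key=lambda k: -accuracy_by_k[k])

print(best_by_max, best_by_min_loss)
```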

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train(params):
    clf=KNeighborsClassifier(**params)
    return cross_val_score(clf,X,y).mean()
    
space_knn={'n_neighbors':hp.choice('n_neighbors',range(1,100))}

def f(params):
    acc=hyperopt_train(params)
    return {'loss':-acc,'status':STATUS_OK}

trials=Trials()
best=fmin(f,space_knn,algo=tpe.suggest,max_evals=100,trials=trials)
print('best',best)

Output: best {'n_neighbors': 4}
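One detail is easy to miss when reading this result: for a parameter defined with hp.choice, the dictionary returned by fmin holds the index of the chosen option rather than the option itself. The sketch below decodes the result above by hand. Hyperopt also provides hyperopt.space_eval(space, best) for this purpose; it is not called here so that the snippet runs without the library.

```python
# Options that were passed to hp.choice('n_neighbors', range(1, 100))
n_neighbors_options = list(range(1, 100))

# Result reported by fmin in the run above
best = {'n_neighbors': 4}

# The reported value is a position in the option list, so look it up
best_k = n_neighbors_options[best['n_neighbors']]
print('index', best['n_neighbors'], '-> n_neighbors =', best_k)
```

The same applies to the x-axis of any plot built from t['misc']['vals'], since those values are indices as well.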

f, ax = plt.subplots(1)#, figsize=(10,10))
xs = [t['misc']['vals']['n_neighbors'] for t in trials.trials]
ys = [-t['result']['loss'] for t in trials.trials]
ax.scatter(xs, ys, s=20, linewidth=0.01, alpha=0.5)
ax.set_title('Iris Dataset - KNN', fontsize=18)
ax.set_xlabel('n_neighbors', fontsize=12)
ax.set_ylabel('cross validation accuracy', fontsize=12)

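Once k exceeds 63, accuracy drops sharply. This is due to the number of instances per class in the dataset: each of the three classes has only 50 instances. So let's restrict 'n_neighbors' to smaller values and explore further.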

'n_neighbors': hp.choice('n_neighbors', range(1,50))

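Re-running with this space gives a new plot.

Now we can clearly see that the best reported value for n_neighbors is 4. Because the parameter is defined with hp.choice over range(1,50), this reported value is an index into the option list, which corresponds to k = 5.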

3.2 Support vector machine (SVM)

Since this is a classification task, we will use sklearn's SVC class. The code is as follows:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt
from sklearn.svm import SVC 
import numpy as np

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    clf =SVC(**params)
    return cross_val_score(clf, X, y).mean()

space_svm = {
    'C': hp.uniform('C', 0, 20),
    'kernel': hp.choice('kernel', ['linear', 'sigmoid', 'poly', 'rbf']),
    'gamma': hp.uniform('gamma', 0, 20),
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_svm, algo=tpe.suggest, max_evals=100, trials=trials)
print('best:',best)

parameters = ['C', 'kernel', 'gamma']
cols = len(parameters)
f, axes = plt.subplots(nrows=1, ncols=cols, figsize=(20,5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i)/len(parameters)))
    axes[i].set_title(val)
    axes[i].set_ylim([0.9, 1.0])

Output: best:{'kernel': 3, 'C': 3.6332677642526985, 'gamma': 2.0192849151350796}

Here 'kernel': 3 is an index into the hp.choice list, so it denotes 'rbf'.

3.3 Decision tree

We will try to optimize only some of the decision tree's parameters. The code is as follows.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt
from sklearn.tree import DecisionTreeClassifier
import numpy as np

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    clf = DecisionTreeClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space_dt = {
    'max_depth': hp.choice('max_depth', range(1,20)),
    'max_features': hp.choice('max_features', range(1,5)),
    'criterion': hp.choice('criterion', ["gini", "entropy"]),
}

def f(params):
    acc = hyperopt_train_test(params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space_dt, algo=tpe.suggest, max_evals=300, trials=trials)
print('best:',best)

parameters = ['max_depth', 'max_features', 'criterion'] # decision tree
cols = len(parameters)
f, axes = plt.subplots(nrows=1, ncols=cols, figsize=(20,5))
cmap = plt.cm.jet

for i, val in enumerate(parameters):
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i)/len(parameters)))
    axes[i].set_title(val)
    axes[i].set_ylim([0.9, 1.0])

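Output: best:{'max_features': 1, 'criterion': 1, 'max_depth': 13}

All three parameters are defined with hp.choice, so each reported value is an index into the corresponding option list.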

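3.4 Random forest

Let's see what happens with an ensemble classifier, the random forest. A random forest is simply a collection of decision trees trained on different partitions of the data; each tree votes on the output class, and the majority class is chosen as the prediction.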
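The voting step described above can be sketched in a few lines. The per-tree predictions below are invented, and `majority_vote` is a hypothetical helper. Note that scikit-learn's RandomForestClassifier combines trees by averaging their predicted class probabilities rather than counting hard votes, so this shows the idea, not the library's exact procedure.

```python
from collections import Counter

def majority_vote(tree_predictions):
    # Return the class predicted by the largest number of trees
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical predictions from five trees for a single iris sample
votes = ['setosa', 'versicolor', 'setosa', 'setosa', 'virginica']
predicted = majority_vote(votes)
print(predicted)
```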

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from matplotlib import pyplot as plt
from sklearn.ensemble import RandomForestClassifier
import numpy as np

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    clf = RandomForestClassifier(**params)
    return cross_val_score(clf, X, y).mean()

space4rf = {
    'max_depth': hp.choice('max_depth', range(1,20)),
    'max_features': hp.choice('max_features', range(1,5)),
    'n_estimators': hp.choice('n_estimators', range(1,20)),
    'criterion': hp.choice('criterion', ["gini", "entropy"]),
}

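# Running best accuracy, kept separate from the dict that fmin returns
best_acc = 0
def f(params):
    global best_acc
    acc = hyperopt_train_test(params)
    if acc > best_acc:
        best_acc = acc
        # Report only when the running best accuracy improves
        print('new best:', best_acc, params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space4rf, algo=tpe.suggest, max_evals=300, trials=trials)
print('best:',best)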

parameters = ['n_estimators', 'max_depth', 'max_features', 'criterion']
f, axes = plt.subplots(nrows=1,ncols=4, figsize=(20,5))
cmap = plt.cm.jet
for i, val in enumerate(parameters):
    print(i, val)
    xs = np.array([t['misc']['vals'][val] for t in trials.trials]).ravel()
    ys = [-t['result']['loss'] for t in trials.trials]
    ys = np.array(ys)
    axes[i].scatter(xs, ys, s=20, linewidth=0.01, alpha=0.25, c=cmap(float(i)/len(parameters)))
    axes[i].set_title(val)

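Output: best: {'max_features': 3, 'n_estimators': 11, 'criterion': 1, 'max_depth': 2}

All four parameters are defined with hp.choice, so each reported value is an index into the corresponding option list.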

4 Multi-model tuning

The goal is to find the best model and its parameters from among many models and many parameter settings.

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from hyperopt import hp,STATUS_OK,Trials,fmin,tpe
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC

iris=load_iris()
X=iris.data
y=iris.target

def hyperopt_train_test(params):
    # 'type' selects the model; the remaining keys are its keyword arguments
    t = params.pop('type')
    # 'scale' is a flag in the search space, not a classifier argument, so drop it
    params.pop('scale', None)
    if t == 'naive_bayes':
        clf = BernoulliNB(**params)
    elif t == 'svm':
        clf = SVC(**params)
    elif t == 'randomforest':
        # Matches the 'randomforest' branch of the search space below
        clf = RandomForestClassifier(**params)
    elif t == 'knn':
        clf = KNeighborsClassifier(**params)
    else:
        return 0
    return cross_val_score(clf, X, y).mean()

space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',
        'alpha': hp.uniform('alpha', 0.0, 2.0)
    },
    {
        'type': 'svm',
        'C': hp.uniform('C', 0, 10.0),
        'kernel': hp.choice('kernel', ['linear', 'rbf']),
        'gamma': hp.uniform('gamma', 0, 20.0)
    },
    {
        'type': 'randomforest',
        'max_depth': hp.choice('max_depth', range(1,20)),
        'max_features': hp.choice('max_features', range(1,5)),
        'n_estimators': hp.choice('n_estimators', range(1,20)),
        'criterion': hp.choice('criterion', ["gini", "entropy"]),
        'scale': hp.choice('scale', [0, 1])
    },
    {
        'type': 'knn',
        'n_neighbors': hp.choice('knn_n_neighbors', range(1,50))
    }
])

count = 0
# Running best accuracy, kept separate from the dict that fmin returns
best_acc = 0
def f(params):
    global best_acc, count
    count += 1
    # Pass a copy because hyperopt_train_test removes keys from the dict
    acc = hyperopt_train_test(params.copy())
    if acc > best_acc:
        print('new best:', acc, 'using', params['type'])
        best_acc = acc
    if count % 50 == 0:
        print('iters:', count, ', acc:', acc, 'using', params)
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f, space, algo=tpe.suggest, max_evals=1500, trials=trials)
print('best:',best)

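Output: best:{'kernel': 0, 'C': 1.4211568317201784, 'classifier_type': 1, 'gamma': 8.74017707300719}

Both 'classifier_type' and 'kernel' are hp.choice indices: 'classifier_type': 1 is the 'svm' branch and 'kernel': 0 is 'linear'.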
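The nested hp.choice above defines a conditional search space: each trial first picks a model type and then samples only that model's parameters. A rough standard-library analogue is sketched below for two of the four branches. `sample_conditional` is a hypothetical helper and does not reproduce Hyperopt's sampling.

```python
import random

def sample_conditional(rng):
    # First choose the branch, then sample only that branch's parameters
    model_type = rng.choice(['svm', 'knn'])
    if model_type == 'svm':
        return {
            'type': 'svm',
            'C': rng.uniform(0, 10.0),
            'kernel': rng.choice(['linear', 'rbf']),
            'gamma': rng.uniform(0, 20.0),
        }
    return {'type': 'knn', 'n_neighbors': rng.choice(range(1, 50))}

rng = random.Random(1)
samples = [sample_conditional(rng) for _ in range(5)]
for s in samples:
    print(s)
```

This is why the reported best result contains only the keys of the winning branch plus the branch index.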
