Hyperparameter Optimization for AllenNLP Using Optuna

Introduction to Optuna integration for AllenNLP.

TL;DR

  • An AllenNLP integration, AllenNLPExecutor, has been added to Optuna, enabling users to reuse an AllenNLP Jsonnet configuration file to optimize hyperparameters.

  • We will continue to enhance the AllenNLP integration (e.g., support for the Optuna Pruning API). This will allow users to train models with sophisticated algorithms such as Hyperband.

  • The sample I made for this article's demonstration is available on GitHub. Please check it out!

Introduction

In this article, I introduce how to use Optuna, a hyperparameter optimization library, to estimate hyperparameters of a model implemented with AllenNLP, a neural network library for natural language processing.

Optuna

Optuna is a library for hyperparameter optimization, providing flexibility in optimizing hyperparameters in machine learning. Optuna has many search algorithms for hyperparameters, including Tree-structured Parzen Estimator (TPE) [1], CMA Evolution Strategy (CMA-ES) [2], and Multi-objective optimization [3].

In Optuna, we define an objective function to perform hyperparameter optimization. A simple example is shown below.
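Here is a minimal sketch in the spirit of the official Optuna tutorial; the quadratic objective and the trial count are illustrative:

```python
import optuna


def objective(trial: optuna.Trial) -> float:
    # Sample x from the interval [-10, 10]; Optuna records the value per trial.
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2  # minimized at x = 2


study = optuna.create_study()  # the default direction is "minimize"
study.optimize(objective, n_trials=100)
print(study.best_params)  # should be close to {'x': 2}
```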

The search spaces of the parameters are defined using suggest APIs. Passing the objective function to study.optimize makes Optuna start optimizing the hyperparameters. For more information, please check the official tutorial.

AllenNLP

AllenNLP is a library for natural language processing using neural networks, developed by the Allen Institute for Artificial Intelligence. They have presented a paper and a tutorial at top NLP conferences, which are useful for NLP research. In addition, a variety of tutorials and demonstrations are available online, allowing beginners to experience cutting-edge NLP techniques.

There are two ways to implement a model using AllenNLP: (a) writing a Python script and executing it directly, and (b) preparing a configuration file written in Jsonnet and running it with the allennlp command-line interface. This article covers the latter. If you want to use Optuna in your Python scripts, please see the official sample on GitHub.

Using a Jsonnet configuration file, users can train models by writing only the configuration for their experiments. This eliminates the need to write a training script and lets users focus on their model architecture, hyperparameters, and training configuration.

One of the best-known hyperparameter optimization tools for AllenNLP is AllenTune. AllenTune supports optimization using a Jsonnet-style configuration file and provides simple random search and grid search algorithms. Users can define the search space, and thus enable optimization, by making a few line changes to an existing Jsonnet file.

Optuna + AllenNLP

For AllenNLP, hyperparameters are typically defined in a Jsonnet file, while Optuna defines the hyperparameters to be optimized in a Python script. To bridge this gap, we've created the AllenNLPExecutor, which allows Optuna search ranges for hyperparameters to be defined in the AllenNLP Jsonnet file.

The AllenNLPExecutor performs parameter optimization as follows.

  • Edit the configuration file in Jsonnet format and mask the hyperparameters with std.extVar (a minimal sketch of this idiom follows the list).

  • Sample parameters from the search space defined using Optuna's suggest API, and set up the Jsonnet file to create a Params object for AllenNLP.

  • Pass the Params object to allennlp.commands.train.train_model and execute model training.

For details of the implementation, please see the pull request.

Previously, it was necessary to create a module for each project to do the above. Now, with AllenNLPExecutor, you can optimize hyperparameters with less effort. The PR was merged and has been available since v1.4.0, which was released on May 11.

AllenNLPExecutor Demonstration

Task: IMDb

To demonstrate Optuna's AllenNLP integration, we tackle sentiment analysis of the IMDb review data [3]. The IMDb dataset contains 20,000 training examples and 5,000 test examples; each record contains a review of a movie or TV show and a label indicating whether the review is positive or negative. The task is to predict whether a review is positive or negative from the text of the review body.

Preparation

If you create a configuration file in Jsonnet format for AllenNLP, it looks like the following. The configuration file and parameters are based on the official AllenTune sample. The default value of each parameter is the median of the corresponding parameter space defined in that sample. We call this model the baseline.
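A minimal sketch of such a baseline config, loosely following the AllenTune IMDb sample, might look like this (the data paths, reader and encoder settings, and default values here are assumptions, not the exact experiment configuration):

```jsonnet
{
  "dataset_reader": {
    "type": "text_classification_json"
  },
  "train_data_path": "data/imdb/train.jsonl",
  "validation_data_path": "data/imdb/dev.jsonl",
  "model": {
    "type": "basic_classifier",
    "text_field_embedder": {
      "token_embedders": {
        "tokens": { "type": "embedding", "embedding_dim": 64 }
      }
    },
    "seq2vec_encoder": {
      "type": "cnn",
      "embedding_dim": 64,
      "num_filters": 64,
      "output_dim": 64
    },
    "dropout": 0.25
  },
  "data_loader": { "batch_size": 32, "shuffle": true },
  "trainer": {
    "num_epochs": 10,
    "optimizer": { "type": "adam", "lr": 0.05 },
    "validation_metric": "+accuracy"
  }
}
```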

First, mask the hyperparameter values in the Jsonnet config by calling std.extVar('{param_name}'), wrapped in std.parseInt for integers or std.parseJson for floating-point values. [edited 2020/07/28: please use std.parseInt or std.parseJson for casting parameters to the desired value types.]


The resulting config would be the following:
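Here is a sketch of the masked version of the baseline above; only the locals at the top and the fields that reference them change:

```jsonnet
local embedding_dim = std.parseInt(std.extVar('embedding_dim'));
local dropout = std.parseJson(std.extVar('dropout'));
local lr = std.parseJson(std.extVar('lr'));

{
  "dataset_reader": { "type": "text_classification_json" },
  "train_data_path": "data/imdb/train.jsonl",
  "validation_data_path": "data/imdb/dev.jsonl",
  "model": {
    "type": "basic_classifier",
    "text_field_embedder": {
      "token_embedders": {
        "tokens": { "type": "embedding", "embedding_dim": embedding_dim }
      }
    },
    "seq2vec_encoder": {
      "type": "cnn",
      "embedding_dim": embedding_dim,
      "num_filters": 64,
      "output_dim": 64
    },
    "dropout": dropout
  },
  "data_loader": { "batch_size": 32, "shuffle": true },
  "trainer": {
    "num_epochs": 10,
    "optimizer": { "type": "adam", "lr": lr },
    "validation_metric": "+accuracy"
  }
}
```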

Now that you have created the config, you can define the search space in Optuna. Note that the parameter names must be the same as those defined in the config above. The objective function is as follows.
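A sketch of such an objective is below. The parameter names match the std.extVar keys in the config, the ranges are illustrative, and the executor creation described in the next paragraph is included so the function is complete; the paths are placeholders:

```python
import optuna
from optuna.integration import AllenNLPExecutor


def objective(trial: optuna.Trial) -> float:
    # Search space: names must match the std.extVar keys in the Jsonnet config.
    trial.suggest_int("embedding_dim", 32, 128)
    trial.suggest_float("dropout", 0.0, 0.5)
    trial.suggest_float("lr", 1e-3, 1e-1, log=True)

    # AllenNLPExecutor(trial, config, snapshot, metric); see below.
    executor = AllenNLPExecutor(
        trial,
        "config/imdb_optuna.jsonnet",    # path to the masked config (placeholder)
        f"result/trial_{trial.number}",  # snapshot directory per trial (placeholder)
        "best_validation_accuracy",      # target metric reported by AllenNLP
    )
    return executor.run()
```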

Once we have defined the search space, we pass the trial object to AllenNLPExecutor. It's time to create the executor! AllenNLPExecutor takes a trial, a path to the config, a path to a snapshot directory, and a target metric to be optimized as input arguments (executor = AllenNLPExecutor(trial, config, snapshot, metric)). Then run executor.run to start optimization. In each trial, objective is called and does the following: (1) trains a model, (2) gets the target metric on the validation data, and (3) returns the target metric.

Finally, creating a study and calling study.optimize starts the parameter optimization.
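Continuing from the objective above (the trial count is an arbitrary choice for this sketch):

```python
import optuna

# Maximize, because the target metric is validation accuracy.
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=30)
print(study.best_params)
```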

Results

The results of the hyperparameter optimization are shown below. The evaluation metric is the percentage of correct answers on the validation data. Baseline is the model described in the preparation section; Optuna+AllenNLP is the result of optimization with AllenNLPExecutor. We ran the optimization five times with different seed values and calculated the average accuracy. Because the baseline is trained with fixed hyperparameters, its average accuracy remains constant across repeated trials. With Optuna, the average accuracy improves as the number of trials increases. In the end, the accuracy improved by about 2.7 points on average compared to using the original hyperparameters.

[Figure: Performance comparison between baseline and AllenNLP+Optuna]

Optuna also has a feature to dump a configuration file with the optimized hyperparameters. Call dump_best_config with the path to the config, the path for the output config, and the already-optimized study.
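A sketch of the call, with placeholder paths:

```python
from optuna.integration.allennlp import dump_best_config

# Replaces the std.extVar-masked values in the input config with the best
# values found in `study`, and writes the result to the output path.
dump_best_config("config/imdb_optuna.jsonnet", "best_imdb_optuna.json", study)
```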

An example of the output of dump_best_config looks like the following. You can see that the values of parameters such as dropout and embedding_dim that were masked with std.extVar have been rewritten with the actual values. The output file can also be passed directly to the allennlp command, allowing the user to retrain the model with the optimized parameters.
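A hypothetical excerpt of the dumped config; the concrete numbers are invented for illustration:

```jsonnet
// Values that were masked with std.extVar are now concrete literals.
{
  "model": {
    "dropout": 0.34,
    "text_field_embedder": {
      "token_embedders": {
        "tokens": { "type": "embedding", "embedding_dim": 93 }
      }
    }
  }
}
```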

Conclusion

In this article, I introduced how to combine AllenNLP, a neural network library for natural language processing, with Optuna to optimize hyperparameters, which requires only a few modifications to AllenNLP's Jsonnet file. As a demo, I worked on a polarity analysis of IMDb review data.

The sample I made for this demo is available on GitHub. If you want to run the sample, please try it!

Translated from: https://medium.com/pytorch/hyperparameter-optimization-for-allennlp-using-optuna-acb8d96737e5
