Training your own ELMo with the open-source code from Harbin Institute of Technology's SCIR lab

codebrid

Article link: https://blog.csdn.net/ccbrid/article/details/90545836

This post uses ELMoForManyLangs from the SCIR lab at Harbin Institute of Technology.

Link: https://github.com/HIT-SCIR/ELMoForManyLangs

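Usage:

1. git clone the repository to your local machine.

2. From the Downloads section (which offers models for many languages, including Simplified Chinese), download a pretrained language model. Each downloaded model comes with its own config.

3. Run the setup command:

python setup.py install

4. In the model's config (for example zhs.model/config.json), set config_path to the relative path of cnn_50_100_512_4096_sample.json.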
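Step 4 can also be scripted instead of edited by hand. The following is a minimal sketch, not part of ELMoForManyLangs: the helper name `point_config_path` is hypothetical, and it assumes that `config_path` is resolved relative to the model directory that holds `config.json`.

```python
import json
import os
from pathlib import Path


def point_config_path(model_dir, encoder_config):
    """Rewrite config_path in <model_dir>/config.json.

    The new value is the location of encoder_config expressed relative to
    model_dir (assumption: the library resolves it from the model directory).
    All other keys in config.json are left untouched.
    """
    model_dir = Path(model_dir)
    config_file = model_dir / "config.json"
    config = json.loads(config_file.read_text(encoding="utf-8"))
    config["config_path"] = os.path.relpath(encoder_config, start=model_dir)
    config_file.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config["config_path"]
```

For example, `point_config_path("zhs.model", "configs/cnn_50_100_512_4096_sample.json")` would store `../configs/cnn_50_100_512_4096_sample.json` when both paths sit side by side in the current directory.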

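How to fine-tune ELMo?

When you only use the embeddings that ELMo provides, line 168 of ELMoForManyLangs/elmo.py (in class Embedder) calls model.eval(), which puts the model in evaluation mode. In my own code I also wrap the actual ELMo embedding call in with torch.no_grad(), so that ELMo is not updated; this speeds things up and reduces GPU memory use as well.

By the same logic, to fine-tune ELMo, disable the model.eval() call on line 168 and do not wrap the ELMo call in with torch.no_grad().

(This approach needs more GPU memory.)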
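The difference between the two modes can be shown on a toy model. This is a minimal sketch in plain PyTorch: the small `nn.Sequential` encoder is a hypothetical stand-in for the ELMo model, not the real one loaded by Embedder, and `encoder_gets_gradients` is an illustrative name.

```python
import torch
import torch.nn as nn


def encoder_gets_gradients(fine_tune):
    """Run one forward/backward pass; report whether the encoder received gradients."""
    # Hypothetical stand-in for the ELMo encoder, plus a small task head.
    encoder = nn.Sequential(nn.Linear(8, 16), nn.Dropout(0.5))
    classifier = nn.Linear(16, 2)
    inputs = torch.randn(4, 8)
    labels = torch.tensor([0, 1, 0, 1])

    if fine_tune:
        encoder.train()                 # dropout active, graph is recorded
        features = encoder(inputs)
    else:
        encoder.eval()                  # dropout disabled
        with torch.no_grad():           # no graph is recorded for the encoder
            features = encoder(inputs)

    loss = nn.functional.cross_entropy(classifier(features), labels)
    loss.backward()
    return encoder[0].weight.grad is not None


frozen_has_grad = encoder_gets_gradients(fine_tune=False)
tuned_has_grad = encoder_gets_gradients(fine_tune=True)
print(frozen_has_grad, tuned_has_grad)
```

In the frozen case only the classifier is trained; in the fine-tuning case the encoder's activations must be kept for the backward pass, which is where the extra GPU memory goes.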

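How to train your own ELMo?

1. Requirements

python >= 3.6; pytorch 0.4; other requirements from allennlp

2. Prepare the input data and vocabulary

Data format (one sentence per line, tokens separated by spaces):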

Notable alumni
Aris Kalafatis ( Acting )
Labour Party
They build an open nest in a tree hole , or man - made nest - boxes .
Legacy
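Raw text can be brought into this shape with a small preprocessing script. The sketch below is an illustration only: the regex tokenizer and the helper names `to_training_line` and `write_corpus` are assumptions, not part of ELMoForManyLangs, and a real corpus (Chinese in particular, which needs word segmentation) would call for a proper tokenizer.

```python
import re


def to_training_line(text):
    """Split text into words and single punctuation marks, joined by spaces."""
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return " ".join(tokens)


def write_corpus(sentences, path):
    """Write one tokenized sentence per line, skipping empty ones."""
    with open(path, "w", encoding="utf-8") as out:
        for sentence in sentences:
            line = to_training_line(sentence)
            if line:
                out.write(line + "\n")


print(to_training_line("Aris Kalafatis (Acting)"))
```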

3. Change into the repository directory and run:

python -m elmoformanylangs.biLM train \
    --train_path data/en.raw \
    --config_path configs/cnn_50_100_512_4096_sample.json \
    --model output/en \
    --optimizer adam \
    --lr 0.001 \
    --lr_decay 0.8 \
    --max_epoch 10 \
    --max_sent_len 20 \
    --max_vocab_size 150000 \
    --min_count 3 --gpu 2

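--train_path: the training data, in the format shown above
--config_path: the model configuration file (here configs/cnn_50_100_512_4096_sample.json)
--model: where the trained model is saved
--max_sent_len: the maximum sentence length; for example, with a limit of 30, a 70-word sentence is split into 3 sentences
--max_vocab_size: I could not find where this option is used in the code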
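--min_count: the minimum word count, 3 here (<S>, </S>, <UNK>); this is most likely a frequency threshold, with rarer words left out of the vocabulary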
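The effect of --max_sent_len, and one plausible reading of --min_count, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the repository's actual code: both function names are hypothetical, the splitting is plain fixed-size chunking, and min_count is treated as a frequency threshold.

```python
from collections import Counter


def split_long_sentence(tokens, max_sent_len):
    """Cut a token list into consecutive pieces of at most max_sent_len tokens."""
    return [tokens[i:i + max_sent_len] for i in range(0, len(tokens), max_sent_len)]


def build_vocab(sentences, min_count):
    """Keep words seen at least min_count times (assumed meaning of --min_count)."""
    counts = Counter(token for sentence in sentences for token in sentence)
    return sorted(word for word, count in counts.items() if count >= min_count)


pieces = split_long_sentence(["w"] * 70, 30)
print([len(piece) for piece in pieces])  # [30, 30, 10]
```

A 70-token sentence with a limit of 30 yields three pieces, matching the example given for --max_sent_len.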

4. The original authors trained on 20 million words of data, and note:

However, we need to add that the training process is not very stable. In some cases, we end up with a loss of nan. We are actively working on that and hopefully improve it in the future.

The training of ELMo on one language takes roughly 3 days on an NVIDIA P100 GPU.
