Simplify the Usage of Lexicon in Chinese NER: pitfalls and fixes for running the paper's code

The versions listed on GitHub are Python 3.6 and PyTorch 0.4.1.

Pitfalls:

1. After setting up, running the code complained that transformers was missing. I ran pip install transformers directly, but the version it installed was too new to be compatible.

Fix: a quick search showed that transformers 3.4.0 is the version most people use with this code, so I installed 3.4.0.

2. After installing it, import transformers failed with ImportError: cannot import name '_softmax_backward_data'.

Fix: downgrade again, this time to transformers 2.1.1, which finally worked.
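
As a quick sanity check after pinning the version (a minimal sketch; it simply assumes the package was installed with pip install transformers==2.1.1):

import transformers
from transformers import BertTokenizer, BertModel  # the classes used later in functions.py and gazlstm.py

print(transformers.__version__)  # should print 2.1.1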

3. Running the program again produced an error:

Traceback (most recent call last):
  File "D:\program\envs\pytorch\lib\site-packages\urllib3\connection.py", line 175, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "D:\program\envs\pytorch\lib\site-packages\urllib3\util\connection.py", line 95, in create_connection
    raise err
  File "D:\program\envs\pytorch\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
    sock.connect(sa)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Fix: at first I assumed it was a connectivity problem and tried toggling the firewall and going through a VPN, with no luck. Then I stepped through the code in a debugger and found that it tries to reach a URL:

url:'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-vocab.txt'

It dawned on me that the BERT files had never been downloaded locally; I had to fetch them myself (I admit I'm a beginner) and put them under a local path:

path = 'C:\\Users\\yuyuan\\.pytorch_pretrained_bert\\bert-base-chinese'

Then edit functions.py, add the path above, and change the corresponding call:

#tokenizer = BertTokenizer.from_pretrained('bert-base-chinese', do_lower_case=True)
tokenizer = BertTokenizer.from_pretrained(path)
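
As a quick check that the local vocab actually loads without any network access (just a sketch; the example sentence is arbitrary):

print(tokenizer.tokenize('中文命名实体识别'))  # bert-base-chinese splits Chinese text character by character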

gazlstm.py needs the same change:

if self.use_bert:
    #self.bert_encoder = BertModel.from_pretrained('bert-base-chinese')
    self.bert_encoder = BertModel.from_pretrained(path)
    for p in self.bert_encoder.parameters():
        p.requires_grad = False  # keep the BERT encoder frozen during training

OK, finally fixed; BERT now loads successfully (although this fix is crude: with the path hardcoded, I would have to edit the code again on another machine).
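
A slightly more portable alternative would be to download and export the files once on a machine with network access, then point path at the exported directory. A minimal sketch (not from the repo; it assumes transformers 2.1.1, where both classes provide save_pretrained):

# one-time export so the training machine never needs to hit the network
import os
from transformers import BertTokenizer, BertModel

save_dir = 'C:\\Users\\yuyuan\\.pytorch_pretrained_bert\\bert-base-chinese'  # same path as above
os.makedirs(save_dir, exist_ok=True)

tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')  # downloads the vocab
model = BertModel.from_pretrained('bert-base-chinese')          # downloads config + weights
tokenizer.save_pretrained(save_dir)
model.save_pretrained(save_dir)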

4. Continuing on, after the program printed build batched crf..., another error appeared (by this point I was close to losing it):

cublas runtime error : the GPU program failed to execute at C:/ProgramData/Miniconda3/conda-bld/pytorch_1533096106539/work/aten/src/THC/THCBlas.cu:249

Fix: a lot of searching revealed a GPU/CUDA mismatch. My machine has an RTX 3050 Ti, which only supports GPU computation with CUDA 11.0 or later, but I had installed CUDA 9.0 in order to stay on PyTorch 0.4.1.

The only way out was to upgrade CUDA, but the PyTorch release matching CUDA 11.0 is 1.7.1, and a newer PyTorch was bound to cause some problems with this paper's code.

No way around it: reinstall CUDA 11.0, the matching cuDNN, and the matching PyTorch.
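
After the reinstall, a quick check that the GPU is visible (a minimal sketch, assuming PyTorch 1.7.1 built for CUDA 11.0):

import torch

print(torch.__version__)               # expect 1.7.1
print(torch.version.cuda)              # expect 11.0
print(torch.cuda.is_available())       # expect True
print(torch.cuda.get_device_name(0))   # expect the RTX 3050 Ti to show up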

5. The code now runs and training starts, but the console is flooded with warnings and the training output gets buried:

D:\undergraduation\LexiconAugmentedNER-master\model\gazlstm.py:151: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead. (Triggered internally at  ..\aten\src\ATen\native\cuda\LegacyDefinitions.cpp:28.)
  gaz_embeds = gaz_embeds_d.data.masked_fill_(gaz_mask.data, 0)  #(b,l,4,g,ge)  ge:gaz_embed_dim
D:\undergraduation\LexiconAugmentedNER-master\model\crf.py:97: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at  ..\aten\src\ATen/native/IndexingUtils.h:25.)
  masked_cur_partition = cur_partition.masked_select(mask_idx)
D:\undergraduation\LexiconAugmentedNER-master\model\crf.py:102: UserWarning: masked_scatter_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead. (Triggered internally at  ..\aten\src\ATen\native\cuda\LegacyDefinitions.cpp:72.)
  partition.masked_scatter_(mask_idx, masked_cur_partition)
D:\undergraduation\LexiconAugmentedNER-master\model\crf.py:248: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at  ..\aten\src\ATen/native/IndexingUtils.h:25.)
  tg_energy = tg_energy.masked_select(mask.transpose(1,0))
[W ..\aten\src\ATen\native\cuda\LegacyDefinitions.cpp:72] Warning: masked_scatter_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead. (function masked_scatter__cuda)
[W ..\aten\src\ATen\native\cuda\LegacyDefinitions.cpp:28] Warning: masked_fill_ received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead. (function masked_fill__cuda)
[W IndexingUtils.h:25] Warning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (function expandTensors)

Everything after that is just the last three warnings repeating.

Fix: in the return statement of batchify_with_label, change mask to mask.bool():

 # print(bert_seq_tensor.type())
    return gazs, word_seq_tensor, biword_seq_tensor, word_seq_lengths, label_seq_tensor, layer_gaz_tensor, gaz_count_tensor,gaz_chars_tensor, gaz_mask_tensor, gazchar_mask_tensor, mask.bool(), bert_seq_tensor, bert_mask

That leaves one warning, reported from gazlstm.py.

Edit the get_tags function as well:

gaz_mask = gaz_mask_input.unsqueeze(-1).repeat(1,1,1,1,self.gaz_emb_dim)

# add this line
gaz_mask = gaz_mask.bool()

gaz_embeds = gaz_embeds_d.data.masked_fill_(gaz_mask.data, 0)  #(b,l,4,g,ge)  ge:gaz_embed_dim
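
For reference, a minimal standalone example (not from the repo) of why .bool() silences these warnings:

import torch

x = torch.randn(3)
mask_u8 = torch.tensor([1, 0, 1], dtype=torch.uint8)
x.masked_fill_(mask_u8, 0)         # uint8 mask: triggers the deprecation warning
x.masked_fill_(mask_u8.bool(), 0)  # bool mask: same result, no warning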

OK, no more warnings; training runs smoothly!
