2021-04-22 xf_event_extraction-attribution: training part

twt10512

Tags: natural language processing, pytorch

Article link: https://blog.csdn.net/twt10512/article/details/116018968

Data preprocessing

dev_examples, dev_callback_info = processor.get_dev_examples(dev_raw_examples)

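Each example contains (set_type='dev', text=the text, trigger (the trigger characters and their position), label (tense, polarity))  # dev "明冠新材料...." ['现在', '肯定'] ('投资', 31), i.e. tense "present", polarity "affirmative", trigger "投资" ("invest") at offset 31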

Converting to features

dev_features = convert_examples_to_features(opt.task_type, dev_examples, opt.bert_dir,opt.max_seq_len, **feature_para)

tokenizer = BertTokenizer.from_pretrained(bert_dir)

encode_dict = tokenizer.encode_plus(text=tokens,                                                   # tokens: the pre-split token list
                                    max_length=max_seq_len,
                                    pad_to_max_length=True,
                                    is_pretokenized=True,
                                    return_token_type_ids=True,
                                    return_attention_mask=True)

token_ids = encode_dict['input_ids']
attention_masks = encode_dict['attention_mask']
token_type_ids = encode_dict['token_type_ids']

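    # Use a window of 20 on each side as the trigger's context: pooling_masks_range covers window_size positions before the trigger and up to window_size after it, clipped at the sequence boundaries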
    pooling_masks_range = range(max(1, trigger_loc[0] - window_size),
                                min(min(1 + len(raw_text), max_seq_len - 1), trigger_loc[1] + window_size))

pooling_masks: 1 over the trigger's context window, but 0 at the trigger positions themselves.

pooling_masks = tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0......])  (length 512)
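The mask printed above can be reproduced with a few lines of plain Python. This is a minimal sketch, not the project's code: the function name `build_pooling_mask` and the `text_len` parameter are hypothetical, and it assumes `trigger_loc` holds the inclusive (start, end) token positions of the trigger with index 0 reserved for [CLS].

```python
from typing import List, Tuple


def build_pooling_mask(trigger_loc: Tuple[int, int],
                       text_len: int,
                       max_seq_len: int = 512,
                       window_size: int = 20) -> List[float]:
    """Return a 0/1 mask over token positions (index 0 is assumed to be [CLS])."""
    # Same boundaries as pooling_masks_range in the post.
    start = max(1, trigger_loc[0] - window_size)
    end = min(min(1 + text_len, max_seq_len - 1), trigger_loc[1] + window_size)

    mask = [0.0] * max_seq_len
    for i in range(start, end):                            # context window -> 1
        mask[i] = 1.0
    for i in range(trigger_loc[0], trigger_loc[1] + 1):    # trigger itself -> 0 (assumed inclusive end)
        mask[i] = 0.0
    return mask


# Hypothetical input: a two-character trigger at token positions 32 and 33.
mask = build_pooling_mask(trigger_loc=(32, 33), text_len=100)
print(mask[10:14], mask[30:36], mask[51:55])
```

With these inputs the ones start at position 12, the two trigger positions are zeroed, and the ones stop after position 52, which matches the layout of the tensor shown above.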

tmp = seq_out + (1 - pooling_masks) * (-1e7), with shape 768*512, where the 512 axis indexes text positions. One of its 768 rows:
tensor([-1.0000e+07, -1.0000e+07, -1.0000e+07, -1.0000e+07, -1.0000e+07,
        -1.0000e+07, -1.0000e+07, -1.0000e+07, -1.0000e+07, -1.0000e+07,
        -1.0000e+07, -1.0000e+07,  1.1424e+00,  5.0685e-01,  2.6580e-01,
         6.0688e-01,  7.7622e-01,  1.3689e-01, -1.8831e-01, -1.1735e+00,
        -6.0387e-01, -2.7806e-01,  2.1010e-01,  6.7245e-01,  1.9266e-01,
        -4.4677e-01,  7.4722e-01,  9.4186e-01,  6.2497e-01,  3.8681e-01,
         9.3025e-01,  1.0526e+00, -1.0000e+07, -1.0000e+07,  1.1034e+00,
         3.6335e-01,  1.6081e+00,  1.5982e+00,  4.6920e-01,  2.0704e-01,
         9.3432e-01,  5.9635e-01,  2.5174e-01,  6.4896e-01,  7.2911e-01,
         1.2812e+00,  2.1220e-01,  8.9959e-01,  6.5138e-01,  9.1962e-01,
         4.0591e-01,  7.4915e-01,  5.0308e-01, -1.0000e+07, -1.0000e+07,
        -1.0000e+07, -1.0000e+07, -1.0000e+07, -1.0000e+07, -1.0000e+07,

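tmp.shape = 768*512. In each of the 768 rows, the context positions hold ordinary small values and every other position holds a very large negative number.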

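Max pooling is then applied, giving a 768*1 array: for each of the 768 dimensions, the maximum over the 512 positions is selected, which fuses the context into a single 768-dim vector.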
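The masked max pooling described above can be sketched on a tiny tensor. This is an illustration under assumptions: the helper name `masked_max_pool` is hypothetical, and it uses `Tensor.max` over the position axis as a stand-in for the model's pooling layer, whose exact definition is not shown in this post.

```python
import torch


def masked_max_pool(seq_out: torch.Tensor, pooling_masks: torch.Tensor) -> torch.Tensor:
    """seq_out: [batch, seq_len, hidden]; pooling_masks: [batch, seq_len], 1.0 = keep."""
    seq_out = seq_out.transpose(-1, -2)                     # [batch, hidden, seq_len]
    penalty = (1.0 - pooling_masks).unsqueeze(1) * -1e7     # [batch, 1, seq_len], broadcast over hidden
    return (seq_out + penalty).max(dim=-1).values           # [batch, hidden]


# Toy input: 1 sample, 4 positions, hidden size 2. Only positions 1 and 3 are context.
seq_out = torch.tensor([[[1.0, -2.0], [5.0, 0.5], [3.0, 9.0], [0.0, 4.0]]])
pooling_masks = torch.tensor([[0.0, 1.0, 0.0, 1.0]])
pooled = masked_max_pool(seq_out, pooling_masks)
print(pooled)
```

Position 2 holds the largest raw value in the second dimension (9.0), but it is masked out, so the pooled vector only reflects positions 1 and 3.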

feature = AttributionFeature(token_ids=token_ids,
                             attention_masks=attention_masks,
                             token_type_ids=token_type_ids,
                             trigger_loc=trigger_loc,
                             pooling_masks=pooling_masks,
                             labels=labels)  # labels holds the ids from {'polarity2id': {'肯定': 0, '可能': 1, '否定': 2}, 'tense2id': {'过去': 0, '将来': 1, '其他': 2, '现在': 3}}
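As a small illustration of the label mapping, the sketch below converts the example labels ['现在', '肯定'] into ids. The helper `encode_labels` is hypothetical, and the (tense, polarity) order simply mirrors the example earlier in this post; the order actually stored in the feature is an assumption.

```python
polarity2id = {'肯定': 0, '可能': 1, '否定': 2}
tense2id = {'过去': 0, '将来': 1, '其他': 2, '现在': 3}


def encode_labels(tense: str, polarity: str) -> list:
    # Assumed order: (tense id, polarity id).
    return [tense2id[tense], polarity2id[polarity]]


labels = encode_labels('现在', '肯定')
print(labels)  # [3, 0]
```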

Model output

        seq_out = bert_outputs[0]  # [64,512,768]

        trigger_label_feature = self._batch_gather(seq_out, trigger_index)  # [64, 2, 768]  gather the 768-dim hidden states at the trigger positions
        trigger_label_feature = trigger_label_feature.view([trigger_label_feature.size()[0], -1]) # [64, 1536] 

        seq_out = torch.transpose(seq_out, -1, -2)     # [64, 512, 768] => [64, 768, 512]

        pooling_masks = torch.unsqueeze(pooling_masks, 1)     #[64, 512] => [64, 1, 512]

        seq_out = seq_out + (1 - pooling_masks) * (-1e7)  # mask irrelevant positions: adds -1e7 wherever pooling_masks is 0 (outside the context window and at the trigger positions)

        # seq_out.shape=[64, 768, 512] => self.pooling_layer(seq_out).shape =[64, 768, 1]

        pooled_out = self.pooling_layer(seq_out).squeeze(-1)   # torch.Size([64, 768])

        logits = torch.cat([pooled_out, trigger_label_feature], dim=-1)  # [64, 768] concatenated with [64, 1536] => [64, 2304]

        polarity_logits = self.polarity_classifier(self.dropout_layer(logits)) # torch.Size([64, 3])

        tense_logits = self.tense_classifier(self.dropout_layer(logits))  #[64, 4]

        out = (torch.softmax(polarity_logits, dim=-1), torch.softmax(tense_logits, dim=-1),)

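The process involves three 768-dim vectors: the hidden state at the trigger start position, the hidden state at the trigger end position, and the vector obtained by max pooling tmp, i.e. the sequence output masked to the context with pooling_masks.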
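How the three vectors end up in one feature row can be sketched with small shapes. This is an assumption-laden illustration: `batch_gather` is a hypothetical stand-in for the model's `_batch_gather` (whose body is not shown in this post), and `pooled_out` is random data standing in for the max-pooled context vector.

```python
import torch


def batch_gather(data: torch.Tensor, index: torch.Tensor) -> torch.Tensor:
    """data: [batch, seq_len, hidden]; index: [batch, n] -> [batch, n, hidden]."""
    index = index.unsqueeze(-1).expand(-1, -1, data.size(-1))
    return torch.gather(data, 1, index)


batch, seq_len, hidden = 2, 6, 4
seq_out = torch.randn(batch, seq_len, hidden)
trigger_index = torch.tensor([[1, 2], [3, 3]])    # hypothetical (start, end) per sample
pooled_out = torch.randn(batch, hidden)           # stands in for the pooled context vector

trigger_feature = batch_gather(seq_out, trigger_index)         # [2, 2, 4]
trigger_feature = trigger_feature.reshape(batch, -1)           # [2, 8]
features = torch.cat([pooled_out, trigger_feature], dim=-1)    # [2, 12]
print(features.shape)
```

With hidden size 768 instead of 4, the same steps give the [64, 2304] tensor from the forward pass.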

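The three 768-dim vectors are concatenated into one 2304-dim vector, which is fed to two fully connected classifiers, one for tense and one for polarity. Softmax is used for prediction, and nn.CrossEntropyLoss() is used as the training loss.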
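The two classification heads and their loss can be sketched as follows. This is a minimal sketch under assumptions: the feature size, the label values and their (tense, polarity) order are hypothetical, and summing the two losses is one plausible way to combine them, not necessarily what the project does. Note that nn.CrossEntropyLoss applies log-softmax internally, so it is given the raw logits here, while softmax is only used to read out probabilities.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, feat_dim = 4, 12                      # feat_dim would be 2304 in the post
features = torch.randn(batch, feat_dim)

polarity_classifier = nn.Linear(feat_dim, 3)   # 3 polarity classes
tense_classifier = nn.Linear(feat_dim, 4)      # 4 tense classes

polarity_logits = polarity_classifier(features)   # [4, 3]
tense_logits = tense_classifier(features)         # [4, 4]

# Hypothetical labels, one (tense id, polarity id) pair per sample.
labels = torch.tensor([[3, 0], [0, 2], [1, 1], [2, 0]])

criterion = nn.CrossEntropyLoss()
loss = criterion(tense_logits, labels[:, 0]) + criterion(polarity_logits, labels[:, 1])
loss.backward()

# Probabilities for prediction time.
polarity_probs = torch.softmax(polarity_logits, dim=-1)
tense_probs = torch.softmax(tense_logits, dim=-1)
```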
