FewRel Explained

Latest recommended article published on 2025-04-24 16:47:22

rainbow_lucky0106

Category: Deep Learning

Article link: https://blog.csdn.net/qq_21980099/article/details/88936939


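1. Dataset Analysis

glove.6B.50d.json

Word-to-vector lookup table.

Training set train.json and validation set val.json

The validation set is split into two parts (the ratio is not stated) so that testing can be done locally: sample a pair of input and standard output files from the validation set.
Format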
file_name: Json file storing the data in the following format
{
    "P155": # relation id
    [
        {
            "token": ["Hot", "Dance", "Club", …], # sentence
            "h": ["song for a future generation", "Q7561099", [[16, 17, …]]], # head entity [word, id, location]
            "t": ["whammy kiss", "Q7990594", [[11, 12]]], # tail entity [word, id, location]
        },
        …
    ],
    "P177":
    [
        …
    ]
    …
}
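
As a minimal sketch of how a file in this layout can be traversed, the snippet below parses a tiny made-up sample (the sentence, entity ids, and positions are illustrative, not taken from FewRel) and walks relation → instance → tokens. The helper name `summarize` is hypothetical, and the key `"token"` simply follows the listing above.

```python
import json

# Toy sample in the layout described above; the contents are illustrative only.
sample_text = """
{
  "P155": [
    {
      "token": ["a", "follows", "b"],
      "h": ["a", "Q1", [[0]]],
      "t": ["b", "Q2", [[2]]]
    }
  ],
  "P177": []
}
"""


def summarize(data):
    """Return {relation_id: number_of_instances}."""
    return {rel: len(instances) for rel, instances in data.items()}


data = json.loads(sample_text)
counts = summarize(data)

first = data["P155"][0]
head_word, head_id, head_pos = first["h"]
# head_pos is a list of mentions; each mention is a list of token indices.
head_tokens = [first["token"][i] for i in head_pos[0]]
print(counts, head_word, head_id, head_tokens)
```

Each entity entry carries its surface form, its id, and the token indices where it occurs, so the mention can be recovered from the sentence by indexing.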

word_vec_file_name: Json file storing word vectors in the following format
[
    {'word': 'the', 'vec': [0.418, 0.24968, …]},
    {'word': ',', 'vec': [0.013441, 0.23682, …]},
    …
]

max_length: The length that all sentences are padded or truncated to.

case_sensitive: Whether data processing is case-sensitive; defaults to False.

reprocess: Whether to redo the pre-processing even if pre-processed files already exist; defaults to False.

cuda: Whether to use CUDA; defaults to True.

2. Creating Object Instances

Create the JSONFileDataLoader instances train_data_loader, val_data_loader, and test_data_loader

Check whether reprocess is set or the _processed_data directory does not exist

If reprocess is set or _processed_data does not exist:

Loading data file (train.json) & word vector file (glove.6B.50d.json)

self.ori_data = data file
self.ori_word_vec = word vector file

Check case_sensitive

If not case-sensitive: iterate over every token of every instance of every relation and convert it to lowercase.
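
The lowercasing pass can be sketched roughly as follows. The function name `lowercase_tokens` and its `token_key` parameter are hypothetical, and the toy `data` dict only mimics the relation → instances layout described earlier.

```python
def lowercase_tokens(data, token_key="token"):
    """Lowercase every token of every instance of every relation, in place."""
    for instances in data.values():
        for ins in instances:
            ins[token_key] = [tok.lower() for tok in ins[token_key]]
    return data


# Toy data: one relation with one instance.
data = {"P155": [{"token": ["Hot", "Dance", "Club"]}]}
lowercase_tokens(data)
print(data["P155"][0]["token"])
```

Lowercasing before the vocabulary lookup matters when the word vector file itself only contains lowercase entries, since otherwise capitalized tokens would all fall back to UNK.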

Pre-process word vec

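self.word2id
self.word_vec_tot = 400000: total number of words in GloVe
UNK and BLANK are placed at the end, after the word_vec_tot words
self.word_vec_dim = 50: dimension of each word vector
=> Got 400000 words of 50 dims
Building word vector matrix and mapping: builds the word2id mapping and the matching word vector matrix
- Initialize the word vector matrix: tensor(word_vec_tot * word_vec_dim)
- For each word in the word vector file, use its position in the file as its id and store it in word2id.
- Store each word's vector in word_vec_mat, indexed by id.
- self.word_vec_mat[cur_id] / np.sqrt(np.sum(self.word_vec_mat[cur_id] ** 2)): this is L2 normalization. Each vector is divided by its Euclidean norm, so every row has unit length and every component lies in [-1, 1].
- UNK and BLANK are stored last in word2id.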
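
The steps above can be sketched as follows. This is a simplified reconstruction under stated assumptions, not the repository's code: the function name `build_word_vectors` is hypothetical, the matrix only holds the words from the file (matching the word_vec_tot * word_vec_dim shape noted above), UNK and BLANK only receive ids here, and no input vector is all zeros.

```python
import numpy as np


def build_word_vectors(ori_word_vec):
    """Build word2id and an L2-normalized word vector matrix."""
    word_vec_tot = len(ori_word_vec)
    word_vec_dim = len(ori_word_vec[0]["vec"])
    unk_id = word_vec_tot        # UNK comes right after the real words
    blank_id = word_vec_tot + 1  # BLANK comes after UNK

    word2id = {}
    word_vec_mat = np.zeros((word_vec_tot, word_vec_dim), dtype=np.float32)
    for cur_id, entry in enumerate(ori_word_vec):
        word2id[entry["word"]] = cur_id  # id = position in the file
        vec = np.asarray(entry["vec"], dtype=np.float32)
        # Divide by the Euclidean norm so the row has unit length.
        word_vec_mat[cur_id] = vec / np.sqrt(np.sum(vec ** 2))

    word2id["UNK"] = unk_id
    word2id["BLANK"] = blank_id
    return word2id, word_vec_mat


# Toy 2-dimensional vectors instead of the 50-dimensional GloVe ones.
toy = [{"word": "the", "vec": [3.0, 4.0]}, {"word": ",", "vec": [0.0, 2.0]}]
word2id, mat = build_word_vectors(toy)
norms = np.linalg.norm(mat, axis=1)
print(word2id, mat, norms)
```

After normalization only the direction of each vector is kept, which is why every entry ends up between -1 and 1.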

Pre-processing data

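self.instance_tot (total number of instances) = sum of the instance counts of all relations = 700 × number of relations
self.data_word, self.data_pos1, self.data_pos2, self.data_mask: initialize the word, pos1, pos2, and mask arrays, each of size instance_tot * max_length
self.data_length: initialized with length instance_tot; data_length[i] records the number of tokens in sentence i
self.rel2scope
cur_ref_data_word is initialized to the row of the i-th instance
- cur_ref_data_word[j] stores the id of the j-th word of the i-th instance; words not in the vocabulary map to UNK, and sentences shorter than max_length are padded with BLANK
Sentences longer than max_length are truncated (length limit): data_length[i] <= max_length; pos1, pos2 < max_length
Set self.data_pos1
Set self.data_mask[i][j]
- Beyond the original sentence length: mask=0
- Before both entities: mask=1
- Between the two entities: mask=2
- After both entities, up to the original sentence length: mask=3
self.rel2scope[relation]: records the range of instances that belong to the current relation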
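
A rough sketch of the per-instance encoding and the scope bookkeeping described above. The function names `encode_instance` and `build_rel2scope` are hypothetical; the exact boundary handling of the mask (which side the entity positions themselves fall on) and the half-open [start, end) scope are assumptions, not details stated in the notes.

```python
def encode_instance(tokens, pos1, pos2, word2id, max_length):
    """Return (word_ids, length, mask) for one sentence."""
    unk, blank = word2id["UNK"], word2id["BLANK"]
    length = min(len(tokens), max_length)  # truncate long sentences

    # Map words to ids, unknown words to UNK, then pad with BLANK.
    word_ids = [word2id.get(tok, unk) for tok in tokens[:max_length]]
    word_ids += [blank] * (max_length - length)

    # Clamp entity positions so they stay inside the window.
    pos1 = min(pos1, max_length - 1)
    pos2 = min(pos2, max_length - 1)
    pos_min, pos_max = min(pos1, pos2), max(pos1, pos2)

    mask = []
    for j in range(max_length):
        if j >= length:
            mask.append(0)  # padding beyond the original sentence
        elif j <= pos_min:
            mask.append(1)  # up to the first entity (assumed inclusive)
        elif j <= pos_max:
            mask.append(2)  # up to the second entity (assumed inclusive)
        else:
            mask.append(3)  # after both entities
    return word_ids, length, mask


def build_rel2scope(data):
    """Return ({relation: [start, end)}, instance_tot)."""
    rel2scope, start = {}, 0
    for rel, instances in data.items():
        rel2scope[rel] = [start, start + len(instances)]
        start += len(instances)
    return rel2scope, start


word2id = {"hot": 0, "club": 1, "UNK": 2, "BLANK": 3}
word_ids, length, mask = encode_instance(
    ["hot", "dance", "club", "tonight"], pos1=0, pos2=2,
    word2id=word2id, max_length=6,
)
rel2scope, instance_tot = build_rel2scope({"P155": [{}, {}], "P177": [{}]})
print(word_ids, length, mask, rel2scope, instance_tot)
```

Storing only a start and end index per relation works because all instances of a relation are written into the flat arrays contiguously.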

Storing processed files: the processed files are saved to the _processed_data directory

Create the FewShotREFramework instance framework

Create the CNNSentenceEncoder instance

Word embedding and position embedding
Convolution and pooling after the embedding
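
To make the embedding → convolution → pooling pipeline concrete, here is a NumPy sketch. It is not the CNNSentenceEncoder implementation: the function name `cnn_encode`, the `params` dictionary, the window size 3, the ReLU activation, the hidden size 8, the position dimension 5, and the scheme of indexing position embeddings by relative offset shifted by max_length are all assumptions made for illustration. Only the word dimension of 50 comes from the notes above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; only word_dim=50 is taken from the text.
max_length, vocab_size, word_dim, pos_dim, hidden, window = 6, 4, 50, 5, 8, 3
feat_dim = word_dim + 2 * pos_dim

params = {
    "word_emb": rng.normal(size=(vocab_size, word_dim)),
    "pos1_emb": rng.normal(size=(2 * max_length, pos_dim)),
    "pos2_emb": rng.normal(size=(2 * max_length, pos_dim)),
    "conv_w": rng.normal(scale=0.1, size=(window * feat_dim, hidden)),
    "conv_b": np.zeros(hidden),
}


def cnn_encode(word_ids, pos1_ids, pos2_ids, params, window=3):
    """Embed, convolve over the time axis, then max-pool to one vector."""
    # Concatenate word and position embeddings: (L, feat_dim).
    x = np.concatenate(
        [params["word_emb"][word_ids],
         params["pos1_emb"][pos1_ids],
         params["pos2_emb"][pos2_ids]],
        axis=1,
    )
    seq_len = x.shape[0]
    pad = window // 2
    x = np.pad(x, ((pad, pad), (0, 0)))  # zero-pad the time axis
    # Flatten each sliding window and project it: (L, hidden).
    windows = np.stack([x[j:j + window].reshape(-1) for j in range(seq_len)])
    h = np.maximum(windows @ params["conv_w"] + params["conv_b"], 0.0)
    return h.max(axis=0)  # max over time: (hidden,)


word_ids = np.array([0, 2, 1, 2, 3, 3])
pos1_ids = np.arange(max_length) - 0 + max_length  # offsets to entity at 0
pos2_ids = np.arange(max_length) - 2 + max_length  # offsets to entity at 2
sentence_vec = cnn_encode(word_ids, pos1_ids, pos2_ids, params, window)
print(sentence_vec.shape)
```

Max-pooling over the time axis is what turns a variable number of token features into a fixed-size sentence vector.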

3. Choosing a Model

metanet

Loss computation

nn.CrossEntropyLoss()
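
With its default settings, nn.CrossEntropyLoss takes raw logits and integer class labels, applies log-softmax, and averages the negative log-likelihood over the batch. The plain-Python sketch below reproduces that computation for illustration; the function name `cross_entropy` is hypothetical and the 5-way example is a toy case.

```python
import math


def cross_entropy(logits, targets):
    """Mean of -log softmax(logits)[target] over the batch."""
    total = 0.0
    for row, target in zip(logits, targets):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[target]
    return total / len(targets)


# With identical logits over 5 classes, the loss equals log(5).
loss_uniform = cross_entropy([[0.0, 0.0, 0.0, 0.0, 0.0]], [2])
print(loss_uniform)
```

The log(5) value is a useful sanity check for a 5-way few-shot task: a model that has learned nothing should start near it.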

embedding

encoder

basic_encoder

attention_encoder

Linear layer: linear transformation

basic_fc

attention_fc

learner_basic

Linear transformations: (2, 20), (20, 20), (20, 1)
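
A stack of linear maps with these shapes takes a 2-feature input to a single output value. The NumPy sketch below only illustrates the shapes; the ReLU between layers, the random initialization, and the name `learner_basic_forward` are assumptions, since the notes list the layer sizes only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer shapes from the notes: (in_features, out_features).
shapes = [(2, 20), (20, 20), (20, 1)]
weights = [rng.normal(scale=0.1, size=s) for s in shapes]
biases = [np.zeros(s[1]) for s in shapes]


def learner_basic_forward(x):
    """Apply the three affine layers, with ReLU between them (assumed)."""
    for k, (w, b) in enumerate(zip(weights, biases)):
        x = x @ w + b
        if k < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x


out = learner_basic_forward(np.ones((4, 2)))  # batch of 4 inputs
print(out.shape)
```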

learner_attention

LSTM+linear

training

ckpt_dir='./checkpoint',
test_result_dir='./test_result',
learning_rate=1e-1,
lr_step_size=20000,  # Decay learning rate every lr_step_size steps
weight_decay=1e-5,  # Rate of decaying weight
train_iter=30000,
val_iter=1000,
val_step=2000,  # Validate every val_step steps
test_iter=3000
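
To see how these settings interact, the sketch below computes a step-decayed learning rate and the iterations at which validation would run. The decay factor `gamma=0.1` is an assumption (the notes only say the rate decays every lr_step_size steps), and both function names are hypothetical.

```python
def learning_rate_at(step, base_lr=1e-1, lr_step_size=20000, gamma=0.1):
    """Step decay: multiply by gamma once every lr_step_size steps."""
    return base_lr * gamma ** (step // lr_step_size)


def validation_steps(train_iter=30000, val_step=2000):
    """Iterations (1-based) at which validation is triggered."""
    return [s for s in range(1, train_iter + 1) if s % val_step == 0]


val_points = validation_steps()
print(learning_rate_at(0), learning_rate_at(20000), len(val_points))
```

With train_iter=30000 and lr_step_size=20000, the learning rate would drop exactly once during training under this schedule, and validation would run 15 times.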