- Understanding the terms
- "translation" should be read as a geometric shift: the head vector, translated by the relation vector (i.e., vector addition), should end up close to the tail vector. The loss then measures this "distance" under a chosen norm (L1 or L2)
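This translation principle can be sketched as a toy scoring function (a numpy sketch with illustrative names and dimensions, not the paper's code):

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """Distance d(h + r, t) under the chosen norm (1 for L1, 2 for L2).

    A plausible triplet should score low: the translated head lands near the tail.
    """
    diff = h + r - t
    if norm == 1:
        return np.abs(diff).sum()
    return np.sqrt((diff ** 2).sum())

h = np.array([1.0, 2.0])
r = np.array([3.0, 1.0])
t = np.array([4.0, 3.0])
print(transe_score(h, r, t))  # 0.0: h + r lands exactly on t
```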
- Understanding the model
- Selling points: few parameters (simplicity) and easy to train
- easy to train, contains a reduced number of parameters and can scale up to very large databases
- greater expressiveness seems to be more synonymous to underfitting than to better performance
- Challenge: heterogeneity of relational data
- difficulty of relational data is that the notion of locality may involve relationships and entities of different types at the same time
- Algorithm:
- 1. Initialization: assign each entity and relation a 100-dimensional vector drawn uniformly at random
- 2. Sampling: each training step draws a minibatch; nbatch = 400 (the training set is split into 400 batches)
- Dataset statistics: 14951 entities, 1345 relations, 483142 triples
- batch size = 483142 / 400 ≈ 1207
- 3. Corruption: for every (h, r, t) triplet in the batch, randomly replace either the head or the tail (50% probability each, never both) with a random entity
- 4. Update: compute the hinge loss over the batch and take a gradient step (one round of SGD)
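Step 3 (corrupting either the head or the tail, never both) might look like the following sketch; the `corrupt` helper is made up, and the (h, t, r) triple layout matches the training file format:

```python
import random

def corrupt(triple, entity_ids):
    """Replace the head OR the tail (never both) with a random entity."""
    h, t, r = triple
    if random.random() < 0.5:
        h = random.choice(entity_ids)  # corrupt the head
    else:
        t = random.choice(entity_ids)  # corrupt the tail
    return (h, t, r)

entity_ids = list(range(10))
corrupted = corrupt((0, 1, 2), entity_ids)
# At most one of head/tail differs; the relation never changes
assert corrupted[2] == 2
```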
- Other models
- SE [3] embeds entities into R^k, and relationships into two matrices L1 ∈ R^(k×k) and L2 ∈ R^(k×k) such that d(L1·h, L2·t) is large for corrupted triplets (h, l, t) (and small otherwise).
- SE projects the head and tail through two relation-specific matrices and compares the two projected vectors; TransE adds a single relation vector to the head and compares vectors directly, which needs far fewer parameters per relation (k versus 2k^2)
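The contrast is easy to see in a toy numpy sketch (random toy dimensions; not either model's real implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3  # toy embedding dimension
h, t, r = rng.random(k), rng.random(k), rng.random(k)
M1, M2 = rng.random((k, k)), rng.random((k, k))  # SE's two relation matrices

# SE: project head and tail through two relation-specific matrices,
# then compare the two projected vectors
se_dist = np.abs(M1 @ h - M2 @ t).sum()

# TransE: add a single relation vector to the head, compare to the tail
transe_dist = np.abs(h + r - t).sum()

print(se_dist >= 0 and transe_dist >= 0)  # True: both are valid distances
```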
- Experiments
- Datasets
- WordNet / Freebase
- Freebase15k: sample file formats
entity2id (entity → integer id):
/m/06rf7	0
/m/0c94fn	1
relation2id (relation → integer id):
/sports/sports_team/roster./soccer/football_roster_position/player	8
/business/company_type/companies_of_this_type	9
train, one triple per line in (h, t, r) order:
/m/07s9rl0	/m/0170z3	/media_common/netflix_genre/titles
- Metrics:
- 1. MEAN RANK: the average rank of the correct (positive) entity
- 2. HITS@10 (%): the proportion of correct entities ranked in the top 10
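Both metrics are simple functions of the rank each correct entity gets among all candidate completions; a minimal sketch with made-up ranks:

```python
def mean_rank(ranks):
    """Average rank of the correct entity over all test triplets."""
    return sum(ranks) / len(ranks)

def hits_at_k(ranks, k=10):
    """Percentage of test triplets whose correct entity ranks in the top k."""
    return 100.0 * sum(1 for rank in ranks if rank <= k) / len(ranks)

ranks = [1, 3, 50, 7, 200]     # hypothetical ranks of the correct entity
print(mean_rank(ranks))        # 52.2
print(hits_at_k(ranks))        # 60.0 (3 of the 5 ranks are within the top 10)
```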
- Results: TransE outperforms the compared models
- Details:
- filter: remove corrupted triplets that are in fact positive (i.e., the randomly built triplet already exists in the dataset)
- Why grad_pos and grad_neg are set to ±1: under the L1 distance the gradient of |x| is sign(x), so each coordinate of the gradient is replaced by its sign (±1)
    for triple, corrupted_triple in Tbatch:
        # Accumulate updates on the copied vectors
        h_correct_update = copy_entity[triple[0]]
        t_correct_update = copy_entity[triple[1]]
        relation_update = copy_relation[triple[2]]
        h_corrupt_update = copy_entity[corrupted_triple[0]]
        t_corrupt_update = copy_entity[corrupted_triple[1]]

        # Compute gradients from the original (pre-update) vectors
        h_correct = self.entity[triple[0]]
        t_correct = self.entity[triple[1]]
        relation = self.relation[triple[2]]
        h_corrupt = self.entity[corrupted_triple[0]]
        t_corrupt = self.entity[corrupted_triple[1]]

        if self.L1:
            dist_correct = distanceL1(h_correct, relation, t_correct)
            dist_corrupt = distanceL1(h_corrupt, relation, t_corrupt)
        else:
            dist_correct = distanceL2(h_correct, relation, t_correct)
            dist_corrupt = distanceL2(h_corrupt, relation, t_corrupt)

        err = self.hinge_loss(dist_correct, dist_corrupt)
        if err > 0:
            self.loss += err
            # Gradient of the squared L2 distance w.r.t. (h + r - t)
            grad_pos = 2 * (h_correct + relation - t_correct)
            grad_neg = 2 * (h_corrupt + relation - t_corrupt)
            if self.L1:
                # L1 distance: the gradient is the elementwise sign
                for i in range(len(grad_pos)):
                    grad_pos[i] = 1 if grad_pos[i] > 0 else -1
                for i in range(len(grad_neg)):
                    grad_neg[i] = 1 if grad_neg[i] > 0 else -1

            # Correct triplet: h has coefficient +1 in (h + r - t), so it moves
            # against the gradient; t has coefficient -1, so it moves with it
            h_correct_update -= self.learning_rate * grad_pos
            t_correct_update += self.learning_rate * grad_pos

            # The corrupted term enters the loss with a minus sign,
            # so its updates are reversed relative to the correct triplet
            if triple[0] == corrupted_triple[0]:
                # The tail was replaced, so the shared head is updated twice
                h_correct_update += self.learning_rate * grad_neg
                t_corrupt_update -= self.learning_rate * grad_neg
            elif triple[1] == corrupted_triple[1]:
                # The head was replaced, so the shared tail is updated twice
                h_corrupt_update += self.learning_rate * grad_neg
                t_correct_update -= self.learning_rate * grad_neg

            # The relation appears in both terms, so it is updated twice
            relation_update -= self.learning_rate * grad_pos
            relation_update += self.learning_rate * grad_neg
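The snippet calls distanceL1, distanceL2, and self.hinge_loss, which are not shown. Plausible standalone definitions consistent with the gradients used in the loop (note distanceL2 returns the *squared* L2 distance so that 2 * (h + r - t) is its exact gradient; the margin value here is an assumption):

```python
import numpy as np

def distanceL1(h, r, t):
    """L1 distance ||h + r - t||_1."""
    return np.abs(h + r - t).sum()

def distanceL2(h, r, t):
    """Squared L2 distance ||h + r - t||^2; squared so that the
    gradient 2 * (h + r - t) used in the update loop is exact."""
    return ((h + r - t) ** 2).sum()

def hinge_loss(dist_correct, dist_corrupt, margin=1.0):
    """Margin-based ranking loss; zero once the corrupted triplet
    is at least `margin` farther away than the correct one."""
    return max(0.0, dist_correct - dist_corrupt + margin)

print(hinge_loss(0.5, 3.0))  # 0.0: corrupted triplet already far enough
print(hinge_loss(2.0, 1.0))  # 2.0
```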