Entity linking 2

是ひま呀

Category: WDPS course notes

Article link: https://blog.csdn.net/Odessa_R/article/details/103560457

Candidate Entity Ranking

There are two approaches to ranking candidate entities:

Supervised ranking methods
Unsupervised ranking methods

Features

There are two kinds of features:

- context-independent features
Simply check whether the mention and the entity label in the KB match:
- exact matching
- Dice coefficient score
- Hamming distance
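
The three string measures above can be sketched as follows. This is a minimal illustration, not the course's reference code: the function names are made up for this example, and the Dice score is computed over word sets (character bigrams are another common choice).

```python
def exact_match(mention: str, label: str) -> bool:
    """True when mention and KB label are equal, ignoring case and outer spaces."""
    return mention.strip().lower() == label.strip().lower()


def dice_coefficient(mention: str, label: str) -> float:
    """Dice score over word sets: 2 * |A & B| / (|A| + |B|)."""
    a = set(mention.lower().split())
    b = set(label.lower().split())
    if not a and not b:
        return 1.0  # two empty strings are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def hamming_distance(mention: str, label: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(mention) != len(label):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(c1 != c2 for c1, c2 in zip(mention, label))
```

Note that Hamming distance is only defined for strings of the same length, so in practice it suits fixed-length codes better than free-form names.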

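- context-dependent features, which require reading the context around the mention
entity popularity: choose the most common sense of the mention
entity type: NER can return a broad type for a given word (person, organisation, location, ...). Comparing the type of the mention with the type of each candidate helps determine the intended meaning.

bag of words (BOW)
All words in the document that contains the entity mention are matched against the words associated with the entity.
concept vectors
Key phrases, anchor text, and named entities can be extracted from the given document. These features are used to build vectors that represent the mention and the candidate entity. The similarity between them can be computed with cosine similarity or Jaccard similarity.
coherence between mappings
Within a single document, the entities are usually consistent with one or two topics.
Coherence can be measured by computing how related the candidate entities of two mentions are. In Wikipedia, we can do this by counting how many articles link to both entities of a pair.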
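
As a rough sketch of two of these context-dependent features, the code below computes Jaccard similarity between concept sets and a simple coherence score that counts articles linking to both entities. The data, the entity identifiers, and the `inlinks` mapping are hypothetical and invented for illustration only.

```python
def jaccard(a: set, b: set) -> float:
    """|A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shared_inlinks(entity_a: str, entity_b: str, inlinks: dict) -> int:
    """Count the articles that link to both entities."""
    return len(inlinks.get(entity_a, set()) & inlinks.get(entity_b, set()))


# Hypothetical toy data: concepts extracted from the document and from each candidate.
mention_concepts = {"basketball", "chicago bulls", "nba"}
entity_concepts = {
    "Michael_Jordan_(basketball)": {"nba", "chicago bulls", "mvp"},
    "Michael_I._Jordan_(scientist)": {"machine learning", "berkeley", "statistics"},
}

# Hypothetical toy data: for each entity, the set of articles that link to it.
inlinks = {
    "Michael_Jordan_(basketball)": {"NBA", "Chicago_Bulls", "Scottie_Pippen"},
    "Chicago_Bulls": {"NBA", "Scottie_Pippen", "Illinois"},
    "Michael_I._Jordan_(scientist)": {"Machine_learning", "UC_Berkeley"},
}

similarity = {e: jaccard(mention_concepts, c) for e, c in entity_concepts.items()}
coherence = shared_inlinks("Michael_Jordan_(basketball)", "Chicago_Bulls", inlinks)
```

A raw count favours very popular entities, so real systems usually normalise it; the count is kept here because it is what the notes describe.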

Supervised ranking methods

Binary classification methods

Given a <mention, entity> pair as input, we can train a classifier that returns 1 or 0 to decide whether the mapping is correct.
e.g. SVM, Naive Bayes classifiers
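
To make the idea concrete, here is a small Bernoulli Naive Bayes classifier written in plain Python. It is only a sketch under stated assumptions: the three binary features and the six training pairs are hypothetical, and a real system would use a library implementation and far more data.

```python
import math


def train_bernoulli_nb(X, y):
    """Estimate class priors and per-feature probabilities with Laplace smoothing."""
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + 1) / (len(X) + 2)
        probs = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(len(X[0]))
        ]
        model[label] = (prior, probs)
    return model


def predict(model, x):
    """Return 1 if the pair looks like a correct mapping, else 0."""
    scores = {}
    for label, (prior, probs) in model.items():
        log_p = math.log(prior)
        for value, p in zip(x, probs):
            log_p += math.log(p if value else 1 - p)
        scores[label] = log_p
    return max(scores, key=scores.get)


# Hypothetical binary features per <mention, entity> pair:
# [exact label match, NER type match, entity is the most popular candidate]
X = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [1, 0, 0], [0, 0, 1], [0, 0, 0]]
y = [1, 1, 1, 0, 0, 0]  # 1 = correct mapping, 0 = incorrect mapping
model = train_bernoulli_nb(X, y)
```

Because each candidate is classified independently, several candidates for the same mention may all receive label 1, so some tie-breaking rule is still needed afterwards.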

Probabilistic methods

Instead of a classifier, we can also use a probabilistic model to express how likely a mapping is to be correct.
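
One possible form of such a model, shown only as an assumption since the notes do not specify one, scores each candidate by P(e | m) ∝ P(e) · P(m | e): the prior popularity of the entity times the probability that the entity is referred to by this name. All probabilities below are invented for illustration.

```python
def rank_by_posterior(mention, candidates, prior, name_given_entity):
    """Return (entity, probability) pairs sorted by P(e | m), highest first."""
    joint = {
        e: prior[e] * name_given_entity[e].get(mention, 0.0)
        for e in candidates
    }
    total = sum(joint.values())
    if total == 0:
        return []  # no candidate can generate this mention
    ranked = ((e, p / total) for e, p in joint.items())
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy probabilities.
prior = {"Paris_(France)": 0.9, "Paris_(Texas)": 0.1}
name_given_entity = {
    "Paris_(France)": {"Paris": 0.8},
    "Paris_(Texas)": {"Paris": 0.5},
}
ranked = rank_by_posterior("Paris", list(prior), prior, name_given_entity)
```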

Unsupervised ranking methods

Graph based approaches
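AIDA system: models mention-entity and entity-entity relations as a graph. Each edge carries a weight that represents how likely the corresponding interpretation is.

Find a subgraph in which every mention keeps exactly one mention-entity edge and the total weight is maximal. This problem is NP-hard, so a greedy algorithm is used.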
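
The sketch below shows one simplified greedy strategy for this kind of graph: repeatedly drop the candidate entity with the lowest weighted degree, as long as every mention keeps at least one candidate. It is an illustration under my own assumptions, not AIDA's actual implementation, and the graph weights are made up.

```python
def greedy_disambiguate(mention_edges, entity_edges):
    """mention_edges: {mention: {entity: weight}}; entity_edges: {(e1, e2): weight}."""
    remaining = {m: set(cands) for m, cands in mention_edges.items()}

    def weighted_degree(entity):
        alive = set().union(*remaining.values())
        degree = sum(
            w[entity] for m, w in mention_edges.items() if entity in remaining[m]
        )
        for (a, b), weight in entity_edges.items():
            if entity in (a, b) and a in alive and b in alive:
                degree += weight
        return degree

    while True:
        # An entity may be removed only if every mention it serves keeps another candidate.
        removable = [
            e for e in set().union(*remaining.values())
            if all(len(c) > 1 for c in remaining.values() if e in c)
        ]
        if not removable:
            break
        worst = min(removable, key=lambda e: (weighted_degree(e), e))
        for cands in remaining.values():
            cands.discard(worst)

    # If a mention still has several candidates, fall back to its heaviest edge.
    return {
        m: max(c, key=lambda e: (mention_edges[m][e], e)) if c else None
        for m, c in remaining.items()
    }


# Hypothetical toy graph.
mention_edges = {
    "Jordan": {"Michael_Jordan": 0.6, "Jordan_(country)": 0.7},
    "Bulls": {"Chicago_Bulls": 0.9, "Bull": 0.3},
}
entity_edges = {
    ("Michael_Jordan", "Chicago_Bulls"): 0.8,
    ("Jordan_(country)", "Bull"): 0.1,
}
result = greedy_disambiguate(mention_edges, entity_edges)
```

In this toy graph the country has the higher mention-entity weight for "Jordan", yet the basketball player is kept because of its strong entity-entity edge to the Chicago Bulls.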

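VSM-based models (vector space model)
Good training data is difficult and expensive to obtain.
These models only compute the similarity between the mention and each candidate entity.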
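
A minimal sketch of this idea, assuming raw term-frequency vectors and whitespace tokenisation (real systems typically use TF-IDF weights and better preprocessing). The context sentence and entity descriptions are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(context: str, descriptions: dict) -> list:
    """Rank candidate entities by similarity between context and description."""
    ctx = Counter(context.lower().split())
    scored = [
        (entity, cosine(ctx, Counter(text.lower().split())))
        for entity, text in descriptions.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data.
context = "jordan scored thirty points for the bulls"
descriptions = {
    "Michael_Jordan": "american basketball player who scored points for the chicago bulls",
    "Jordan_(country)": "country in the middle east on the river jordan",
}
ranked = rank_candidates(context, descriptions)
```

No labelled training pairs are needed here, which is the main appeal of the unsupervised setting.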

Unlinkable mention prediction

ignore the problem.
If there are no candidates, assume the mention is unlinkable
use a threshold value on the ranking score
train a binary classifier
add NIL as a special entity. If NIL gets the highest score, the mention is considered unlinkable
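
Several of these strategies fit in one small function. The sketch below assumes the ranking step produces a dictionary of scores per candidate; the function name and the default threshold of 0.5 are arbitrary choices for illustration.

```python
NIL = "NIL"


def link_or_nil(scores: dict, threshold: float = 0.5) -> str:
    """Return the best-scoring entity, or NIL when no candidate is convincing."""
    if not scores:
        return NIL  # empty candidate set: the mention is unlinkable
    best = max(scores, key=scores.get)
    # If NIL was scored as a special entity and wins, it is returned like any other.
    return best if scores[best] >= threshold else NIL
```

For example, `link_or_nil({"A": 0.9, "B": 0.2})` returns `"A"`, while `link_or_nil({"A": 0.3})` returns `"NIL"` because the best score is below the threshold.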
