The difference between entity annotation and entity linking

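Entity linking is the task of identifying mentions in a text and linking them to entries in a knowledge base. It usually consists of two steps: detecting the mentions in the text, and linking each mention to an entity in the knowledge base. Some work assumes the mentions are provided in advance and focuses on entity disambiguation.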

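I recently came across the concept of entity annotation and wondered how it differs from entity linking. After reading the paper "A framework for benchmarking entity-annotation systems", my understanding is that entity annotation serves text representation: it aims to extract the meaningful fragments of a text and link them to unambiguous identifiers. Judging by this definition, entity linking is contained in entity annotation. On top of entity linking, entity annotation also discards entities that are not meaningful, that is, entities that do not help represent the text. The relevant passage from the paper reads: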

Classic approaches to document indexing, clustering, classification and retrieval are based on the bag-of-words paradigm. The limitations of this paradigm are well-known to the IR community and in recent years a good deal of work has attempted to move beyond by “grounding” the processed texts with respect to an adequate semantic representation, by designing so-called entity annotators. The key idea is to identify, in the input text, short-and-meaningful sequences of terms (also called mentions) and annotate them with unambiguous identifiers (also called entities) drawn from a catalog. Most recent work adopts anchor texts occurring in Wikipedia as entity mentions and the respective Wikipedia pages as the mentioned entity, because Wikipedia offers today the best trade-off between catalogs with a rigorous structure but low coverage (such as WordNet, CYC, TAP), and a large text collection with wide coverage but unstructured and noisy content (like the whole Web). The process of entity annotation involves three main steps: (1) parsing of the input text, which is the task to detect candidate entity mentions and link each of them to all possible entities they could mention; (2) disambiguation of mentions, which is the task of selecting the most pertinent Wikipedia page (i.e., entity) that best describes each mention; (3) pruning of a mention, which discards a detected mention and its annotated entity if they are considered not interesting or pertinent to the semantic interpretation of the input text.
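
To make the three steps concrete, here is a minimal toy sketch in Python. It is not the method of any system from the paper: the catalog, the context keywords, the scoring rule (prior times keyword overlap) and the pruning threshold are all hypothetical and exist only for illustration. Stopping after step (2) corresponds to entity linking; adding step (3) gives entity annotation.

```python
from dataclasses import dataclass

# Hypothetical toy catalog: surface form -> {entity id: prior probability}.
CATALOG = {
    "jaguar": {"Jaguar_Cars": 0.6, "Jaguar_(animal)": 0.4},
    "engine": {"Engine": 1.0},
    "today": {"Today_(TV_program)": 1.0},
}

# Hypothetical context keywords that support each entity.
CONTEXT = {
    "Jaguar_Cars": {"engine", "car", "drive"},
    "Jaguar_(animal)": {"jungle", "prey", "cat"},
    "Engine": {"car", "jaguar"},
    "Today_(TV_program)": {"nbc", "show"},
}


@dataclass
class Annotation:
    mention: str
    entity: str
    score: float


def parse(text):
    """Step 1: detect candidate mentions and all entities they could refer to."""
    tokens = [t.strip(".,").lower() for t in text.split()]
    candidates = [(t, CATALOG[t]) for t in tokens if t in CATALOG]
    return tokens, candidates


def disambiguate(tokens, candidates):
    """Step 2: pick one entity per mention (toy score: prior * keyword overlap)."""
    words = set(tokens)
    annotations = []
    for mention, entities in candidates:
        def score(entity):
            return entities[entity] * len(CONTEXT[entity] & words)
        best = max(entities, key=score)
        annotations.append(Annotation(mention, best, score(best)))
    return annotations


def prune(annotations, threshold=0.5):
    """Step 3: drop annotations judged not pertinent to the text."""
    return [a for a in annotations if a.score >= threshold]


TEXT = "The jaguar has a new engine today."
linked = disambiguate(*parse(TEXT))   # steps 1-2: entity linking
annotated = prune(linked)             # steps 1-3: entity annotation
```

In this toy run, "today" is still linked after step (2) but receives no support from its context, so the pruning step removes it, while "jaguar" and "engine" are kept.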
