Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and

Abstract
problem:
MNER and MRE usually suffer from error sensitivity when irrelevant object images are incorporated into texts.
solution:
Hierarchical Visual Prefix fusion NeTwork (HVPNeT)
detail:
regard the visual representation as a pluggable visual prefix that guides the textual representation toward error-insensitive prediction decisions
a dynamic gated aggregation strategy to obtain hierarchical multi-scaled visual features as the visual prefix for fusion
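To make the gated aggregation idea concrete, here is a minimal sketch, not the paper's exact formulation. It assumes each pyramid level has already been reduced to a vector of the same length, scores each level with a hypothetical learned gate vector (fixed numbers here), turns the scores into softmax gates, and returns the gate-weighted sum. The names `gated_aggregate`, `level_features`, and `gate_weights` are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_aggregate(level_features, gate_weights):
    """Fuse per-level feature vectors into one vector.

    level_features: list of c vectors (one per pyramid level), same length.
    gate_weights:   list of c vectors used to score each level. These stand in
                    for learned parameters and are an assumption of this sketch.
    """
    # One scalar score per level: dot product of the feature and its gate vector.
    scores = [
        sum(f * w for f, w in zip(feat, gate))
        for feat, gate in zip(level_features, gate_weights)
    ]
    gates = softmax(scores)  # positive, sums to 1
    dim = len(level_features[0])
    # Gate-weighted sum of the level features, computed per dimension.
    fused = [
        sum(g * feat[i] for g, feat in zip(gates, level_features))
        for i in range(dim)
    ]
    return gates, fused


# Three toy pyramid levels with all-zero gate vectors, so every level
# receives the same gate value.
levels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
gates, fused = gated_aggregate(levels, weights)
```

In a real model the gate vectors would be trained, so the gates would differ per level (and, in a dynamic scheme, per input) rather than being uniform as in this toy call.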
1 Introduction
main contribution:
present a hierarchical visual prefix fusion network towards MNER and MRE
the first work to leverage hierarchical pyramidal visual features for multimodal learning
2 Related work
Multimodal Entity and Relation Extraction
text-only -> multimodal ignoring the error sensitivity -> multimodal with a classifier that filters out irrelevant images but requires expensive annotation -> our work
Pre-trained Multimodal Representation
existing visual-linguistic BERT models: architecture and pretraining tasks
why not apply current visual-language models to the MNER and MRE tasks?
MNER and MRE mainly focus on leveraging visual information to enhance the text rather than conducting prediction on the image side
3 Methodology
The overall architecture of our hierarchical visual prefix for multimodal entity and relation extraction
3.1 Collection of Pyramidal Visual Feature
regional images provide more semantic knowledge to assist information extraction
global images express abstract concepts as weak learning signals
so we take the regional images as the vital information and the global images as the supplement
adopt a visual grounding toolkit to extract the top-m most salient local visual objects
rescale the global image and the object images to 224 × 224 pixels, giving the global image I and the visual objects O = {o_1, o_2, ..., o_m}
given an image, we encode it with a backbone model and generate a list of pyramidal feature maps {F_1, F_2, F_3, ..., F_c}
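As a rough illustration of what a list of pyramidal feature maps looks like, the sketch below computes the shape of each map for a 224 × 224 input. The notes only say "a backbone model"; the stage strides and channel widths used here are those of a ResNet-50-style backbone and are an assumption, as is the helper name `pyramid_shapes`.

```python
def pyramid_shapes(input_size=224,
                   strides=(4, 8, 16, 32),
                   channels=(256, 512, 1024, 2048)):
    """Return (channels, height, width) for each pyramid level F_1..F_c.

    Defaults follow a ResNet-50-style backbone (an assumption of this
    sketch): each stage reduces the spatial size by its cumulative stride.
    """
    shapes = []
    for stride, ch in zip(strides, channels):
        side = input_size // stride  # spatial side length at this stage
        shapes.append((ch, side, side))
    return shapes


shapes = pyramid_shapes()
for level, shape in enumerate(shapes, start=1):
    print(f"F_{level}: {shape}")
```

Under these assumptions c = 4, and the maps shrink spatially while growing in channel depth, which is why the levels need to be projected to a common size before they can be aggregated into a single visual prefix.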

(The rest is omitted.)
