A Differential Attention Model for Visual Question Answering: "Differential Attention for Visual Question Answering"

Contents

1. Abstract Overview

2. Network Architecture

3. Experimental Analysis

4. Conclusion


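This is one of a series of reading notes on visual question answering papers. The post is a bit long, so please read it patiently; you should come away with something useful. If anything falls short, comments and discussion are always welcome.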

1. Abstract Overview

In this paper we aim to answer questions based on images when provided with a dataset of question-answer pairs for a number of images during training. A number of methods have focused on solving this problem by using image based attention. This is done by focusing on a specific part of the image while answering the question. Humans also do so when solving this problem. However, the regions that the previous systems focus on are not correlated with the regions that humans focus on. The accuracy is limited due to this drawback. In this paper, we propose to solve this problem by using an exemplar based method. We obtain one or more supporting and opposing exemplars to obtain a differential attention region. This differential attention is closer to human attention than other image based attention methods. It also helps in obtaining improved accuracy when answering questions. The method is evaluated on challenging benchmark datasets. We perform better than other image based attention methods and are competitive with other state of the art methods that focus on both image and questions.

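In short, the authors aim to answer questions about images, given a training set of question-answer pairs for a number of images. Many existing methods tackle this with image-based attention, that is, by focusing on a specific part of the image while answering the question, much as humans do. However, the regions that earlier systems attend to are not correlated with the regions humans attend to, and this limits accuracy. To address this, the authors propose an exemplar-based method: they obtain one or more supporting and opposing exemplars and use them to derive a differential attention region. This differential attention is closer to human attention than other image-based attention methods, and it also improves answer accuracy. The method is evaluated on challenging benchmark datasets, where it outperforms other image-based attention methods and is competitive with other state-of-the-art methods that attend to both the image and the question. The overall pipeline is shown in Figure 1 below.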
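To make the idea of supporting and opposing exemplars more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how exemplars are chosen or how they are used, so this sketch assumes (hypothetically) that the supporting exemplar is the candidate nearest to the target in some embedding space, that the opposing exemplar is the farthest one, and that the two are combined through a triplet-style hinge term. The function names, the Euclidean distance, and the margin value are all illustrative choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_exemplars(target, candidates):
    """Return (supporting, opposing) exemplars for a target embedding.

    Assumption for illustration only: the supporting exemplar is the
    nearest candidate and the opposing exemplar is the farthest one.
    Requires at least two candidates.
    """
    ranked = sorted(candidates, key=lambda c: euclidean(target, c))
    return ranked[0], ranked[-1]


def triplet_hinge(target, supporting, opposing, margin=1.0):
    """Triplet-style hinge term (an assumed form, not taken from the post).

    It is zero once the target is closer to the supporting exemplar than
    to the opposing exemplar by at least `margin`.
    """
    pull = euclidean(target, supporting)
    push = euclidean(target, opposing)
    return max(0.0, pull - push + margin)


# Toy 2-D embeddings standing in for attended image features.
target = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]

supporting, opposing = pick_exemplars(target, candidates)
loss = triplet_hinge(target, supporting, opposing)
swapped_loss = triplet_hinge(target, opposing, supporting)
```

With these toy vectors the well-separated triplet gives a zero hinge term, while swapping the two exemplars yields a positive one, which is the signal such a term would use to pull the target toward the supporting exemplar and push it away from the opposing one.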

2. Network Architecture

Given an image x_i,
