VQA
Average article quality score: 93
aos5
Be the best version of yourself ᕦ(ò_óˇ)ᕤ
Text-Instance Graph: Exploring the Relational Semantics for Text-based Visual Question Answering
Text-Instance Graph: Exploring the Relational Semantics for Text-based Visual Question Answering
Original · 2022-03-02 20:58:44 · 399 views · 1 comment
[VQA Paper Reading] VQS: Linking Semantic Segmentation with Visual Question Answering (ICCV 2017)
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation. Table of contents: VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentatio
Original · 2021-05-01 17:54:48 · 834 views · 1 comment
[Pre-trained Vision-Language Model Paper Reading] The Latest BERT Model, UNITER: UNiversal Image-TExt Representation Learning
[Pre-trained Vision-Language Model Paper Reading] UNITER: UNiversal Image-TExt Representation Learning. Table of contents: [Pre-trained Vision-Language Model Paper Reading] UNITER: UNiversal Image-TExt Representation Learning; Abstract; 1 Introduction; 2 Related Work; 3 UNiversal Image-TExt Representation; 3.1 Model Overview; 3.2 Pre-training Tasks
Original · 2021-03-18 22:25:48 · 2398 views · 2 comments
[Pre-trained Vision-Language Model Paper Reading] The Latest SOTA: Oscar
[Latest VQA Paper Reading] Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks. Table of contents: Abstract; 1 Introduction; 3 Oscar Pre-training; Input; Pre-Training Objective; A Dictionary View: Masked Token Loss; A Modality View: Contrastive Loss; Discussion; Pre-trai
Original · 2021-03-18 18:38:51 · 3623 views · 2 comments
[Pre-trained Vision-Language Model Paper Reading] VL-BERT: PRE-TRAINING OF GENERIC VISUAL-LINGUISTIC REPRESENTATIONS (ICLR 2020)
[Vision-Language Task Paper Reading] VL-BERT: PRE-TRAINING OF GENERIC VISUAL-LINGUISTIC REPRESENTATIONS. Table of contents: [Vision-Language Task Paper Reading] VL-BERT: PRE-TRAINING OF GENERIC VISUAL-LINGUISTIC REPRESENTATIONS; ABSTRACT; 1 INTRODUCTION; 2 RELATED WORK; Pre-training for Computer Vision; Pre-training for Natural
Original · 2021-03-18 10:31:06 · 571 views · 1 comment
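[VQA Paper Reading] (CVPR 2019) Answer Them All! Toward Universal Visual Question Answering Models: An Intuitive Look at the Latest VQA Datasets
[VQA Paper Reading] Answer Them All! Toward Universal Visual Question Answering Models: A Recent VQA Survey. Abstract: Visual question answering (VQA) research is split into two camps: the first focuses on VQA datasets that require natural image understanding, and the second focuses on synthetic datasets that test reasoning. A good VQA algorithm should be capable of both, but only a few VQA algorithms are tested in this way. We compare five state-of-the-art VQA algorithms across eight VQA datasets covering both domains. To make the comparison fair, all models are standardized as much as possible, e.g., they use
Original · 2021-03-09 16:42:37 · 997 views · 2 comments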
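[Latest VideoQA Paper Reading] The First Video Question Answering Survey, Video Question Answering: a Survey of Models and Datasets
Video Question Answering: a Survey of Models and Datasets. Long-read warning! P.S. This paper was freshly published on January 25, 2021. It sits behind a paywall on Springer, so I am sharing it here for free in the hope that we can all learn together! Abstract: Video question answering (VideoQA) automatically answers natural language questions according to the content of videos. It promotes the development of online education, scenario analysis, video content retrieval, and more. VideoQA is a challenging task because it requires a model to understand the semantic information of both the video and the question in order to generate the answer. First, we propose a video feature
Original · 2021-02-16 22:11:33 · 6984 views · 7 comments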
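[Latest VideoQA Paper Reading] Open-Ended Multi-Modal Relational Reason for Video Question Answering
Open-Ended Multi-Modal Relational Reason for Video Question Answering. Abstract: Visually impaired people urgently need help not only with basic tasks such as guidance and object retrieval, but also with advanced tasks such as depicting new environments. More than a guide dog, they may need a device that can provide linguistic interaction. On this basis, we study the interaction between a robot agent and visually impaired people. In our research, we will develop a robot agent that can analyze the test environment and answer participants' questions. In this paper, we discuss the problems that arise in human-robot interaction and identify the relevant factors.
Original · 2021-02-14 21:46:37 · 1516 views · 0 comments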
[VQA Paper Reading] PATHVQA: 30000+ QUESTIONS FOR MEDICAL VISUAL QUESTION ANSWERING
[VQA Paper Reading] PATHVQA: 30000+ QUESTIONS FOR MEDICAL VISUAL QUESTION ANSWERING. Paper link: https://arxiv.org/abs/2003.10286 ABSTRACT: Is it possible to develop an "AI Pathologist" to pass the board-certified examination of the American Board of Pathology? To achieve th
Original · 2021-02-09 11:36:59 · 1693 views · 2 comments
[VQA Paper Reading] VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019
VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019. Paper link: http://ceur-ws.org/Vol-2380/paper_272.pdf Abstract: This paper presents an overview of the Medical Visual Question Answering task (VQA-Med) at ImageCLEF 2019. Particip
Original · 2021-02-09 11:03:28 · 1374 views · 0 comments
[VQA Paper Reading] FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding
Background. Paper title: "FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding". Download: https://arxiv.org/pdf/2012.02951.pdf Abstract: Visual scene understanding is the core task in making any crucial decision in any computer vision system. Al-
Original · 2021-02-07 18:51:24 · 2048 views · 4 comments