Semantic-aware Scene Recognition: Reading Notes

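These notes cover a paper published in Pattern Recognition in 2020 that addresses semantic ambiguity in scene classification. The paper proposes an end-to-end multi-modal CNN architecture that combines image and semantic segmentation information, using an attention module to reinforce the learned scene content and improve scene recognition accuracy. It reports state-of-the-art results on the MIT Indoor 67, SUN 397, and Places365 datasets.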

The paper was published in Pattern Recognition in 2020; its authors are researchers from Spain.

Abstract:

The main problem in scene classification
Semantic ambiguity: images of several scene classes may share similar objects, which causes confusion among them. The problem is aggravated when images of a particular scene class are notably different.

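  • Different scene classes share similar objects, which makes them semantically ambiguous.
  • The greater the variation among images of the same class, the more pronounced this problem becomes.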

Main contributions of the paper
An end-to-end multi-modal CNN is proposed, which combines image and context information by an attention module.
Context information, in the shape of a semantic segmentation, is used to gate RGB-features by leveraging on information encoded in the semantic representation: the set of scene objects and stuff, and their relative locations.
This gating process reinforces the learning of indicative scene content and enhances scene disambiguation by refocusing the receptive fields of the CNN towards them.

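  • Image and context information are combined through an attention module.
  • Context information gates the RGB features by exploiting what is encoded in the semantic representation (the set of scene objects and their relative locations).
  • By refocusing the CNN's receptive fields on more discriminative regions, this gating process reinforces the learning of indicative scene content and strengthens scene disambiguation.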
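
The gating idea in the points above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that gating means an element-wise product between the RGB features and a sigmoid of the semantic features, and that both feature maps already have the same shape. The excerpts quoted in these notes do not state the exact form. The names `gate_rgb_features`, `rgb_feats`, and `sem_feats` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_rgb_features(rgb_feats, sem_feats):
    """Gate RGB features with an attention map built from semantic features.

    Assumption (not taken from the paper): gated = rgb * sigmoid(sem),
    applied element-wise. Both inputs are nested lists indexed as
    [location][channel] and must have identical shapes.
    """
    if len(rgb_feats) != len(sem_feats):
        raise ValueError("feature maps must cover the same locations")
    gated = []
    for rgb_vec, sem_vec in zip(rgb_feats, sem_feats):
        if len(rgb_vec) != len(sem_vec):
            raise ValueError("feature maps must have the same channel count")
        # A strongly positive semantic response keeps the RGB activation,
        # a strongly negative one suppresses it.
        gated.append([r * sigmoid(s) for r, s in zip(rgb_vec, sem_vec)])
    return gated


# Toy input: two spatial locations, two channels each.
rgb = [[1.0, 2.0], [3.0, 4.0]]
sem = [[10.0, -10.0], [0.0, 0.0]]
gated = gate_rgb_features(rgb, sem)
```

In this toy input, the first channel at the first location passes through almost unchanged, the second channel there is driven close to zero, and the second location (semantic response 0) is scaled by one half. A real model would learn the semantic features with convolutional layers and train both branches end to end.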

Problem, Challenge, Motivation, and Contribution

Problem: The complexity of the scene recognition task lies partially on the ambiguity between different scene categories showing similar appearances and objects’ distributions: inter-scene boundaries can be blurry, as the set
