Weakly supervised paper reading: "Object Instance Mining for Weakly Supervised Object Detection"

Paper 1: "Object Instance Mining for Weakly Supervised Object Detection"

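This series is my set of notes on weakly supervised object detection papers. The main body consists of excerpts of the key points of each paper.
Preface: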

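This paper targets one scenario: an image contains many objects of the same category. Methods based on multiple instance learning, such as WSDDN, usually give this scenario little attention when modeling. This paper proposes the OIM method. From the assumption that "a high-confidence proposal and the highly overlapped proposals around it probably belong to the same class", it builds a spatial graph (SG); from the assumption that "objects of the same class look similar", it builds an appearance graph (AG); and to make the model focus less on the most discriminative local regions, it introduces an instance reweighted loss (IR). The paper unifies SG, AG and IR in a single OIM algorithm for training, and obtains better results than the original WSDDN method.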

Abstract:
This paper was accepted by AAAI 2020. It focuses on an important case: object instances are missed when an image contains many instances of one category. The model has three contributions: the spatial graph, the appearance graph and the instance reweighted loss.

Introduction:
Most previous approaches follow the framework of combining multiple instance learning with CNN, which usually mines the most confident class-specific object proposals for learning CNN-based classifier, regardless of the number of object instances appearing in an image.
Many images in the challenging VOC datasets contain more than one object instance from the same class. For example, the trainval set in VOC 2007 has 7913 image-level labels but contains 15662 annotated object instances. If only one instance is mined per image-level label, at least 15662 - 7913 = 7749 instances are not selected during training.
To mine more instances, this paper proposes a method called OIM (object instance mining framework). It is based on two fundamental assumptions:
A: the highest-confidence proposal and its surrounding highly overlapped proposals probably belong to the same class (used to build the spatial graph).
B: objects from the same class should have high appearance similarity (used to build the appearance graph).
The key contributions can be summarized as follows:
An object instance mining approach using spatial and appearance graphs is developed to mine all possible object instances with only image-level annotation, and it can significantly improve the discriminative capability of the trained CNN classifier.
An object instance reweighted loss, which adjusts the loss weights of different instances, is proposed to learn a more accurate CNN classifier.
Method:
[image]

(1) The spatial graph
Given an input image $I$ with class label $c$, a set of region proposals $P = \{p_1, p_2, \dots, p_N\}$ and their corresponding confidence scores $X = \{x_1, x_2, \dots, x_N\}$, the core instance (proposal) $p_{i_c}$ with the highest confidence score $x_{i_c}$ can be selected. Here $i_c$ denotes the index of this core instance.
The core spatial graph is defined as $G^s_{i_c} = (V^s_{i_c}, E^s_{i_c})$, where each node represents a selected proposal that has high spatial similarity with the core instance, and each edge represents that spatial similarity.
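As a rough illustration of this step, here is a minimal sketch of picking the core instance and collecting its spatial-graph nodes. It assumes that "spatial similarity" is measured by IoU with the core instance, which matches the "highly overlapped" wording of assumption A; the function names and the threshold value `iou_threshold=0.5` are hypothetical and not taken from the paper.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def build_spatial_graph(boxes, scores, iou_threshold=0.5):
    """Return the core index i_c and the node indices of its spatial graph.

    ASSUMPTION: a proposal joins the graph when its IoU with the core
    instance exceeds `iou_threshold` (hypothetical value).
    """
    core = int(np.argmax(scores))  # i_c: the highest-confidence proposal
    overlaps = iou_one_to_many(boxes[core], boxes)
    nodes = np.flatnonzero(overlaps > iou_threshold)  # includes the core itself
    return core, nodes


# Toy example: proposals 0 and 1 overlap heavily, proposal 2 is far away.
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], dtype=float)
scores = np.array([0.9, 0.6, 0.3])
core, nodes = build_spatial_graph(boxes, scores)
print(core, nodes.tolist())
```

In the toy example the core instance is proposal 0, and only proposals 0 and 1 end up in the graph, since proposal 2 does not overlap the core at all.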
(2) The appearance graph
We define the feature vectors of the proposals as $F = \{f_1, f_2, \dots, f_N\}$; they can be generated from the fully connected layer. The appearance graph is then defined as $G^a = (V^a, E^a)$, where each node is a selected proposal that has high appearance similarity with the core instance, and each edge represents that appearance similarity.

The similarity between the core instance and the other proposals can be calculated as follows:
[image]
Here $F$ denotes the proposal features defined above.
A proposal $p_j$ is added to the nodes of $G^a$ only when $D_{i_c,j} < \alpha D_{avg}$ and $p_j$ does not overlap any of the previously selected proposals.
$D_{avg}$ represents the average inter-class similarity of the core spatial graph:
[image]
[image]
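The selection rule above can be sketched as follows. Since the formula images are not available in these notes, the sketch makes several assumptions that are not confirmed by the source: $D$ is taken to be the Euclidean distance between feature vectors, $D_{avg}$ is taken to be the mean of $D$ from the core instance to all nodes of its spatial graph, candidates are visited in descending score order, and "previously selected" includes the spatial-graph nodes. All function names and the value of `alpha` are hypothetical.

```python
import numpy as np


def overlaps_any(box, selected_boxes):
    """True if `box` has a non-empty intersection with any box in `selected_boxes`."""
    for other in selected_boxes:
        w = min(box[2], other[2]) - max(box[0], other[0])
        h = min(box[3], other[3]) - max(box[1], other[1])
        if w > 0 and h > 0:
            return True
    return False


def build_appearance_graph(boxes, scores, features, core, spatial_nodes, alpha=1.5):
    """Return indices of proposals added to the appearance graph of `core`."""
    # ASSUMPTION: D_{i_c, j} is the Euclidean distance between feature vectors.
    dist = np.linalg.norm(features - features[core], axis=1)
    # ASSUMPTION: D_avg is the mean distance over all spatial-graph nodes.
    d_avg = dist[spatial_nodes].mean()

    spatial_set = {int(k) for k in spatial_nodes}
    selected_boxes = [boxes[k] for k in spatial_set]  # previously selected proposals
    appearance_nodes = []
    for j in np.argsort(-scores):  # ASSUMPTION: visit candidates by descending score
        j = int(j)
        if j in spatial_set:
            continue
        if dist[j] < alpha * d_avg and not overlaps_any(boxes[j], selected_boxes):
            appearance_nodes.append(j)
            selected_boxes.append(boxes[j])
    return appearance_nodes


boxes = np.array([
    [10, 10, 50, 50],      # core instance
    [12, 12, 52, 52],      # spatial-graph neighbour of the core
    [100, 100, 140, 140],  # similar appearance, no overlap
    [200, 200, 240, 240],  # different appearance
    [105, 105, 145, 145],  # similar appearance, but overlaps proposal 2
], dtype=float)
scores = np.array([0.9, 0.6, 0.5, 0.4, 0.3])
features = np.array([[1.0, 0.0], [0.8, 0.2], [0.9, 0.1], [0.0, 1.0], [0.95, 0.05]])
result = build_appearance_graph(boxes, scores, features, core=0, spatial_nodes=[0, 1])
print(result)
```

In this toy example only proposal 2 is added: proposal 3 is too far from the core in feature space, and proposal 4 is rejected because it overlaps the already selected proposal 2.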
(3) The algorithm of object instance mining
Warning: pay attention to where the algorithm is used.
[image]

[image]
(4) Instance reweighted loss
[image]
The loss guides the network to pay more attention to learning the less discriminative regions of the object instance in each graph $G^s$.

Experiment:
VOC 2007 and VOC 2012 are used to train and test the model, and the evaluation metrics are mAP and CorLoc.
VGG16 is used as the backbone.
NMS with an IoU threshold of 0.3 per class is performed to calculate mAP and CorLoc. (All the papers use the same setting!)
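For reference, per-class NMS at an IoU threshold of 0.3 can be sketched as below. This is the standard greedy NMS written as a small NumPy example, not the paper's own evaluation code; the function names and the toy data are hypothetical.

```python
import numpy as np


def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: keep the best box, drop the rest whose IoU with it exceeds the threshold."""
    order = np.argsort(-scores)  # indices sorted by descending score
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[best] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # survivors go to the next round
    return keep


def nms_per_class(boxes, class_scores, iou_threshold=0.3):
    """Run NMS independently on every class column of an (N, C) score matrix."""
    num_classes = class_scores.shape[1]
    return {c: nms(boxes, class_scores[:, c], iou_threshold) for c in range(num_classes)}


boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], dtype=float)
class_scores = np.array([[0.9, 0.1], [0.6, 0.8], [0.3, 0.7]])
kept = nms_per_class(boxes, class_scores)
print(kept)
```

Running NMS separately for each class means a box suppressed for one class can still survive for another, which is why boxes 0 and 1 are each kept for a different class in the toy example.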
[image]
SOTA: 56.8%
[image]
[image]

SOTA: 53.6%
[image]

Ablation
[image]
Conclusion:
This paper aims to mine more object instances than previous methods in order to solve the "missing instance" problem.
The model has three contributions: the spatial graph, the appearance graph and the instance reweighted loss.
