ICCV 2017 - CoupleNet: Coupling Global Structure with Local Parts for Object Detection

practical_sharp

Published 2021-04-17 17:17:34


Article link: https://blog.csdn.net/practical_sharp/article/details/115797290


CoupleNet: Coupling Global Structure with Local Parts for Object Detection

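These are study notes on CoupleNet, a network that fuses global context with local context for detection.

I have seen the same idea in several recent papers, and it most likely originates from this one.

It is therefore important to understand how CoupleNet performs detection and how it fuses local and global context.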

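The main questions to answer when studying CoupleNet are:

CoupleNet builds on R-FCN, so Faster R-CNN and R-FCN need to be understood first.
How does CoupleNet extract global context information?
How does CoupleNet extract local context information?
How does CoupleNet fuse local and global context to improve detection?
What shortcomings does CoupleNet still have?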

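After finishing my experiments I plan to study the CoupleNet code, so that I understand it at the implementation level.

First, the questions above are addressed from the paper's point of view.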

Global FCN

[Figure: image missing from the source]

For the global FCN, we aim to describe the object by using the whole region-level features.

Firstly, we attach a 1024-d 1x1 convolutional layer after the last convolutional block in ResNet-101 for reducing the dimension. Due to the diverse size of the object, we insert a RoI pooling layer in [8] to extract a fixed-length feature vector as the global structure description of the object.
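To make the "fixed-length feature vector" point concrete, here is a minimal pure-Python sketch of RoI max pooling on a single channel. It is an illustration only, not the CoupleNet code: the function name `roi_max_pool`, the inclusive box convention and the floor/ceil bin boundaries are assumptions made for this example.

```python
import math

def roi_max_pool(feature, box, k):
    """Max-pool the region `box` of one feature channel into a k x k grid.

    feature: list of rows (H x W) for a single channel.
    box: (x1, y1, x2, y2) in feature-map coordinates, inclusive (assumed convention).
    """
    x1, y1, x2, y2 = box
    roi_w = x2 - x1 + 1
    roi_h = y2 - y1 + 1
    pooled = []
    for i in range(k):
        # Floor for the start and ceil for the end, so a bin is never empty.
        r0 = y1 + math.floor(i * roi_h / k)
        r1 = y1 + math.ceil((i + 1) * roi_h / k)
        row = []
        for j in range(k):
            c0 = x1 + math.floor(j * roi_w / k)
            c1 = x1 + math.ceil((j + 1) * roi_w / k)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled

# Toy 4 x 6 feature map whose value encodes its position (row * 10 + column).
feature = [[r * 10 + c for c in range(6)] for r in range(4)]
small = roi_max_pool(feature, (0, 0, 3, 3), k=2)   # 4 x 4 region
wide = roi_max_pool(feature, (1, 0, 5, 3), k=2)    # 5 x 4 region
```

Both regions come out as 2 x 2 grids even though their sizes differ, which is what lets proposals of any size feed the same later layers.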

How is the global structure description of the object obtained?
The paper describes it as follows:

To enhance the feature representation ability of the global FCN, 
here we introduce the contextual information as an effective supplement. 
Specifically, we extend the context region by 2 times larger than the size of original proposal. 
Then the features RoI pooled from the original region and context region are concatenated together and fed into the latter RoI-wise subnetwork
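The context region in the quoted passage can be sketched as follows. This reads "2 times larger" as doubling the width and height around the proposal's center, which is an interpretation. The helper name `context_box` and the optional clipping to the image are likewise assumptions for this example, not details taken from the paper.

```python
def context_box(box, scale=2.0, image_size=None):
    """Enlarge `box` by `scale` around its center.

    box: (x1, y1, x2, y2).
    image_size: optional (width, height); if given, the result is clipped
    to the image (clipping is an assumption of this sketch).
    """
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    ctx = [cx - half_w, cy - half_h, cx + half_w, cy + half_h]
    if image_size is not None:
        width, height = image_size
        ctx = [max(0.0, ctx[0]), max(0.0, ctx[1]),
               min(width - 1.0, ctx[2]), min(height - 1.0, ctx[3])]
    return tuple(ctx)

proposal = (10, 20, 30, 60)
unclipped = context_box(proposal)
clipped = context_box(proposal, image_size=(35, 70))
```

The original box and the context box would then each go through RoI pooling, and the two pooled feature maps are concatenated.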

Secondly, we use two convolutional layers with kernel size k × k and 1 × 1 respectively (k is set to the default value 7) to further abstract the global representation of RoI. Finally, the output of 1x1 convolution is fed into the classifier whose output is also a (C + 1)-d vector.

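In summary:
The upper RoI pooling pools the original region proposal into a k×k feature map. It keeps the 1024 channels of the preceding 1×1 convolution; the (C+1)-d vector only appears at the classifier output.
Here k means the region is divided into k×k cells in the plane, C is the number of classes, and the extra 1 stands for the background class.

The lower RoI pooling pools the context region, i.e. the region proposal enlarged to 2 times its original size, into a feature map of the same shape.

The two RoI-pooled feature maps are then concatenated along the channel dimension

and passed through a k×k convolution followed by a 1×1 convolution.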
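The global branch can be traced as a sequence of tensor shapes, following the quoted paper text (1024-d 1×1 convolution, RoI pooling, concatenation, k×k and 1×1 convolutions, (C+1)-d classifier). This is only a bookkeeping sketch: `global_branch_shapes` is a name made up for this example, the `hidden` width of the two convolutions is a hypothetical parameter because the text does not give it, and the k×k convolution is assumed to use no padding.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1

def global_branch_shapes(num_classes, k=7, reduced=1024, hidden=1024):
    """Return (stage, shape) pairs for one RoI; shapes are (channels, h, w).

    `hidden` is hypothetical: the text does not state the width of the
    k x k and 1 x 1 convolutions.
    """
    stages = [
        ("RoI pool, original region", (reduced, k, k)),
        ("RoI pool, context region", (reduced, k, k)),
        ("concat along channels", (2 * reduced, k, k)),
    ]
    s = conv_out(k, k)  # assumed no padding, so k x k collapses to 1 x 1
    stages.append(("k x k conv", (hidden, s, s)))
    stages.append(("1 x 1 conv", (hidden, conv_out(s, 1), conv_out(s, 1))))
    stages.append(("classifier", (num_classes + 1,)))
    return stages

# Example: 20 object classes (a PASCAL VOC-sized label set) plus background.
for name, shape in global_branch_shapes(num_classes=20):
    print(f"{name:28s} {shape}")
```

The last line of the trace is the (C+1)-d vector that has to match the local branch output.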

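Local FCN

This part is the same as R-FCN.

A convolutional layer produces k×k×(C+1) position-sensitive score maps; position-sensitive RoI pooling followed by voting over the k×k bins reduces them to C+1 channels,
which matches the dimension of the output of the global branch.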
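The R-FCN style pooling and voting can be sketched in pure Python as below. Each of the k×k bins reads from its own dedicated score map, and the bin responses are averaged to give one score per class. The function name `ps_roi_pool_vote`, the class-major channel layout and the inclusive box convention are assumptions of this sketch, not details confirmed by the post.

```python
import math

def ps_roi_pool_vote(score_maps, box, k, num_classes):
    """Position-sensitive RoI pooling followed by average voting.

    score_maps: list of k*k*(C+1) two-dimensional maps (lists of rows).
    box: (x1, y1, x2, y2) in feature-map coordinates, inclusive.
    Returns a list of C+1 scores.
    """
    x1, y1, x2, y2 = box
    roi_w = x2 - x1 + 1
    roi_h = y2 - y1 + 1
    scores = []
    for c in range(num_classes + 1):
        bin_sum = 0.0
        for i in range(k):
            r0 = y1 + math.floor(i * roi_h / k)
            r1 = y1 + math.ceil((i + 1) * roi_h / k)
            for j in range(k):
                c0 = x1 + math.floor(j * roi_w / k)
                c1 = x1 + math.ceil((j + 1) * roi_w / k)
                # Assumed layout: class-major, then bin row, then bin column.
                fmap = score_maps[(c * k + i) * k + j]
                cells = [fmap[r][col]
                         for r in range(r0, r1)
                         for col in range(c0, c1)]
                bin_sum += sum(cells) / len(cells)  # average pooling in the bin
        scores.append(bin_sum / (k * k))            # vote: mean over the k*k bins
    return scores

# Toy input: k = 2, one object class plus background, so 2*2*2 = 8 score maps.
# Every map is constant and equal to its own channel index.
maps = [[[float(ch)] * 4 for _ in range(4)] for ch in range(8)]
voted = ps_roi_pool_vote(maps, (0, 0, 3, 3), k=2, num_classes=1)
```

With these toy maps, class 0 averages channels 0 to 3 and class 1 averages channels 4 to 7, so the result has exactly C+1 = 2 entries.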

How are local and global context fused?

[Figure: image missing from the source]
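Since the post shows the fusion only as a figure, the following is a hedged sketch of one plausible coupling, not a confirmed description of the method. It assumes that the (C+1)-d outputs of the two branches are each L2-normalized and rescaled, then combined by element-wise sum and passed through softmax. The names `l2_normalize`, `couple` and `softmax`, and the `scale` parameter, are all introduced here for illustration.

```python
import math

def l2_normalize(vec, scale=1.0, eps=1e-12):
    """Rescale `vec` to L2 norm `scale` (assumed normalization step)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [scale * v / (norm + eps) for v in vec]

def couple(global_scores, local_scores, scale=1.0):
    """Element-wise sum of the two normalized (C+1)-d branch outputs."""
    if len(global_scores) != len(local_scores):
        raise ValueError("both branches must output C+1 scores")
    g = l2_normalize(global_scores, scale)
    p = l2_normalize(local_scores, scale)
    return [a + b for a, b in zip(g, p)]

def softmax(scores):
    """Numerically stable softmax over the coupled scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

fused = couple([3.0, 4.0], [0.0, 5.0])   # roughly [0.6, 1.8]
probs = softmax(fused)
```

Normalizing first keeps one branch from dominating the sum just because its raw scores have a larger magnitude. Other combinations, such as element-wise product or maximum, would fit the same interface.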

What problems does CoupleNet have?

To be discussed later.

Code: to be added once I have read it.
