[paper reading] CenterNet (Triplets)


Harry嗷


Article link: https://blog.csdn.net/qq_41683065/article/details/109548342


GitHub：Notes of Classic Detection Papers

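Update 2020.11.09: added the Use Yourself section, i.e. my own understanding of and thoughts on this paper. For details, see GitHub: Notes of Classic Detection Papers

I originally wanted to put these notes on GitHub, but GitHub does not render formulas.
So they ended up on CSDN, although the formatting here is a bit messy.
I strongly recommend downloading the source files from GitHub and reading them there! That gives the best reading experience.
And of course, if you find this useful, please give the repo a star!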

| topic | motivation | technique | key element | math | use yourself | relativity |
| --- | --- | --- | --- | --- | --- | --- |
| CenterNet (triple) | Problem to Solve, Idea, Intuition | CenterNet Architecture, Center Pooling, Cascade Corner Pooling, Central Region Exploration | Baseline: CornerNet, Generating BBox, Training, Inferencing, Ablation Experiment, Error Analysis, Metric AP & AR & FD, Small & Medium & Large | Central Region, Loss Function | …… | Related Work |


Motivation

Problem to Solve

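The drawback of keypoint-based methods (here this mainly means CornerNet):

Because they never take an additional look at the cropped region, they cannot capture the visual patterns inside the bounding box region, so they produce a large number of incorrect bounding boxes.

① CornerNet produces many incorrect bounding boxes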

Idea

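Represent each object with a keypoint triplet (top-left corner & bottom-right corner & center).

That is, while the top-left and bottom-right corners encode the boundary information, introducing the center lets the model explore the visual pattern of each predicted bounding box (i.e. obtain information about the object's interior).

Concretely, exploring the visual patterns within an object is turned into a keypoint detection problem.

② Checking the Central Region picks out the correct predictions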

Intuition

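The idea partly follows RoI Pooling: through efficient discrimination (the Central Region), a one-stage method gains some of the resampling ability of two-stage methods.

Specifically: if a predicted bounding box has a high IoU with the ground-truth box, then the center keypoint in its Central Region is likely to be predicted as the same class.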

Technique

CenterNet Architecture


Components

[Center Pooling](#Center Pooling)
[Cascade Corner Pooling](#Cascade Corner Pooling)
[Central Region Exploration](#Central Region Exploration)

Improvement

AP Improvement

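AP improves for small, medium and large objects, and most of the gain comes from small objects.

Reason: center information. The smaller an incorrect bounding box is, the less likely it is that a center keypoint is detected in its Central Region.

small object

medium & large object

AR Improvement

Reason: filtering out incorrect bounding boxes is equivalent to raising the confidence of bounding boxes that are accurately located but have lower scores.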

Center Pooling

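Both Cascade Corner Pooling and Center Pooling can be implemented by combining Corner Pooling in different directions.

Why

The geometric center of an object does not necessarily carry a recognizable visual pattern.

Purpose

Better detection of center keypoints!

Specifically, it gives the Central Region recognizable visual patterns, so that the model can perceive information at the center of a proposal and use it to check whether the bounding box is correct.

Steps

For the feature map fed into Center Pooling, take the maximum summed response in the horizontal and vertical directions:

1. The backbone outputs a feature map.
2. For each location, find the maximum value in the horizontal direction and in the vertical direction.
3. Add the two maxima together.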
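The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration on a single-channel feature map stored as a list of lists; the function name `center_pooling` is made up for this note, and the convolution layers that surround the pooling in the real network are left out.

```python
def center_pooling(fmap):
    """For every location, return (max of its row) + (max of its column).

    `fmap` is a single-channel feature map given as a list of equal-length rows.
    Hypothetical helper for illustration, not the authors' implementation.
    """
    row_max = [max(row) for row in fmap]        # horizontal maximum of each row
    col_max = [max(col) for col in zip(*fmap)]  # vertical maximum of each column
    return [
        [row_max[i] + col_max[j] for j in range(len(fmap[0]))]
        for i in range(len(fmap))
    ]


fmap = [
    [1, 0, 2],
    [0, 5, 1],
    [3, 1, 0],
]
pooled = center_pooling(fmap)
print(pooled)  # [[5, 7, 4], [8, 10, 7], [6, 8, 5]]
```

The location (1, 1) gets the largest response because it sits on both the strongest row and the strongest column, which is the behaviour we want from a center keypoint detector.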

Cascade Corner Pooling

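Both Cascade Corner Pooling and Center Pooling can be implemented by combining Corner Pooling in different directions.

Why

Corners usually lie outside the object, so they lack local appearance features.

Purpose

Better detection of corners!

Specifically, it enriches the information collected by the top-left and bottom-right corners, so that they perceive boundary and internal information at the same time.

Steps

Along the boundary direction and the internal direction of the input feature map, take the maximum summed response (pooling in two directions is more stable and more robust, which improves precision and recall):

1. Look along the boundary direction to find the boundary max.
2. From the location of the boundary max, look along the internal direction to find the internal max.
3. Add the two maxima together (the sum is written at the corner location).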
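As a rough sketch of the steps above, here is the cascade pooling for the top boundary of a top-left corner, evaluated at a single location. The scan directions (rightward along the boundary, then downward into the object) and the name `cascade_top_pool` are my own assumptions for illustration; the official implementation works on whole feature maps and mixes in convolutions.

```python
def cascade_top_pool(fmap, i, j):
    """Cascade pooling for the top boundary, evaluated at location (i, j).

    Assumption: the boundary direction is "rightward along row i" and the
    internal direction is "downward from the boundary maximum".
    """
    # 1) Boundary direction: scan row i from column j to the right.
    boundary = fmap[i][j:]
    boundary_max = max(boundary)
    col = j + boundary.index(boundary_max)  # column of the boundary maximum

    # 2) Internal direction: from the boundary maximum, scan downward.
    internal_max = max(fmap[r][col] for r in range(i, len(fmap)))

    # 3) Sum of the two maxima, assigned to the corner location (i, j).
    return boundary_max + internal_max


fmap = [
    [0, 1, 4, 2],
    [0, 0, 1, 0],
    [0, 3, 6, 0],
]
print(cascade_top_pool(fmap, 0, 0))  # 10: boundary max 4 at column 2, internal max 6
```

Plain corner pooling would stop after step 1; the extra downward look is what lets the corner pick up evidence from inside the object.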

Central Region Exploration

Scale-Aware Central Region

Reason:

The recall vs. precision trade-off.

Choice of Central Region:

Generate Central Regions of different relative sizes for bounding boxes of different sizes:
- small bounding box ==> relatively large central region
  
  Reason: a small central region leads to low recall for small bounding boxes
- large bounding box ==> relatively small central region
  
  Reason: a large central region leads to low precision for large bounding boxes

Two kinds of Central Region are used in the experiments.

Which one is used depends on the scale of the bounding box:
- $< 150$ : n = 3 (left)
- $> 150$ : n = 5 (right)
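To make the rule concrete, here is a small sketch that picks n from the box scale and then computes the Central Region. The region formula is the one I recall from the CenterNet paper (the region spans the middle 1/n of the box in each direction). Treating "scale" as the square root of the box area, and sending a scale of exactly 150 to n = 5, are assumptions of mine; the function name is hypothetical.

```python
def central_region(tlx, tly, brx, bry, scale_threshold=150.0):
    """Return ((ctlx, ctly, cbrx, cbry), n) for a box given by its two corners.

    Assumption: scale = sqrt(width * height); the notes only say "scale".
    """
    scale = ((brx - tlx) * (bry - tly)) ** 0.5
    n = 3 if scale < scale_threshold else 5

    # Central region: middle 1/n of the box along each axis.
    ctlx = ((n + 1) * tlx + (n - 1) * brx) / (2 * n)
    ctly = ((n + 1) * tly + (n - 1) * bry) / (2 * n)
    cbrx = ((n - 1) * tlx + (n + 1) * brx) / (2 * n)
    cbry = ((n - 1) * tly + (n + 1) * bry) / (2 * n)
    return (ctlx, ctly, cbrx, cbry), n


print(central_region(0, 0, 60, 60))    # ((20.0, 20.0, 40.0, 40.0), 3)
print(central_region(0, 0, 300, 300))  # ((120.0, 120.0, 180.0, 180.0), 5)
```

The small box keeps the middle third as its Central Region, while the large box keeps only the middle fifth, which matches the "small box, relatively large region" rule.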

Exploration

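A bounding box is kept only if both conditions hold:

1. A center keypoint falls inside its Central Region.
2. The center keypoint and the bounding box have the same class.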

Key Element

Baseline：CornerNet

Three outputs

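heatmap:
- top-left corner
- bottom-right corner

Each heatmap contains two parts:
1. the locations of keypoints of the different categories
2. the confidence score of each keypoint

embedding:

Used to group the corners.

offset:

Used to remap the corners from the heatmap to the input image.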

Generate BBox

1. Take the top-100 top-left corners and the top-100 bottom-right corners.
2. Group the corners by embedding distance (embedding distance < $\text{Threshold}$).
3. Compute the confidence score of each bounding box (the average of the two corner scores).

Drawbacks

CornerNet has a very high False Discovery Rate (FD), i.e. it produces a large number of incorrect bounding boxes.

For the meaning of AP & FD, see [Metric AP & AR & FD](#Metric AP & AR & FD)

Generating BBox

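1. Select the top-k center keypoints.
2. Remap the center keypoints to the input image (using the offsets).
3. Define a Central Region in each bounding box.
4. Keep the bounding boxes that satisfy both requirements:
   - a center keypoint falls inside the Central Region
   - the center keypoint and the bounding box have the same class
5. Compute the score of each bounding box.

   The score is the average of the top-left corner, bottom-right corner and center scores.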
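A minimal sketch of the filtering and scoring steps, under a few assumptions of mine: the candidate boxes already carry their Central Region in a `region` field, the center keypoints have already been remapped to input-image coordinates, and when several centers match a box the highest-scoring one is used. The dict layout and the function name are hypothetical.

```python
def filter_by_center(boxes, centers, k=70):
    """Keep boxes whose Central Region contains a same-class center keypoint."""
    top_centers = sorted(centers, key=lambda c: c["score"], reverse=True)[:k]

    kept = []
    for b in boxes:
        x0, y0, x1, y1 = b["region"]
        matches = [
            c for c in top_centers
            if c["cls"] == b["cls"] and x0 <= c["x"] <= x1 and y0 <= c["y"] <= y1
        ]
        if not matches:
            continue  # no supporting center keypoint: drop the box
        best = max(matches, key=lambda c: c["score"])  # assumption: use the best match
        score = (b["tl_score"] + b["br_score"] + best["score"]) / 3
        kept.append({"box": b["box"], "cls": b["cls"], "score": score})
    return kept


boxes = [
    {"box": (0, 0, 60, 60), "region": (20, 20, 40, 40),
     "cls": 1, "tl_score": 0.5, "br_score": 0.75},
    {"box": (100, 100, 160, 160), "region": (120, 120, 140, 140),
     "cls": 1, "tl_score": 0.5, "br_score": 0.5},
]
centers = [
    {"x": 30, "y": 30, "cls": 1, "score": 1.0},    # supports the first box
    {"x": 130, "y": 130, "cls": 2, "score": 0.9},  # wrong class for the second box
]
kept = filter_by_center(boxes, centers)
```

The second box is discarded even though a center keypoint lies inside its Central Region, because the classes differ.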

Training

Input & Output Size

input size：511×511
output size：128×128

Data Augmentation

Same as CornerNet.

Inferencing

Single-Scale Testing

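Feed the original image and its flipped version into the network at the original resolution.

Multi-Scale Testing

Feed the original image and its flipped version into the network at resolutions of $[0.6, 1.0, 1.2, 1.5, 1.8]$ .

Steps

1. Take the top 70 center keypoints, top-left corners and bottom-right corners from the heatmaps and use the resulting triplets to determine the bounding boxes.

   For details, see [Generating BBox](#Generating BBox)
2. Flip the bounding boxes detected on the flipped image back and merge them with those from the original image.
3. Post-Processing: Soft-NMS
4. Keep the top-100 bounding boxes.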
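For reference, here is a small sketch of the last two steps. It uses the Gaussian variant of Soft-NMS with `sigma=0.5` and a score floor of `0.001`; the notes only say "Soft-NMS", so the variant and both values are assumptions, and the function names are mine.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(dets, sigma=0.5, score_min=0.001, top_k=100):
    """Gaussian Soft-NMS over a list of (box, score) pairs."""
    remaining = list(dets)
    kept = []
    while remaining:
        # Pick the highest-scoring remaining detection.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        # Decay the scores of the others instead of deleting them outright.
        decayed = []
        for other_box, other_score in remaining:
            other_score *= math.exp(-(iou(box, other_box) ** 2) / sigma)
            if other_score >= score_min:
                decayed.append((other_box, other_score))
        remaining = decayed
    return kept[:top_k]  # kept is already sorted by descending score


dets = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # heavy overlap with the first box
    ((50, 50, 60, 60), 0.7),  # no overlap
]
result = soft_nms(dets)
```

The overlapping box is not removed; its score is pushed down so that it ends up ranked below the non-overlapping one.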

Ablation Experiment


Incorrect Bounding Box Reduction


Inference Speed

The cost of exploring visual patterns is small.

A CenterNet variant can outperform a CornerNet variant in both accuracy and speed.

Center Pooling Ablation

Conclusion:

Center Pooling considerably improves the AP of large objects.

Reason:
- Center Pooling extracts richer internal visual patterns
- larger objects contain more internal visual patterns

Cascade Corner Pooling Ablation

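Conclusion:
- Because large objects have rich internal visual patterns, Cascade Corner Pooling lets the corners see more of the object
- Overly rich internal visual patterns can reduce the sensitivity to boundaries and lead to inaccurate bounding boxes
  - These incorrect bounding boxes can be suppressed with Center Pooling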

Central Region Exploration Ablation

Conclusion:

Overall AP improves, with the largest gain on small objects.

Reason:

The center keypoints of small objects are easier to locate.

Error Analysis

Exploration of visual patterns relies on center keypoints ==> if a center keypoint is missed, CenterNet loses the visual pattern of that bounding box.
Center keypoint detection still has a lot of room for improvement.

Metric AP & AR & FD

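AP: Average Precision Rate

Computed over all categories at 10 IoU thresholds (i.e. $0.5 : 0.05 : 0.95$ ).

It reflects how many high-quality bounding boxes (typically IoU $\ge0.5$ ) the network can predict.

It is the most important metric on the MS-COCO dataset.

AR: Maximum Recall Rate

Computed with a fixed number of detections per image, averaged over all categories and the 10 IoU thresholds.

FD: False Discovery Rate

Reflects the proportion of incorrect bounding boxes.
$\text{FD} = 1-\text{AP}$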

Small & Medium & Large

small object: $\text{area}<32^2$
medium object: $32^2<\text{area}<96^2$
large object: $\text{area}>96^2$
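The three ranges translate directly into a tiny helper. The name `size_bucket` is made up for this note, and since the inequalities above are strict, where to put an area of exactly $32^2$ or $96^2$ is my own choice here.

```python
def size_bucket(area):
    """Map a box area in pixels to 'small', 'medium' or 'large'."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:   # assumption: area == 32**2 counts as medium
        return "medium"
    return "large"       # assumption: area == 96**2 counts as large


print(size_bucket(20 * 20), size_bucket(50 * 50), size_bucket(100 * 100))
# small medium large
```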

Math

Central Region

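Since the formula image did not survive here, the definition below is written out from the CenterNet paper as I remember it, so treat it as a reconstruction. The symbols are my naming: $(\text{tl}_x, \text{tl}_y)$ and $(\text{br}_x, \text{br}_y)$ are the top-left and bottom-right corners of the bounding box, $(\text{ctl}_x, \text{ctl}_y)$ and $(\text{cbr}_x, \text{cbr}_y)$ are the corresponding corners of the Central Region, and $n$ is the odd number that controls the region size (3 or 5 in the experiments).

$$
\begin{cases}
\text{ctl}_x = \dfrac{(n+1)\,\text{tl}_x + (n-1)\,\text{br}_x}{2n} \\[2ex]
\text{ctl}_y = \dfrac{(n+1)\,\text{tl}_y + (n-1)\,\text{br}_y}{2n} \\[2ex]
\text{cbr}_x = \dfrac{(n-1)\,\text{tl}_x + (n+1)\,\text{br}_x}{2n} \\[2ex]
\text{cbr}_y = \dfrac{(n-1)\,\text{tl}_y + (n+1)\,\text{br}_y}{2n}
\end{cases}
$$

For example, with $n = 3$ the Central Region width is $\text{cbr}_x - \text{ctl}_x = (\text{br}_x - \text{tl}_x)/3$, i.e. the middle third of the box; in general it is the middle $1/n$ along each axis.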

Loss Function

It consists of:

Detection Loss
- Corner Detection Loss $\text{L}_{\text{det}}^{\text{co}}$
- Center Detection Loss $\text{L}_{\text{det}}^{\text{ce}}$

Pull & Push Loss

Applied to corners only.
- Pull Loss $\text{L}_{\text{pull}}^{\text{co}}$
- Push Loss $\text{L}_{\text{push}}^{\text{co}}$

Offset Loss
- Corner offset Loss $\text{L}_{\text{off}}^{\text{co}}$
- Center offset Loss $\text{L}_{\text{off}}^{\text{ce}}$
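Putting the terms together, the total loss as I recall it from the CenterNet paper is a weighted sum of the components listed above. The weights $\alpha$, $\beta$ and $\gamma$ do not appear elsewhere in these notes; to my recollection the paper sets them to 0.1, 0.1 and 1.

$$
\text{L} = \text{L}_{\text{det}}^{\text{co}} + \text{L}_{\text{det}}^{\text{ce}} + \alpha\,\text{L}_{\text{pull}}^{\text{co}} + \beta\,\text{L}_{\text{push}}^{\text{co}} + \gamma\left(\text{L}_{\text{off}}^{\text{co}} + \text{L}_{\text{off}}^{\text{ce}}\right)
$$

Note that the center branch only adds a detection term and an offset term; there is no pull or push term for centers because only corners need to be grouped.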