R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN

Silent_Summer

Category: Deep Learning. Tags: R-CNN

Article link: https://blog.csdn.net/cxsydjn/article/details/79342338

The four papers in the R-CNN series are:
1. R-CNN: https://arxiv.org/abs/1311.2524
2. Fast R-CNN: https://arxiv.org/abs/1504.08083
3. Faster R-CNN: https://arxiv.org/abs/1506.01497
4. Mask R-CNN: https://arxiv.org/abs/1703.06870

R-CNN

R-CNN uses Selective Search to generate region proposals, a CNN to extract features from each proposal, SVMs to classify the objects, and linear regression to tighten the proposed bounding boxes. However, training is complex, because the CNN, the SVMs, and the bounding-box regressors are all trained separately. Worse, the four parts also run as separate stages, which limits processing speed and makes R-CNN unsuitable for real-time object detection.
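The bounding-box regression step can be made concrete with a small sketch. It assumes the usual R-CNN parameterization, in which the regressor predicts a center shift scaled by the proposal size and a log-space change of width and height. The function names `encode_box` and `decode_box` and the example boxes are illustrative, not taken from the paper's code.

```python
import numpy as np

def encode_box(proposal, ground_truth):
    """Regression targets (tx, ty, tw, th) that map a proposal onto a ground-truth box.

    Both boxes are given as (cx, cy, w, h): center coordinates, width, height.
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return np.array([
        (gx - px) / pw,      # horizontal shift, relative to proposal width
        (gy - py) / ph,      # vertical shift, relative to proposal height
        np.log(gw / pw),     # log-scale change in width
        np.log(gh / ph),     # log-scale change in height
    ])

def decode_box(proposal, deltas):
    """Apply predicted deltas to a proposal to obtain the refined box."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = deltas
    return np.array([px + tx * pw, py + ty * ph, pw * np.exp(tw), ph * np.exp(th)])

# Hypothetical proposal and ground-truth box, for illustration only.
proposal = np.array([50.0, 50.0, 20.0, 40.0])
ground_truth = np.array([55.0, 48.0, 30.0, 40.0])

deltas = encode_box(proposal, ground_truth)   # what the regressor is trained to predict
refined = decode_box(proposal, deltas)        # recovers the ground-truth box
```

In training, `deltas` is the regression target. At test time, the predicted deltas are decoded against each proposal to tighten its box.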

Fast R-CNN

Fast R-CNN keeps Selective Search for region proposals and a CNN for feature extraction, but replaces the SVMs with a softmax classifier. It is faster than R-CNN because it does not run the CNN on each proposal separately: the whole image passes through the CNN once to produce a feature map, and the features of each proposal are then taken from that map. Because the subsequent fully connected layers need a fixed-size input while proposals vary in size, Fast R-CNN applies RoI pooling to convert each proposal's features to a fixed size before classification and bounding-box regression. However, it still relies on Selective Search to generate region proposals, which is the computational bottleneck.
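As a rough illustration of RoI pooling, the sketch below max-pools regions of different sizes from one shared feature map into the same fixed output shape. It is a simplified single-channel version; the function name `roi_max_pool`, the bin-splitting rule, and the toy feature map are assumptions for illustration, not the reference implementation.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of a (H, W) feature map to a fixed output size.

    roi is (x1, y1, x2, y2) in feature-map cells; x2 and y2 are exclusive.
    The region is assumed to be non-empty.
    """
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    out_h, out_w = output_size
    # Integer bin boundaries: this rounding is the quantization step of RoI pooling.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r0, c0 = row_edges[i], col_edges[j]
            # Make sure every bin covers at least one cell.
            r1 = max(row_edges[i + 1], r0 + 1)
            c1 = max(col_edges[j + 1], c0 + 1)
            pooled[i, j] = region[r0:r1, c0:c1].max()
    return pooled

# Toy 6x6 feature map computed once for the whole image.
feature_map = np.arange(36).reshape(6, 6)

small = roi_max_pool(feature_map, (1, 1, 5, 4))   # 3x4 region -> 2x2 output
large = roi_max_pool(feature_map, (0, 0, 6, 6))   # 6x6 region -> 2x2 output
```

Both regions end up as 2x2 arrays, so the same fully connected layers can process them, and the feature map itself is computed only once.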

Faster R-CNN

Faster R-CNN consists of two main modules: a Region Proposal Network (RPN) that generates region proposals, and Fast R-CNN, which classifies the proposed regions. Together they form a single end-to-end network. Fast R-CNN is already faster than R-CNN, and Faster R-CNN is more efficient still, for the following reasons. The RPN generates region proposals on the feature map instead of the original image by sliding a small window over it, and at each window center it proposes 9 candidate regions from 3 scales and 3 aspect ratios. The RPN is itself trained for classification and bounding-box regression, and it shares its convolutional layers with Fast R-CNN.

What I find most interesting is the anchor mechanism. It looks like the reverse of Spatial Pyramid Pooling (SPP). SPP maps inputs of different sizes to a single fixed-size output. Anchors go the other way: from a single-sized sliding window on a single-scale feature map, they yield candidate regions with multiple scales and aspect ratios. All of these candidates (a pyramid of anchors) are then fed into the classification and regression tasks.
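The 9 anchors per window center can be sketched as follows. The concrete values (anchor areas of 128², 256², and 512² pixels, aspect ratios 1:2, 1:1, and 2:1, and a feature stride of 16) follow the defaults reported in the Faster R-CNN paper. The function names and the toy feature-map size are assumptions for illustration.

```python
import numpy as np

def make_anchors(base_size=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return the 9 anchors (x1, y1, x2, y2) centered on the origin."""
    anchors = []
    for ratio in ratios:                 # ratio = height / width
        for scale in scales:
            area = (base_size * scale) ** 2
            w = np.sqrt(area / ratio)    # keep the area fixed while changing the shape
            h = w * ratio
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

def tile_anchors(anchors, feat_h, feat_w, stride=16):
    """Copy the anchor set to every sliding-window center of the feature map."""
    xs = (np.arange(feat_w) + 0.5) * stride     # window centers in image pixels
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    shifts = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (shifts + anchors[None, :, :]).reshape(-1, 4)

anchors = make_anchors()
all_anchors = tile_anchors(anchors, feat_h=3, feat_w=4)   # toy 3x4 feature map
```

A 3x4 feature map therefore yields 3 * 4 * 9 = 108 candidate boxes from a single-sized sliding window, and the RPN scores and regresses each of them.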

Mask R-CNN

Mask R-CNN builds on the Faster R-CNN architecture to detect objects and generate a segmentation mask for each instance at the same time. The key innovations are a mask prediction branch that runs in parallel with classification and bounding-box regression, and the replacement of RoIPool with RoIAlign. On the first point, Mask R-CNN performs classification and mask prediction in parallel, unlike methods in which segmentation precedes recognition. The network predicts one mask per class and uses the classification result to select the output mask. In this sense, Mask R-CNN decouples mask prediction from class label prediction. On the second point, the authors found that the quantization in RoIPool has only a small negative effect on classification but a large one on pixel-level mask prediction. They therefore proposed the RoIAlign layer, which uses bilinear interpolation to keep the extracted features aligned with the input. Finally, Mask R-CNN shows good generality, flexibility, and accuracy across multiple tasks, including instance segmentation, bounding-box object detection, and person keypoint detection.
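To show why quantization matters, the sketch below compares snapping a real-valued sampling point to the grid (as RoIPool effectively does) with bilinear interpolation at the exact location (the basic operation behind RoIAlign). It is a minimal single-point, single-channel illustration: the function name `bilinear_sample` and the toy feature map are assumptions, and a full RoIAlign layer samples several points per bin and aggregates them.

```python
import numpy as np

def bilinear_sample(feature_map, y, x):
    """Interpolate a (H, W) feature map at a real-valued location (y, x)."""
    h, w = feature_map.shape
    # Clamp the point to the valid range of cell coordinates.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Blend the four neighboring cells by their distance to the point.
    top = (1 - dx) * feature_map[y0, x0] + dx * feature_map[y0, x1]
    bottom = (1 - dx) * feature_map[y1, x0] + dx * feature_map[y1, x1]
    return (1 - dy) * top + dy * bottom

# Toy feature map whose value grows smoothly with position.
feature_map = np.arange(16, dtype=float).reshape(4, 4)
y, x = 1.5, 2.25                             # real-valued location inside an RoI

quantized = feature_map[int(y), int(x)]      # snap to the grid: reads cell (1, 2)
aligned = bilinear_sample(feature_map, y, x) # value at the exact location
```

Here the quantized read returns 6.0 while the interpolated value is 8.25. A small shift like this barely matters for a class label but can misplace mask pixels.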

References:

[1] A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN
[2] https://zhuanlan.zhihu.com/p/25954683
[3] https://zhuanlan.zhihu.com/p/26655034
[4] https://zhuanlan.zhihu.com/p/32830206
