【Deep Learning】Review:Rich feature hierarchies for accurate object detection and semantic segmentati

qbt4juik

Category: Machine Learning. Tags: Deep Learning, Computer Vision

Article link: https://blog.csdn.net/qbt4juik/article/details/50577334

Rich feature hierarchies for accurate object detection and semantic segmentation

Source: http://arxiv.org/abs/1311.2524

github: https://github.com/rbgirshick/rcnn

1. Summary of the Paper

Quoted from the paper:

“Object detection system overview. Our system (1) takes an input image, (2) extracts around 2000 bottom-up region proposals, (3) computes features for each proposal using a large convolutional neural network (CNN), and then (4) classifies each region using class-specific linear SVMs. R-CNN achieves a mean average precision (mAP) of 53.7% on PASCAL VOC 2010. For comparison, [39] reports 35.1% mAP using the same region proposals, but with a spatial pyramid and bag-of-visual-words approach. The popular deformable part models perform at 33.4%. On the 200-class ILSVRC2013 detection dataset, R-CNN’s mAP is 31.4%, a large improvement over OverFeat [34], which had the previous best result at 24.3%.”

To sum up, the framework proposed in the paper is very simple but effective:

• Replace sliding-window sampling with selective search, extracting about 2,000 region proposals per image.

• Train k linear SVM classifiers (one per class) to score each region, using the features output by AlexNet.

• Obtain the final detections by discarding overlapping regions with non-maximum suppression (NMS).
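The last step, NMS, can be sketched in a few lines. This is a minimal illustration in plain Python, not the code from the rcnn repository. It follows the greedy rule described in the paper: a region is rejected if it overlaps a higher-scoring selected region by more than a threshold. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the threshold value of 0.3 are assumptions made for this example (the paper uses a learned threshold).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS for one class; returns indices of the kept boxes.

    iou_threshold=0.3 is an assumed value for illustration only.
    """
    # Visit boxes from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes and one separate box, with per-region SVM scores.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap anything and is kept. In R-CNN this is applied independently for each class.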

2. Main Contributions

1) One can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects.

2) When labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Combining region proposals with CNNs delivers outstanding performance.

3. Positive and negative points

Positive Points:

(i) Replace traditional sliding-window methods with a region proposal method.

(ii) Use AlexNet to extract features. Each region is warped to 227×227 pixels together with some surrounding context, so that background information is available as a prior.

(iii) Use bounding-box regression to further improve localization accuracy.

(iv) Replace the softmax classifier with per-class SVMs. The softmax shares a single background class across all categories, whereas each SVM is trained independently for its own class and therefore provides more independent information.
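The bounding-box regression in point (iii) can be illustrated with the box parameterization used in the paper: the offsets of the box center are scaled by the proposal size, and the width and height are regressed in log space. The sketch below only shows how targets are computed from a proposal and a ground-truth box and how predicted offsets are applied. It does not include the regressor itself, which the paper learns on CNN features. The helper names and the `(x1, y1, x2, y2)` box format are assumptions made for this example.

```python
import math


def to_center(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + 0.5 * w, y1 + 0.5 * h, w, h


def regression_targets(proposal, ground_truth):
    """Targets (t_x, t_y, t_w, t_h) that map the proposal P onto the box G."""
    px, py, pw, ph = to_center(proposal)
    gx, gy, gw, gh = to_center(ground_truth)
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def apply_deltas(proposal, deltas):
    """Apply predicted offsets (d_x, d_y, d_w, d_h) to a proposal box."""
    px, py, pw, ph = to_center(proposal)
    dx, dy, dw, dh = deltas
    cx, cy = px + pw * dx, py + ph * dy
    w, h = pw * math.exp(dw), ph * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


# Hypothetical proposal and ground-truth boxes.
proposal = (10.0, 10.0, 50.0, 50.0)
ground_truth = (14.0, 8.0, 62.0, 48.0)
targets = regression_targets(proposal, ground_truth)
recovered = apply_deltas(proposal, targets)
```

Applying the exact targets to the proposal recovers the ground-truth box, which is a quick way to check that the two functions are inverses of each other. Because the targets are normalized by the proposal's width and height, they do not depend on the absolute scale of the box.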

Negative Points:

(i) .

4. How strong is the evaluation

• Performance layer by layer without fine-tuning. That is, training the SVMs on features from pool5, fc6, or fc7 gives very similar results. The conclusion is that most of the CNN's representational power comes from its convolutional layers.

• Comparison to recent feature learning methods shows that CNN features generally perform better than the other methods.

• Compared to the paper by Dean et al. at Google (CVPR best paper): 16% mAP in 5 minutes. Here, 48% in about 1 minute!
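Since every comparison above is expressed in mAP, a small sketch of how average precision (AP) is computed for one class may help. This is an illustrative version that sums the area under the precision-recall curve after making precision monotonically non-increasing. It is an assumption that this variant is the relevant one, since the exact protocol differs between PASCAL VOC editions. The function name and its inputs are hypothetical, and matching detections to ground-truth boxes is assumed to be done already. mAP is then the mean of the per-class AP values.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """AP for one class from scored detections (illustrative variant).

    scores:            confidence of each detection
    is_true_positive:  whether each detection matched a ground-truth box
    num_ground_truth:  number of ground-truth objects of this class
    """
    # Rank detections from the most to the least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Four detections for a class with two ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, False], 2)
```

In this toy case the first and third detections are correct, so recall reaches 0.5 at precision 1 and 1.0 at precision 2/3, giving an AP of 0.5 + 0.5 × 2/3.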

5. Possible direction for future work

I wonder whether it is possible to improve both the speed and the accuracy of object detection without using region proposals. Although region proposals greatly improve efficiency, they inevitably miss some important information. I hope this could be a possible direction.
