Object Detection in 20 Years: A Survey (notes on the survey paper)

Some notes taken while reading this survey of object detection.

I. INTRODUCTION

1、From the application point of view, object detection can be grouped into two research topics: “general object detection” and “detection applications”. The former aims to explore methods of detecting different types of objects under a unified framework that simulates human vision and cognition. The latter refers to detection under specific application scenarios, such as pedestrian detection, face detection, and text detection.

 

2、After years of development, state-of-the-art object detection systems integrate a large number of techniques such as “multiscale detection”, “hard negative mining”, and “bounding box regression”.

3、 The acceleration of object detection has long been a crucial but challenging task.

4、As different detection tasks have totally different objectives and constraints, their difficulties vary from one task to another. In addition to challenges shared with other computer vision tasks, such as objects under different viewpoints, illuminations, and intraclass variations, the challenges in object detection include but are not limited to the following: object rotation and scale changes (e.g., small objects), accurate object localization, dense and occluded object detection, and detection speed-up.

II. OBJECT DETECTION IN 20 YEARS

1、It is widely accepted that, over the past two decades, the progress of object detection has gone through two historical periods: the “traditional object detection period (before 2014)” and the “deep learning based detection period (after 2014)”.

2、Most early object detection algorithms were built on handcrafted features. Because effective image representations were lacking at the time, people had no choice but to design sophisticated feature representations, along with a variety of speed-up techniques to make the most of limited computing resources.

3、Viola Jones Detectors: The VJ detector dramatically improved detection speed by incorporating three important techniques: “integral image”, “feature selection”, and “detection cascades”.
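
To make the “integral image” idea concrete, here is a minimal pure-Python sketch, not the original VJ implementation. The function names (integral_image, box_sum, two_rect_haar) and the tiny test image are made up for illustration. It shows why the trick helps: once the table is built, the sum over any rectangle costs four lookups, so a Haar-like feature costs the same regardless of window size.

```python
def integral_image(img):
    """Return a (H+1) x (W+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows [top, bottom) and columns [left, right): four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_haar(ii, top, left, height, width):
    """Left-half minus right-half two-rectangle feature (width assumed even)."""
    half = width // 2
    left_sum = box_sum(ii, top, left, top + height, left + half)
    right_sum = box_sum(ii, top, left + half, top + height, left + width)
    return left_sum - right_sum


# Hypothetical 3x4 "image" used only to exercise the functions.
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(img)
total = box_sum(ii, 0, 0, 3, 4)       # whole image
inner = box_sum(ii, 1, 1, 3, 3)       # the 2x2 block 6, 7, 10, 11
feature = two_rect_haar(ii, 0, 0, 3, 4)
print(total, inner, feature)
```

The extra zero row and column avoid special cases at the image border, which is a common convention rather than something the survey specifies.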

4、HOG Detector: HOG can be considered an important improvement over the scale-invariant feature transform (SIFT) and shape contexts of its time.

5、Deformable Part-based Model (DPM): DPM was originally proposed by P. Felzenszwalb in 2008 as an extension of the HOG detector, and a variety of improvements were later made by R. Girshick. DPM follows the detection philosophy of “divide and conquer”: training can be viewed as learning a proper way of decomposing an object, and inference can be viewed as an ensemble of detections on different object parts.
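
The “ensemble of detections on different object parts” can be sketched roughly as follows. This is a simplified illustration, not the actual DPM code. It assumes precomputed filter response maps, parts at the same resolution as the root (real DPM uses finer-resolution parts), and a single quadratic deformation weight per part. The name dpm_score and all numbers are hypothetical.

```python
def dpm_score(root_resp, part_resps, anchors, deform_weights, root_pos, bias=0.0):
    """Score one root placement: root response plus, for each part, the best
    (part response minus quadratic deformation cost) over all part placements."""
    ry, rx = root_pos
    score = root_resp[ry][rx] + bias
    for resp, (ay, ax), w in zip(part_resps, anchors, deform_weights):
        ideal_y, ideal_x = ry + ay, rx + ax  # where the part "wants" to be
        best = float("-inf")
        for y in range(len(resp)):
            for x in range(len(resp[0])):
                dy, dx = y - ideal_y, x - ideal_x
                best = max(best, resp[y][x] - w * (dx * dx + dy * dy))
        score += best
    return score


# Hypothetical 3x3 response maps for a root filter and two part filters.
root_resp = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
part_a = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 0.0, 0.0]]  # peak at its ideal spot
part_b = [[0.0, 0.0, 3.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # peak one row off
score = dpm_score(
    root_resp,
    [part_a, part_b],
    anchors=[(0, -1), (0, 1)],      # part offsets relative to the root
    deform_weights=[0.5, 0.5],
    root_pos=(1, 1),
)
print(score)
```

Part B fires one row away from its ideal position, so it contributes its response minus a small deformation penalty. That trade-off between appearance and displacement is what lets the model tolerate deformation.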

6、Milestones: CNN based Two-stage Detectors: In 2012, the world saw the rebirth of convolutional neural networks. In 2014, R. Girshick et al. took the lead in breaking the deadlock by proposing Regions with CNN features (RCNN) for object detection.

7、RCNN: Although RCNN made great progress, its drawbacks are obvious: the redundant feature computations on a large number of overlapping proposals (over 2000 boxes per image) lead to an extremely slow detection speed (14s per image with GPU).

8、SPPNet: The main contribution of SPPNet is the introduction of a Spatial Pyramid Pooling (SPP) layer, which enables a CNN to generate a fixed-length representation regardless of the size of the image or region of interest, without rescaling it.
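
A minimal sketch of why SPP yields a fixed-length output: the map is max-pooled into a fixed number of bins per pyramid level, so the output length depends only on the levels, not on the input size. This is a single-channel, pure-Python illustration. The pyramid levels (1, 2, 4) and the helper names spp_1ch and make_map are assumptions for the example, not details taken from the notes above.

```python
def spp_1ch(fmap, levels=(1, 2, 4)):
    """Max-pool one feature map into n x n bins for each pyramid level n and
    concatenate the results. Output length is sum(n * n for n in levels)."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for n in levels:
        for i in range(n):
            # Bin i covers rows [floor(i*h/n), ceil((i+1)*h/n)), never empty.
            y0, y1 = (i * h) // n, ((i + 1) * h + n - 1) // n
            for j in range(n):
                x0, x1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                out.append(max(fmap[y][x]
                               for y in range(y0, y1)
                               for x in range(x0, x1)))
    return out


def make_map(h, w):
    """Hypothetical feature map with arbitrary values, for demonstration only."""
    return [[(y * w + x) % 7 for x in range(w)] for y in range(h)]


small = spp_1ch(make_map(5, 7))    # 5 x 7 input
large = spp_1ch(make_map(13, 9))   # 13 x 9 input
tiny = spp_1ch([[1, 2], [3, 4]], levels=(1, 2))
print(len(small), len(large), tiny)
```

Both inputs produce vectors of the same length (1 + 4 + 16 bins), which is what allows fully connected layers to follow without warping the input.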

9、Fast RCNN: Fast RCNN enables us to train a detector and a bounding box regressor simultaneously under the same network configuration.
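
Training both heads together is usually expressed as a single multi-task loss. The form below is the one I recall from the Fast R-CNN paper, written here as a reminder rather than quoted from the survey. The symbols are my labels: p is the predicted class distribution, u the ground-truth class (0 for background), t^u the predicted box offsets for class u, v the regression targets, and λ a balancing weight.

```latex
L(p, u, t^{u}, v) = L_{\mathrm{cls}}(p, u) + \lambda \, [u \ge 1] \, L_{\mathrm{loc}}(t^{u}, v),
\qquad
L_{\mathrm{cls}}(p, u) = -\log p_{u},

L_{\mathrm{loc}}(t^{u}, v) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}\!\left(t^{u}_{i} - v_{i}\right),
\qquad
\mathrm{smooth}_{L_1}(z) =
\begin{cases}
0.5\, z^{2} & \text{if } |z| < 1, \\
|z| - 0.5 & \text{otherwise.}
\end{cases}
```

The indicator [u ≥ 1] switches off the box regression term for background proposals, so only foreground boxes contribute to localization.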

10、Faster RCNN: Faster RCNN is the first end-to-end and the first near-realtime deep learning detector. Its main contribution is the Region Proposal Network (RPN), which enables nearly cost-free region proposals.
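
One reason RPN proposals are cheap is that they are scored against a fixed grid of reference boxes (“anchors”) on the shared feature map. The sketch below only generates such anchors. The defaults (stride 16, scales 128/256/512, aspect ratios 0.5/1/2) are the values I recall from the Faster R-CNN paper and are assumptions here, as is the function name make_anchors. The RPN's classification and regression heads are not shown.

```python
import math


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors in image coordinates, one set of
    len(scales) * len(ratios) boxes centred on every feature-map cell."""
    anchors = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # r is height / width; the area stays close to s * s.
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(feat_h=2, feat_w=3)
print(len(anchors))   # 2 * 3 cells, 9 anchors each
print(anchors[1])     # scale 128, ratio 1.0, centred on the first cell
```

Because the anchors are fixed, the network only needs to predict an objectness score and four offsets per anchor from features it has already computed for detection.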

11、Feature Pyramid Networks: Although the features in deeper layers of a CNN are beneficial for category recognition, they are not conducive to localizing objects. To this end, FPN develops a top-down architecture with lateral connections for building high-level semantics at all scales. FPN has now become a basic building block of many of the latest detectors.
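
The top-down pathway with lateral connections can be sketched as follows. This is a single-channel, pure-Python toy: each coarser map is upsampled by 2 (nearest neighbour) and added to the next finer map. The 1x1 lateral convolutions and 3x3 smoothing convolutions of a real FPN are omitted, and the names upsample2x, top_down, c3..c5 and p3..p5 are illustrative assumptions.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def top_down(bottom_up):
    """bottom_up: maps ordered fine to coarse, each level halving H and W.
    Returns merged maps in the same order."""
    merged = [bottom_up[-1]]               # start from the coarsest level
    for lateral in reversed(bottom_up[:-1]):
        up = upsample2x(merged[0])
        merged.insert(0, [[a + b for a, b in zip(r1, r2)]
                          for r1, r2 in zip(lateral, up)])
    return merged


# Hypothetical bottom-up maps: 4x4, 2x2 and 1x1.
c3 = [[1] * 4 for _ in range(4)]
c4 = [[10, 20], [30, 40]]
c5 = [[100]]
p3, p4, p5 = top_down([c3, c4, c5])
print(p4)
print(p3[0], p3[2])
```

Every output level now carries a contribution from the coarsest, most semantic map, which is the sense in which FPN builds high-level semantics at all scales.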

 

12、To be continued.
