[Object Detection] A list of resources on deep learning for object detection, covering R-CNN, MultiBox, SPP-Net, DeepID-Net, Fast R-CNN, DeepBox, MR-CNN, Faster R-CNN, YOLO, DenseBox, SSD, Inside-Outside Net, G-CNN, and others.
Papers
Deep Neural Networks for Object Detection
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
| method | ILSVRC 2013 mAP |
|---|---|
| OverFeat | 24.3% |
R-CNN
Rich feature hierarchies for accurate object detection and semantic segmentation(R-CNN)
| method | VOC 2007 mAP | VOC 2010 mAP | VOC 2012 mAP | ILSVRC 2013 mAP |
|---|---|---|---|---|
| R-CNN,AlexNet | 54.2% | 50.2% | 49.6% | |
| R-CNN,bbox reg,AlexNet | 58.5% | 53.7% | 53.3% | 31.4% |
| R-CNN,bbox reg,ZFNet | 59.2% | | | |
| R-CNN,VGG-Net | 62.2% | | | |
| R-CNN,bbox reg,VGG-Net | 66.0% | | | |
MultiBox
Scalable Object Detection using Deep Neural Networks (MultiBox)
SPP-Net
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
| method | VOC 2007 mAP | ILSVRC 2013 mAP |
|---|---|---|
| SPP_net(ZF-5),1-model | 54.2% | 31.84% |
| SPP_net(ZF-5),2-model | 60.9% | |
| SPP_net(ZF-5),6-model | | 35.11% |
Learning Rich Features from RGB-D Images for Object Detection and Segmentation
Scalable, High-Quality Object Detection
DeepID-Net
DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
| method | VOC 2007 mAP | ILSVRC 2013 mAP |
|---|---|---|
| DeepID-Net | 64.1% | 50.3% |
Object Detection Networks on Convolutional Feature Maps
| method | Trained on | mAP |
|---|---|---|
| NoC | 07+12 | 68.8% |
| NoC,bb | 07+12 | 71.6% |
| NoC,+EB | 07+12 | 71.8% |
| NoC,+EB,bb | 07+12 | 73.3% |
Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
| Model | BBoxReg? | VOC 2007 mAP (IoU>0.5) |
|---|---|---|
| R-CNN(AlexNet) | No | 54.2% |
| R-CNN(VGG) | No | 60.6% |
| +StructObj | No | 61.2% |
| +StructObj-FT | No | 62.3% |
| +FGS | No | 64.8% |
| +StructObj+FGS | No | 65.9% |
| +StructObj-FT+FGS | No | 66.5% |

| Model | BBoxReg? | VOC 2007 mAP (IoU>0.5) |
|---|---|---|
| R-CNN(AlexNet) | Yes | 58.5% |
| R-CNN(VGG) | Yes | 65.4% |
| +StructObj | Yes | 66.6% |
| +StructObj-FT | Yes | 66.9% |
| +FGS | Yes | 67.2% |
| +StructObj+FGS | Yes | 68.5% |
| +StructObj-FT+FGS | Yes | 68.4% |
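
The "IoU>0.5" in the table headers above is the overlap criterion used to decide whether a predicted box matches a ground-truth box. A minimal sketch of that check follows, assuming boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function names `iou` and `is_true_positive` are illustrative and do not come from any of the listed papers.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Assumed matching rule: a detection counts if its IoU exceeds the threshold."""
    return iou(pred_box, gt_box) > threshold
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, giving an IoU of 1/7, which would not count as a match at the 0.5 threshold.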
Fast R-CNN
Fast R-CNN
| method | data | VOC 2007 mAP |
|---|---|---|
| FRCN,VGG16 | 07 | 66.9% |
| FRCN,VGG16 | 07+12 | 70.0% |

| method | data | VOC 2010 mAP |
|---|---|---|
| FRCN,VGG16 | 12 | 66.1% |
| FRCN,VGG16 | 07++12 | 68.8% |

| method | data | VOC 2012 mAP |
|---|---|---|
| FRCN,VGG16 | 12 | 65.7% |
| FRCN,VGG16 | 07++12 | 68.4% |
DeepBox
DeepBox: Learning Objectness with Convolutional Networks
MR-CNN
Object detection via a multi-region & semantic segmentation-aware CNN model (MR-CNN)
| Model | Trained on | VOC 2007 mAP |
|---|---|---|
| VGG-net | 07+12 | 78.2% |
| VGG-net | 07 | 74.9% |

| Model | Trained on | VOC 2012 mAP |
|---|---|---|
| VGG-net | 07+12 | 73.9% |
| VGG-net | 12 | 70.7% |
Faster R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks(NIPS 2015)
| method | training data | test data | mAP | time/img |
|---|---|---|---|---|
| Faster RCNN, VGG-16 | 07 | VOC 2007 test | 69.9% | 198ms |
| Faster RCNN, VGG-16 | 07+12 | VOC 2007 test | 73.2% | 198ms |
| Faster RCNN, VGG-16 | 12 | VOC 2012 test | 67.0% | 198ms |
| Faster RCNN, VGG-16 | 07++12 | VOC 2012 test | 70.4% | 198ms |
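
The mAP figures reported throughout this list are means of per-class average precision (AP). As a rough illustration, the sketch below computes AP with the 11-point interpolation used by the VOC 2007 protocol and then averages over classes. It assumes each detection has already been labelled true or false positive (for example by an IoU>0.5 match); the names `average_precision_11pt` and `mean_ap` are hypothetical, and the papers' own evaluation code may differ in detail.

```python
def average_precision_11pt(scores, is_tp, num_gt):
    """11-point interpolated AP for one class.

    scores: confidence of each detection
    is_tp:  whether each detection matched a ground-truth box
    num_gt: number of ground-truth boxes for this class
    """
    if num_gt == 0:
        return 0.0
    # Walk detections from most to least confident, tracking precision and recall.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Average the best precision achievable at recall >= 0.0, 0.1, ..., 1.0.
    ap = 0.0
    for k in range(11):
        t = k / 10
        best = max((p for r, p in zip(recalls, precisions) if r >= t), default=0.0)
        ap += best / 11
    return ap


def mean_ap(per_class_ap):
    """Mean of per-class AP values, given as a dict of class name to AP."""
    return sum(per_class_ap.values()) / len(per_class_ap)
```

A call such as `mean_ap({"cat": 1.0, "dog": 0.5})` averages the two class APs to 0.75; the class names here are placeholders.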
YOLO
You Only Look Once: Unified, Real-Time Object Detection(YOLO)
R-CNN minus R
DenseBox
DenseBox: Unifying Landmark Localization with End to End Object Detection
SSD
SSD: Single Shot MultiBox Detector
Inside-Outside Net
Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks
Detection results on VOC 2007 test:
| Method | R | S | W | D | Train | mAP |
|---|---|---|---|---|---|---|
| FRCN | | | | | 07+12 | 70.0 |
| RPN | | | | | 07+12 | 73.2 |
| MR-CNN | | | √ | | 07+12 | 78.2 |
| ION | | | | | 07+12 | 74.6 |
| ION | √ | | | | 07+12 | 75.6 |
| ION | √ | √ | | | 07+12+S | 76.5 |
| ION | √ | √ | √ | | 07+12+S | 78.5 |
| ION | √ | √ | √ | √ | 07+12+S | 79.2 |
Detection results on VOC 2012 test:
| Method | R | S | W | D | Train | mAP |
|---|---|---|---|---|---|---|
| FRCN | | | | | 07++12 | 68.4 |
| RPN | | | | | 07++12 | 70.4 |
| FRCN+YOLO | | | | | 07++12 | 70.4 |
| HyperNet | | | | | 07++12 | 71.4 |
| MR-CNN | | | √ | | 07+12 | 73.9 |
| ION | √ | √ | √ | √ | 07+12+S | 76.4 |
G-CNN
G-CNN: an Iterative Grid Based Object Detector
Learning Deep Features for Discriminative Localization
Factors in Finetuning Deep Model for object detection
We don’t need no bounding-boxes: Training object class detectors using only human verification
A MultiPath Network for Object Detection
Beyond Bounding Boxes: Precise Localization of Objects in Images (PhD Thesis)
T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
Training Region-based Object Detectors with Online Hard Example Mining
Specific Object Detection
End-to-end people detection in crowded scenes
Tutorials
Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection
Codes
TensorBox: a simple framework for training neural networks to detect objects in images
Object detection in torch: Implementation of some object detection frameworks in torch
Blogs
Convolutional Neural Networks for Object Detection