An Introduction to Object Detection Methods (Pre-2017)

https://handong1587.github.io/deep_learning/2015/10/09/nlp.html

| Method | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
|---|---|---|---|---|---|---|
| OverFeat | | | | 24.3% | | |
| R-CNN (AlexNet) | 58.5% | 53.7% | 53.3% | 31.4% | | |
| R-CNN (VGG16) | 66.0% | | | | | |
| SPP_net (ZF-5) | 54.2% (1-model), 60.9% (2-model) | | | 31.84% (1-model), 35.11% (6-model) | | |
| DeepID-Net | 64.1% | | | 50.3% | | |
| NoC | 73.3% | | 68.8% | | | |
| Fast-RCNN (VGG16) | 70.0% | 68.8% | 68.4% | | 19.7% (@[0.5-0.95]), 35.9% (@0.5) | |
| MR-CNN | 78.2% | | 73.9% | | | |
| Faster-RCNN (VGG16) | 78.8% | | 75.9% | | 21.9% (@[0.5-0.95]), 42.7% (@0.5) | 198ms |
| Faster-RCNN (ResNet-101) | 85.6% | | 83.8% | | 37.4% (@[0.5-0.95]), 59.0% (@0.5) | |
| SSD300 (VGG16) | 72.1% | | | | | 58 fps |
| SSD500 (VGG16) | 75.1% | | | | | 23 fps |
| ION | 79.2% | | 76.4% | | | |
| AZ-Net | 70.4% | | | | 22.3% (@[0.5-0.95]), 41.0% (@0.5) | |
| CRAFT | 75.7% | | 71.3% | 48.5% | | |
| OHEM | 78.9% | | 76.3% | | 25.5% (@[0.5-0.95]), 45.9% (@0.5) | |
| R-FCN (ResNet-50) | 77.4% | | | | | 0.12sec (K40), 0.09sec (TitanX) |
| R-FCN (ResNet-101) | 79.5% | | | | | 0.17sec (K40), 0.12sec (TitanX) |
| R-FCN (ResNet-101), multi sc train | 83.6% | | 82.0% | | 31.5% (@[0.5-0.95]), 53.2% (@0.5) | |
| PVANet 9.0 | 81.8% | | 82.5% | | | 750ms (CPU), 46ms (TitanX) |
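
The MSCOCO column reports results at an IoU threshold of 0.5 and averaged over thresholds from 0.5 to 0.95. As a rough illustration of what those thresholds mean, here is a minimal Python sketch of the box-overlap test that sits underneath such metrics. The `(x1, y1, x2, y2)` box layout and the helper names `iou`, `coco_thresholds` and `matches_at` are assumptions made for this example, not taken from any benchmark toolkit, and the sketch does not compute a full average precision.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def coco_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 (the @[0.5-0.95] range in the table)."""
    return [(50 + 5 * i) / 100 for i in range(10)]


def matches_at(box_pred, box_gt, thresholds):
    """For each threshold, whether the prediction overlaps the ground truth enough to count."""
    overlap = iou(box_pred, box_gt)
    return [overlap >= t for t in thresholds]


if __name__ == "__main__":
    pred, gt = (0, 0, 10, 10), (0, 0, 10, 8)
    print(iou(pred, gt))
    print(matches_at(pred, gt, coco_thresholds()))
```

A detection that counts as correct at 0.5 can fail at the stricter thresholds, which is why the @[0.5-0.95] figures in the table are consistently lower than the @0.5 figures.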

Leaderboard

Detection Results: VOC2012

Papers

Deep Neural Networks for Object Detection

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

R-CNN

Rich feature hierarchies for accurate object detection and semantic segmentation

MultiBox

Scalable Object Detection using Deep Neural Networks

Scalable, High-Quality Object Detection

SPP-Net

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Learning Rich Features from RGB-D Images for Object Detection and Segmentation

DeepID-Net

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

Object Detectors Emerge in Deep Scene CNNs

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

NoC

Object Detection Networks on Convolutional Feature Maps

Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction

Fast R-CNN

Fast R-CNN

DeepBox

DeepBox: Learning Objectness with Convolutional Networks

MR-CNN

Object detection via a multi-region & semantic segmentation-aware CNN model

Faster R-CNN

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Faster R-CNN in MXNet with distributed implementation and data parallelization

YOLO

You Only Look Once: Unified, Real-Time Object Detection

Start Training YOLO with Our Own Data

R-CNN minus R

AttentionNet

AttentionNet: Aggregating Weak Directions for Accurate Object Detection

DenseBox

DenseBox: Unifying Landmark Localization with End to End Object Detection

SSD

SSD: Single Shot MultiBox Detector

Why does SSD (Single Shot MultiBox Detector) perform poorly on small objects?

Inside-Outside Net (ION)

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

Adaptive Object Detection Using Adjacency and Zoom Prediction

G-CNN

G-CNN: an Iterative Grid Based Object Detector

Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution

We don’t need no bounding-boxes: Training object class detectors using only human verification

HyperNet

HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

MultiPathNet

A MultiPath Network for Object Detection

CRAFT

CRAFT Objects from Images

OHEM

Training Region-based Object Detectors with Online Hard Example Mining

Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection

Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers

http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf

R-FCN

R-FCN: Object Detection via Region-based Fully Convolutional Networks

Weakly supervised object detection using pseudo-strong labels

Recycle deep features for better object detection

MS-CNN

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Multi-stage Object Detection with Group Recursive Learning

Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection

PVANET

PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

GBD-Net

Gated Bi-directional CNN for Object Detection

Crafting GBD-Net for Object Detection

StuffNet

StuffNet: Using ‘Stuff’ to Improve Object Detection

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene

Hierarchical Object Detection with Deep Reinforcement Learning

Learning to detect and localize many objects from few examples

Speed/accuracy trade-offs for modern convolutional object detectors

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

Feature Pyramid Network (FPN)

Feature Pyramid Networks for Object Detection

Detection From Video

Learning Object Class Detectors from Weakly Annotated Video

Analysing domain shift factors between videos and images for object detection

Video Object Recognition

Deep Learning for Saliency Prediction in Natural Video

T-CNN

T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos

Object Detection from Video Tubelets with Convolutional Neural Networks

Object Detection in Videos with Tubelets and Multi-context Cues

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

CNN Based Object Detection in Large Video Images

Datasets

YouTube-Objects dataset v2.2

ILSVRC2015: Object detection from video (VID)

Object Detection in 3D

Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks

Salient Object Detection

This task involves predicting the salient regions of an image given by human eye fixations.

Best Deep Saliency Detection Models (CVPR 2016 & 2015)

http://i.cs.hku.hk/~yzyu/vision.html

Large-scale optimization of hierarchical features for saliency prediction in natural images

Predicting Eye Fixations using Convolutional Neural Networks

Saliency Detection by Multi-Context Deep Learning

DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection

SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection

Shallow and Deep Convolutional Networks for Saliency Prediction

Recurrent Attentional Networks for Saliency Detection

Two-Stream Convolutional Networks for Dynamic Saliency Prediction

Unconstrained Salient Object Detection

Unconstrained Salient Object Detection via Proposal Subset Optimization

DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection

Salient Object Subitizing

Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection

Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

A Deep Multi-Level Network for Saliency Prediction

Visual Saliency Detection Based on Multiscale Deep CNN Features

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

Deeply supervised salient object detection with short connections

Weakly Supervised Top-down Salient Object Detection

Specific Object Detection

Face Detection

Multi-view Face Detection Using Deep Convolutional Neural Networks

From Facial Parts Responses to Face Detection: A Deep Learning Approach

Compact Convolutional Neural Network Cascade for Face Detection

Face Detection with End-to-End Integration of a ConvNet and a 3D Model

Supervised Transformer Network for Efficient Face Detection

UnitBox

UnitBox: An Advanced Object Detection Network

Bootstrapping Face Detection with Hard Negative Examples

Grid Loss: Detecting Occluded Faces

A Multi-Scale Cascade Fully Convolutional Network Face Detector

MTCNN

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

Datasets / Benchmarks

FDDB: Face Detection Data Set and Benchmark

WIDER FACE: A Face Detection Benchmark

Facial Point / Landmark Detection

Deep Convolutional Network Cascade for Facial Point Detection

A Recurrent Encoder-Decoder Network for Sequential Face Alignment

Detecting facial landmarks in the video based on a hybrid framework

Deep Constrained Local Models for Facial Landmark Detection

People Detection

End-to-end people detection in crowded scenes

Detecting People in Artwork with CNNs

Person Head Detection

Context-aware CNNs for person head detection

Pedestrian Detection

Pedestrian Detection aided by Deep Learning Semantic Tasks

Deep Learning Strong Parts for Pedestrian Detection

Deep convolutional neural networks for pedestrian detection

New algorithm improves speed and accuracy of pedestrian detection

Pushing the Limits of Deep CNNs for Pedestrian Detection

  • intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
  • arxiv: http://arxiv.org/abs/1603.04525
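
The log-average miss rate quoted above is commonly computed by sampling the miss-rate curve at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0 and taking the geometric mean. The sketch below follows that convention as an assumption; it is not code from the paper or from the Caltech toolkit, and the function name and input layout are hypothetical.

```python
import math


def log_average_miss_rate(fppi, miss_rate):
    """Geometric mean of the miss rate at nine FPPI reference points in [1e-2, 1e0].

    Assumes `fppi` is sorted in ascending order and `miss_rate[i]` is the
    miss rate reached at `fppi[i]`.
    """
    refs = [10 ** (-2 + 0.25 * i) for i in range(9)]
    logs = []
    for ref in refs:
        # Miss rate at the largest FPPI not exceeding the reference point.
        reached = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        # If the curve starts beyond this reference point, count it as a full miss.
        m = reached[-1] if reached else 1.0
        # Clamp to avoid log(0) for a perfect detector.
        logs.append(math.log(max(m, 1e-10)))
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    curve_fppi = [0.001, 0.01, 0.1, 1.0]
    curve_miss = [0.40, 0.20, 0.10, 0.05]
    print(log_average_miss_rate(curve_fppi, curve_miss))
```

Averaging in log space keeps a single operating point from dominating the summary, so a drop such as 11.7% to 8.9% reflects an improvement across the whole low-FPPI range.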

A Real-Time Deep Learning Pedestrian Detector for Robot Navigation

A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation

Is Faster R-CNN Doing Well for Pedestrian Detection?

Reduced Memory Region Based Deep Convolutional Neural Network Detection

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

Multispectral Deep Neural Networks for Pedestrian Detection

Vehicle Detection

DAVE: A Unified Framework for Fast Vehicle Detection and Annotation

Traffic-Sign Detection

Traffic-Sign Detection and Classification in the Wild

Boundary / Edge / Contour Detection

Holistically-Nested Edge Detection

Unsupervised Learning of Edges

Pushing the Boundaries of Boundary Detection using Deep Learning

Convolutional Oriented Boundaries

Richer Convolutional Features for Edge Detection

Skeleton Detection

Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs

DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images

Fruit Detection

Deep Fruit Detection in Orchards

Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards

Others

Deep Deformation Network for Object Landmark Localization

Fashion Landmark Detection in the Wild

Deep Learning for Fast and Accurate Fashion Item Detection

Visual Relationship Detection with Language Priors

OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

Associative Embedding: End-to-End Learning for Joint Detection and Grouping

Deep Cuboid Detection: Beyond 2D Bounding Boxes

Object Proposal

DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

Scale-aware Pixel-wise Object Proposal Networks

Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization

Learning to Segment Object Proposals via Recursive Neural Networks

Localization

Beyond Bounding Boxes: Precise Localization of Objects in Images

Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning

Weakly Supervised Object Localization Using Size Estimates

Localizing objects using referring expressions

LocNet: Improving Localization Accuracy for Object Detection

Learning Deep Features for Discriminative Localization

ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization

Tutorials

Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection

Projects

TensorBox: a simple framework for training neural networks to detect objects in images

Object detection in torch: Implementation of some object detection frameworks in torch

Using DIGITS to train an Object Detection network

FCN-MultiBox Detector

Blogs

Convolutional Neural Networks for Object Detection

http://rnd.azoft.com/convolutional-neural-networks-object-detection/

Introducing automatic object detection to visual search (Pinterest)

Deep Learning for Object Detection with DIGITS

Analyzing The Papers Behind Facebook’s Computer Vision Approach

How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit

Object Detection in Satellite Imagery, a Low Overhead Approach

You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks

Faster R-CNN Pedestrian and Car Detection

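### Answer 1:

Here is an overview of how point-cloud-based 3D object detection has developed, with key papers and their main contributions:

1. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (Charles R. Qi, 2017). This paper was among the first to apply deep learning directly to raw point sets. PointNet maps each point into a high-dimensional feature space and supports both classification and segmentation of point clouds. It uses max pooling as a symmetric function, which makes the output invariant to the order of the input points, and adds learned alignment networks to improve robustness to geometric transformations.

2. Frustum PointNets for 3D Object Detection from RGB-D Data (Charles R. Qi, 2018). This paper proposes Frustum PointNets, which takes 2D bounding boxes detected in the image, extracts the points that fall inside the corresponding viewing frustum from the RGB-D data, and processes them with PointNet. It reported state-of-the-art results on KITTI at the time of publication and can run in real time on a GPU.

3. PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud (Shaoshuai Shi, 2019). PointRCNN is a two-stage detector that works directly on the point cloud. The first stage uses a PointNet++ backbone to segment foreground points and generate 3D proposals from them; the second stage refines the proposals, performing classification and box regression on the pooled point features. It reported state-of-the-art results on KITTI at the time of publication.

4. PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection (Shaoshuai Shi, 2020). PV-RCNN combines point-based and voxel-based representations: a 3D voxel CNN extracts multi-scale features, and PointNet-style set abstraction aggregates them onto a small set of keypoints that are then used for proposal refinement. It reported state-of-the-art results on KITTI at the time of publication.

These are some of the key works in point-cloud-based 3D object detection. They have performed well in practice and provide a solid basis for further research in the field.

### Answer 2:

Point-cloud-based 3D object detection is an active topic in computer vision and machine learning. These methods use sparse 3D point cloud data to detect and recognize objects in the real world.

In recent years, progress in LiDAR technology and the release of large-scale 3D point cloud datasets have driven significant advances. Early work relied mainly on traditional feature extraction, such as voxel-based representations and 3D shape descriptors. These methods are not very robust to noise and occlusion in point cloud data.

With the rise of deep learning, point-cloud-based detectors began to use neural networks for feature learning and detection. One important family builds on the PointNet architecture. PointNet takes the point cloud as input, applies a shared multilayer perceptron to each point, and aggregates the per-point features with max pooling into a global feature. This allows it to handle objects of different shapes and poses and to classify them accurately.

Many improvements to PointNet followed. PointNet++ introduces a hierarchical architecture that applies local aggregation at several scales to learn richer features. To deal with the non-uniform sampling density of point clouds, it also proposes density-adaptive layers such as multi-scale grouping.

Other work relies on projection: the point cloud is converted into a 2D representation, such as a bird's-eye-view pseudo-image, so that 2D convolutional detectors can be applied. HVNet, for example, projects multi-scale voxel features into pseudo-image feature maps. VoteNet, by contrast, operates directly on 3D points and localizes objects through Hough voting.

Overall, point-cloud-based 3D object detection has advanced considerably over the past few years. By adopting deep learning and steadily refining the algorithms, researchers have made feature learning and detection on point clouds more accurate and more robust.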
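
The role of max pooling in PointNet, mentioned above, can be illustrated with a small sketch: a shared per-point feature function followed by an element-wise maximum gives the same global feature regardless of point order. This is a toy in pure Python with a single random linear layer and ReLU, chosen for the example; the function names and layer sizes are assumptions and do not reproduce the PointNet architecture.

```python
import random


def point_feature(point, weights):
    """Shared per-point layer: one linear map followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, point))) for row in weights]


def global_feature(points, weights):
    """Element-wise max over all per-point features (a symmetric function)."""
    per_point = [point_feature(p, weights) for p in points]
    return [max(channel) for channel in zip(*per_point)]


rng = random.Random(0)
# Hypothetical sizes: 3 input coordinates, 8 output channels, 64 points.
weights = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]
cloud = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(64)]

shuffled = cloud[:]
rng.shuffle(shuffled)

# The same set of points in a different order yields an identical global feature.
order_invariant = global_feature(cloud, weights) == global_feature(shuffled, weights)

if __name__ == "__main__":
    print(order_invariant)
```

Because the maximum does not depend on the order of its arguments, the network does not need a canonical ordering of the points, which an unordered point cloud does not have.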
