Object Detection (a must-read object detection collection)
Posted 2018-08-21 14:25:28 by Mars_WH
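A must-read collection on object detection: very comprehensive and continuously updated. Reposted from: https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html. If this infringes any rights, please contact me and it will be removed.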
Last updated: 2019-02-26
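I will follow the original author's blog and keep this post updated, adding some of my own new research and paper notes from the object detection field. To find what you need, search the page by keyword; for example, searching for 2018 turns up the most recent papers.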
Table of contents
- Papers
- Loss functions
- R-CNN
- Fast R-CNN
- Faster R-CNN
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- R-CNN minus R
- Faster R-CNN in MXNet with distributed implementation and data parallelization
- Contextual Priming and Feedback for Faster R-CNN
- An Implementation of Faster RCNN with Study for Region Sampling
- Interpretable R-CNN
- [AAAI2019] Object Detection based on Region Decomposition and Assembly
- Light-Head R-CNN
- MultiBox
- SPP-Net
- Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
- Object Detectors Emerge in Deep Scene CNNs
- segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection
- Object Detection Networks on Convolutional Feature Maps
- Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction
- DeepBox: Learning Objectness with Convolutional Networks
- MR-CNN
- YOLO
- You Only Look Once: Unified, Real-Time Object Detection
- darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++
- Start Training YOLO with Our Own Data
- YOLO: Core ML versus MPSNNGraph
- TensorFlow YOLO object detection on Android
- Computer Vision in iOS – Object Detection
- YOLOv2
- YOLOv3
- DenseBox
- SSD
- DSSD
- FSSD
- ESSD
- Inside-Outside Net (ION)
- Factors in Finetuning Deep Model for object detection
- CRAFT
- OHEM
- R-FCN
- MS-CNN
- PVANET
- GBD-Net
- Gated Bi-directional CNN for Object Detection
- Crafting GBD-Net for Object Detection
- StuffNet: Using ‘Stuff’ to Improve Object Detection
- Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene
- Hierarchical Object Detection with Deep Reinforcement Learning
- Learning to detect and localize many objects from few examples
- Speed/accuracy trade-offs for modern convolutional object detectors
- SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- Feature Pyramid Network (FPN)
- Feature Pyramid Networks for Object Detection
- Action-Driven Object Detection with Top-Down Visual Attentions
- Beyond Skip Connections: Top-Down Modulation for Object Detection
- Wide-Residual-Inception Networks for Real-time Object Detection
- Attentional Network for Visual Object Detection
- Learning Chained Deep Features and Classifiers for Cascade in Object Detection
- DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling
- Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
- Spatial Memory for Context Reasoning in Object Detection
- Accurate Single Stage Detector Using Recurrent Rolling Convolution
- Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
- LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems
- Point Linking Network for Object Detection
- Perceptual Generative Adversarial Networks for Small Object Detection
- Few-shot Object Detection
- Yes-Net: An effective Detector Based on Global Information
- SMC Faster R-CNN: Toward a scene-specialized multi-object detector
- Towards lightweight convolutional neural networks for object detection
- RON: Reverse Connection with Objectness Prior Networks for Object Detection
- Mimicking Very Efficient Network for Object Detection
- Residual Features and Unified Prediction Network for Single Stage Detection
- Deformable Part-based Fully Convolutional Network for Object Detection
- Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors
- Recurrent Scale Approximation for Object Detection in CNN
- DSOD
- DSOD: Learning Deeply Supervised Object Detectors from Scratch
- Object Detection from Scratch with Deep Supervision
- Focal Loss for Dense Object Detection
- Focal Loss Dense Detector for Vehicle Surveillance
- CoupleNet: Coupling Global Structure with Local Parts for Object Detection
- Incremental Learning of Object Detectors without Catastrophic Forgetting
- Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
- StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
- Dynamic Zoom-in Network for Fast Object Detection in Large Images
- Zero-Annotation Object Detection with Web Knowledge Transfer
- MegDet
- MegDet: A Large Mini-Batch Object Detector
- Single-Shot Refinement Neural Network for Object Detection
- Receptive Field Block Net for Accurate and Fast Object Detection
- An Analysis of Scale Invariance in Object Detection - SNIP
- Feature Selective Networks for Object Detection
- Learning a Rotation Invariant Detector with Rotatable Bounding Box
- Scalable Object Detection for Stylized Objects
- Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids
- Deep Regionlets for Object Detection
- Training and Testing Object Detectors with Virtual Images
- Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video
- Spot the Difference by Object Detection
- Localization-Aware Active Learning for Object Detection
- Object Detection with Mask-based Feature Encoding
- LSTD: A Low-Shot Transfer Detector for Object Detection
- Domain Adaptive Faster R-CNN for Object Detection in the Wild
- Pseudo Mask Augmented Object Detection
- Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
- Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection
- Learning Region Features for Object Detection
- Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection
- Object Detection for Comics using Manga109 Annotations
- Task-Driven Super Resolution: Object Detection in Low-resolution Images
- Transferring Common-Sense Knowledge for Object Detection
- Multi-scale Location-aware Kernel Representation for Object Detection
- Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors
- DetNet: A Backbone network for Object Detection
- Robust Physical Adversarial Attack on Faster R-CNN Object Detector
- AdvDetPatch: Attacking Object Detectors with Adversarial Patches
- Attacking Object Detectors via Imperceptible Patches on Background
- Physical Adversarial Examples for Object Detectors
- Quantization Mimic: Towards Very Tiny CNN for Object Detection
- Object detection at 200 Frames Per Second
- Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images
- SNIPER: Efficient Multi-Scale Training
- Soft Sampling for Robust Object Detection
- MetaAnchor: Learning to Detect Objects with Customized Anchors
- Localization Recall Precision (LRP): A New Performance Metric for Object Detection
- Auto-Context R-CNN
- Pooling Pyramid Network for Object Detection
- Modeling Visual Context is Key to Augmenting Object Detection Datasets
- Dual Refinement Network for Single-Shot Object Detection
- Acquisition of Localization Confidence for Accurate Object Detection
- CornerNet: Detecting Objects as Paired Keypoints
- Unsupervised Hard Example Mining from Videos for Improved Object Detection
- SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
- A Survey of Modern Object Detection Literature using Deep Learning
- Tiny-DSOD: Lightweight Object Detection for Resource-Restricted Usages
- Deep Feature Pyramid Reconfiguration for Object Detection
- MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
- Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks
- Deep Learning for Generic Object Detection: A Survey
- Training Confidence-Calibrated Classifier for Detecting Out-of-Distribution Samples
- ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch
- Fast and accurate object detection in high resolution 4K and 8K video using GPUs
- Hybrid Knowledge Routed Modules for Large-scale Object Detection
- Gradient Harmonized Single-stage Detector
- M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network
- BAN: Focusing on Boundary Context for Object Detection
- Multi-layer Pruning Framework for Compressing Single Shot MultiBox Detector
- R2CNN++: Multi-Dimensional Attention Based Rotation Invariant Detector with Robust Anchor Strategy
- DeRPN: Taking a further step toward more general object detection
- Fast Efficient Object Detection Using Selective Attention
- Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects
- Efficient Coarse-to-Fine Non-Local Module for the Detection of Small Objects
- Deep Regionlets: Blended Representation and Deep Learning for Generic Object Detection
- Grid R-CNN
- Transferable Adversarial Attacks for Image and Video Object Detection
- Anchor Box Optimization for Object Detection
- AutoFocus: Efficient Multi-Scale Inference
- Practical Adversarial Attack Against Object Detector
- Learning Efficient Detector with Semi-supervised Adaptive Distillation
- Scale-Aware Trident Networks for Object Detection
- Region Proposal by Guided Anchoring
- Consistent Optimization for Single-Shot Object Detection
- Bottom-up Object Detection by Grouping Extreme and Center Points
- A Single-shot Object Detector with Feature Aggragation and Enhancement
- Bag of Freebies for Training Object Detection Neural Networks
- Non-Maximum Suppression (NMS)
- End-to-End Integration of a Convolutional Network, Deformable Parts Model and Non-Maximum Suppression
- A convnet for non-maximum suppression
- Soft-NMS – Improving Object Detection With One Line of Code
- Learning non-maximum suppression
- Relation Networks for Object Detection
- Learning Pairwise Relationship for Multi-object Detection in Crowded Scenes
- Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples
- Adversarial Examples
- Weakly Supervised Object Detection
- Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection
- Weakly supervised object detection using pseudo-strong labels
- Saliency Guided End-to-End Learning for Weakly Supervised Object Detection
- Visual and Semantic Knowledge Transfer for Large Scale Semi-supervised Object Detection
- Video Object Detection
- Learning Object Class Detectors from Weakly Annotated Video
- Analysing domain shift factors between videos and images for object detection
- Video Object Recognition
- Deep Learning for Saliency Prediction in Natural Video
- T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos
- Object Detection from Video Tubelets with Convolutional Neural Networks
- Object Detection in Videos with Tubelets and Multi-context Cues
- Context Matters: Refining Object Detection in Video with Recurrent Neural Networks
- CNN Based Object Detection in Large Video Images
- Object Detection in Videos with Tubelet Proposal Networks
- Flow-Guided Feature Aggregation for Video Object Detection
- Video Object Detection using Faster R-CNN
- Improving Context Modeling for Video Object Detection and Tracking
- Temporal Dynamic Graph LSTM for Action-driven Video Object Detection
- Mobile Video Object Detection with Temporally-Aware Feature Maps
- Towards High Performance Video Object Detection
- Impression Network for Video Object Detection
- Spatial-Temporal Memory Networks for Video Object Detection
- 3D-DETNet: a Single Stage Video-Based Vehicle Detector
- Object Detection in Videos by Short and Long Range Object Linking
- Object Detection in Video with Spatiotemporal Sampling Networks
- Towards High Performance Video Object Detection for Mobiles
- Optimizing Video Object Detection via a Scale-Time Lattice
- Pack and Detect: Fast Object Detection in Videos Using Region-of-Interest Packing
- Fast Object Detection in Compressed Video
- Tube-CNN: Modeling temporal evolution of appearance for object detection in video
- AdaScale: Towards Real-time Video Object Detection Using Adaptive Scaling
- Object Detection on Mobile Devices
- Object Detection in 3D
- Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks
- Complex-YOLO: Real-time 3D Object Detection on Point Clouds
- Focal Loss in 3D Object Detection
- 3D Object Detection Using Scale Invariant and Feature Reweighting Networks
- 3D Backbone Network for 3D Object Detection
- Object Detection on RGB-D
- Zero-Shot Object Detection
- Salient Object Detection
- Best Deep Saliency Detection Models (CVPR 2016 & 2015)
- Large-scale optimization of hierarchical features for saliency prediction in natural images
- Predicting Eye Fixations using Convolutional Neural Networks
- Saliency Detection by Multi-Context Deep Learning
- DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection
- SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection
- Shallow and Deep Convolutional Networks for Saliency Prediction
- Recurrent Attentional Networks for Saliency Detection
- Two-Stream Convolutional Networks for Dynamic Saliency Prediction
- Unconstrained Salient Object Detection
- Unconstrained Salient Object Detection via Proposal Subset Optimization
- DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection
- Salient Object Subitizing
- Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection
- Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs
- Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection
- A Deep Multi-Level Network for Saliency Prediction
- Visual Saliency Detection Based on Multiscale Deep CNN Features
- A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection
- Deeply supervised salient object detection with short connections
- Weakly Supervised Top-down Salient Object Detection
- SalGAN: Visual Saliency Prediction with Generative Adversarial Networks
- Visual Saliency Prediction Using a Mixture of Deep Neural Networks
- A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network
- Saliency Detection by Forward and Backward Cues in Deep-CNNs
- Supervised Adversarial Networks for Image Saliency Detection
- Group-wise Deep Co-saliency Detection
- Towards the Success Rate of One: Real-time Unconstrained Salient Object Detection
- Amulet: Aggregating Multi-level Convolutional Features for Salient Object Detection
- Learning Uncertain Convolutional Features for Accurate Saliency Detection
- Deep Edge-Aware Saliency Detection
- Self-explanatory Deep Salient Object Detection
- PiCANet: Learning Pixel-wise Contextual Attention in ConvNets and Its Application in Saliency Detection
- DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets
- Recurrently Aggregating Deep Features for Salient Object Detection
- Deep saliency: What is learnt by a deep network about saliency?
- Contrast-Oriented Deep Neural Networks for Salient Object Detection
- Salient Object Detection by Lossless Feature Reflection
- HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection
- Video Saliency Detection
- Visual Relationship Detection
- Visual Relationship Detection with Language Priors
- ViP-CNN: A Visual Phrase Reasoning Convolutional Neural Network for Visual Relationship Detection
- Visual Translation Embedding Network for Visual Relation Detection
- Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
- Detecting Visual Relationships with Deep Relational Networks
- Identifying Spatial Relations in Images using Convolutional Neural Networks
- PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN
- Natural Language Guided Visual Relationship Detection
- Detecting Visual Relationships Using Box Attention
- Google AI Open Images - Visual Relationship Track
- Context-Dependent Diffusion Network for Visual Relationship Detection
- A Problem Reduction Approach for Visual Relationships Detection
- Face Detection
- Multi-view Face Detection Using Deep Convolutional Neural Networks
- From Facial Parts Responses to Face Detection: A Deep Learning Approach
- Compact Convolutional Neural Network Cascade for Face Detection
- Face Detection with End-to-End Integration of a ConvNet and a 3D Model
- CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection
- Towards a Deep Learning Framework for Unconstrained Face Detection
- Supervised Transformer Network for Efficient Face Detection
- UnitBox: An Advanced Object Detection Network
- Bootstrapping Face Detection with Hard Negative Examples
- Grid Loss: Detecting Occluded Faces
- A Multi-Scale Cascade Fully Convolutional Network Face Detector
- MTCNN
- Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks
- Face Detection using Deep Learning: An Improved Faster RCNN Approach
- Faceness-Net: Face Detection through Deep Facial Part Responses
- Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained “Hard Faces”
- End-To-End Face Detection and Recognition
- Face R-CNN
- Face Detection through Scale-Friendly Deep Convolutional Networks
- Scale-Aware Face Detection
- Detecting Faces Using Inside Cascaded Contextual CNN
- Multi-Branch Fully Convolutional Network for Face Detection
- SSH: Single Stage Headless Face Detector
- Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container
- FaceBoxes: A CPU Real-time Face Detector with High Accuracy
- S3FD: Single Shot Scale-invariant Face Detector
- Detecting Faces Using Region-based Fully Convolutional Networks
- AffordanceNet: An End-to-End Deep Learning Approach for Object Affordance Detection
- Face Attention Network: An effective Face Detector for the Occluded Faces
- Feature Agglomeration Networks for Single Stage Face Detection
- Face Detection Using Improved Faster RCNN
- PyramidBox: A Context-assisted Single Shot Face Detector
- A Fast Face Detection Method via Convolutional Neural Network
- Beyond Trade-off: Accelerate FCN-based Face Detector with Higher Accuracy
- Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
- SFace: An Efficient Network for Face Detection in Large Scale Variations
- Survey of Face Detection on Low-quality Images
- Anchor Cascade for Efficient Face Detection
- Adversarial Attacks on Face Detectors using Neural Net based Constrained Optimization
- Selective Refinement Network for High Performance Face Detection
- DSFD: Dual Shot Face Detector
- Learning Better Features for Face Detection with Feature Fusion and Segmentation Supervision
- FA-RPN: Floating Region Proposals for Face Detection
- Robust and High Performance Face Detector
- DAFE-FD: Density Aware Feature Enrichment for Face Detection
- Improved Selective Refinement Network for Face Detection
- Revisiting a single-stage method for face detection
- Detect Small Faces
- Person Head Detection
- Pedestrian Detection / People Detection
- Pedestrian Detection aided by Deep Learning Semantic Tasks
- Deep Learning Strong Parts for Pedestrian Detection
- Taking a Deeper Look at Pedestrians
- Convolutional Channel Features
- End-to-end people detection in crowded scenes
- Learning Complexity-Aware Cascades for Deep Pedestrian Detection
- Deep convolutional neural networks for pedestrian detection
- Scale-aware Fast R-CNN for Pedestrian Detection
- New algorithm improves speed and accuracy of pedestrian detection
- Pushing the Limits of Deep CNNs for Pedestrian Detection
- A Real-Time Deep Learning Pedestrian Detector for Robot Navigation
- A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation
- Is Faster R-CNN Doing Well for Pedestrian Detection?
- Unsupervised Deep Domain Adaptation for Pedestrian Detection
- Reduced Memory Region Based Deep Convolutional Neural Network Detection
- Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection
- Detecting People in Artwork with CNNs
- Multispectral Deep Neural Networks for Pedestrian Detection
- Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection
- Deep Multi-camera People Detection
- Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters
- What Can Help Pedestrian Detection?
- Illuminating Pedestrians via Simultaneous Detection & Segmentation
- Rotational Rectification Network for Robust Pedestrian Detection
- STD-PD: Generating Synthetic Training Data for Pedestrian Detection in Unannotated Videos
- Too Far to See? Not Really! — Pedestrian Detection with Scale-aware Localization Policy
- Repulsion Loss: Detecting Pedestrians in a Crowd
- Aggregated Channels Network for Real-Time Pedestrian Detection
- Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection
- Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection
- Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond
- PCN: Part and Context Information for Pedestrian Detection with CNNs
- Small-scale Pedestrian Detection Based on Somatic Topology Localization and Temporal Feature Aggregation
- Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd
- Multispectral Pedestrian Detection via Simultaneous Detection and Segmentation
- Pedestrian Detection with Autoregressive Network Phases
- The Cross-Modality Disparity Problem in Multispectral Pedestrian Detection
- Vehicle Detection
- DAVE: A Unified Framework for Fast Vehicle Detection and Annotation
- Evolving Boxes for fast Vehicle Detection
- Fine-Grained Car Detection for Visual Census Estimation
- SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
- Label and Sample: Efficient Training of Vehicle Object Detector from Sparsely Labeled Data
- Domain Randomization for Scene-Specific Car Detection and Pose Estimation
- ShuffleDet: Real-Time Vehicle Detection Network in On-board Embedded UAV Imagery
- Traffic-Sign Detection
- Traffic-Sign Detection and Classification in the Wild
- Evaluating State-of-the-art Object Detector on Challenging Traffic Light Data
- Detecting Small Signs from Large Images
- Localized Traffic Sign Detection with Multi-scale Deconvolution Networks
- Detecting Traffic Lights by Single Shot Detection
- A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection
- Skeleton Detection
- Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs
- DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images
- SRN: Side-output Residual Network for Object Symmetry Detection in the Wild
- Hi-Fi: Hierarchical Feature Integration for Skeleton Detection
- Fruit Detection
- Shadow Detection
- Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network
- A+D-Net: Shadow Detection with Adversarial Shadow Attenuation
- Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal
- Direction-aware Spatial Context Features for Shadow Detection
- Direction-aware Spatial Context Features for Shadow Detection and Removal
- Others Detection
- Deep Deformation Network for Object Landmark Localization
- Fashion Landmark Detection in the Wild
- Deep Learning for Fast and Accurate Fashion Item Detection
- OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)
- Selfie Detection by Synergy-Constraint Based Convolutional Neural Network
- Associative Embedding: End-to-End Learning for Joint Detection and Grouping
- Deep Cuboid Detection: Beyond 2D Bounding Boxes
- Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection
- Deep Learning Logo Detection with Data Expansion by Synthesising Context
- Scalable Deep Learning Logo Detection
- Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks
- Automatic Handgun Detection Alarm in Videos Using Deep Learning
- Objects as context for part detection
- Using Deep Networks for Drone Detection
- Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection
- Target Driven Instance Detection
- DeepVoting: An Explainable Framework for Semantic Part Detection under Partial Occlusion
- VPGNet: Vanishing Point Guided Network for Lane and Road Marking Detection and Recognition
- Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants
- ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos
- Deep Learning Object Detection Methods for Ecological Camera Trap Data
- EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
- Towards End-to-End Lane Detection: an Instance Segmentation Approach
- iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection
- Densely Supervised Grasp Detector (DSGD)
- Object Proposal
- DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers
- Scale-aware Pixel-wise Object Proposal Networks
- Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
- Learning to Segment Object Proposals via Recursive Neural Networks
- Learning Detection with Diverse Proposals
- ScaleNet: Guiding Object Proposal Generation in Supermarkets and Beyond
- Improving Small Object Proposals for Company Logo Detection
- Open Logo Detection Challenge
- AttentionMask: Attentive, Efficient Object Proposal Generation Focusing on Small Objects
- Localization
- Beyond Bounding Boxes: Precise Localization of Objects in Images
- Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
- Weakly Supervised Object Localization Using Size Estimates
- Active Object Localization with Deep Reinforcement Learning
- Localizing objects using referring expressions
- LocNet: Improving Localization Accuracy for Object Detection
- Learning Deep Features for Discriminative Localization
- ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization
- Ensemble of Part Detectors for Simultaneous Classification and Localization
- STNet: Selective Tuning of Convolutional Networks for Object Localization
- Soft Proposal Networks for Weakly Supervised Object Localization
- Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN
- Tutorials / Talks
- Projects
- Detectron
- TensorBox: a simple framework for training neural networks to detect objects in images
- Object detection in torch: Implementation of some object detection frameworks in torch
- Using DIGITS to train an Object Detection network
- FCN-MultiBox Detector
- KittiBox: A car detection model implemented in Tensorflow.
- Deformable Convolutional Networks + MST + Soft-NMS
- How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow
- Metrics for object detection
- MobileNetv2-SSDLite
- Leaderboard
- Tools
- Blogs
- Convolutional Neural Networks for Object Detection
- Introducing automatic object detection to visual search (Pinterest)
- Deep Learning for Object Detection with DIGITS
- Analyzing The Papers Behind Facebook’s Computer Vision Approach
- Easily Create High Quality Object Detectors with Deep Learning
- How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit
- Object Detection in Satellite Imagery, a Low Overhead Approach
- You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks
- Faster R-CNN Pedestrian and Car Detection
- Small U-Net for vehicle detection
- Region of interest pooling explained
- Supercharge your Computer Vision models with the TensorFlow Object Detection API
- Understanding SSD MultiBox — Real-Time Object Detection In Deep Learning
- One-shot object detection
- An overview of object detection: one-stage methods
- deep learning object detection
Method | Backbone | Test size | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
---|---|---|---|---|---|---|---|---|
OverFeat | | | | | | 24.3% | | |
R-CNN | AlexNet | | 58.5% | 53.7% | 53.3% | 31.4% | | |
R-CNN | VGG16 | | 66.0% | | | | | |
SPP_net | ZF-5 | | 54.2% | | | 31.84% | | |
DeepID-Net | | | 64.1% | | | 50.3% | | |
NoC | | | 73.3% | | 68.8% | | | |
Fast-RCNN | VGG16 | | 70.0% | 68.8% | 68.4% | | 19.7%(@[0.5-0.95]), 35.9%(@0.5) | |
MR-CNN | | | 78.2% | | 73.9% | | | |
Faster-RCNN | VGG16 | | 78.8% | | 75.9% | | 21.9%(@[0.5-0.95]), 42.7%(@0.5) | 198ms |
Faster-RCNN | ResNet101 | | 85.6% | | 83.8% | | 37.4%(@[0.5-0.95]), 59.0%(@0.5) | |
YOLO | | | 63.4% | | 57.9% | | | 45 fps |
YOLO | VGG-16 | | 66.4% | | | | | 21 fps |
YOLOv2 | | 448x448 | 78.6% | | 73.4% | | 21.6%(@[0.5-0.95]), 44.0%(@0.5) | 40 fps |
SSD | VGG16 | 300x300 | 77.2% | | 75.8% | | 25.1%(@[0.5-0.95]), 43.1%(@0.5) | 46 fps |
SSD | VGG16 | 512x512 | 79.8% | | 78.5% | | 28.8%(@[0.5-0.95]), 48.5%(@0.5) | 19 fps |
SSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 16 fps |
SSD | ResNet101 | 512x512 | | | | | 31.2%(@[0.5-0.95]) | 8 fps |
DSSD | ResNet101 | 300x300 | | | | | 28.0%(@[0.5-0.95]) | 8 fps |
DSSD | ResNet101 | 500x500 | | | | | 33.2%(@[0.5-0.95]) | 6 fps |
ION | | | 79.2% | | 76.4% | | | |
CRAFT | | | 75.7% | | 71.3% | 48.5% | | |
OHEM | | | 78.9% | | 76.3% | | 25.5%(@[0.5-0.95]), 45.9%(@0.5) | |
R-FCN | ResNet50 | | 77.4% | | | | | 0.12sec(K40), 0.09sec(Titan X) |
R-FCN | ResNet101 | | 79.5% | | | | | 0.17sec(K40), 0.12sec(Titan X) |
R-FCN(ms train) | ResNet101 | | 83.6% | | 82.0% | | 31.5%(@[0.5-0.95]), 53.2%(@0.5) | |
PVANet 9.0 | | | 84.9% | | 84.2% | | | 750ms(CPU), 46ms(Titan X) |
RetinaNet | ResNet101-FPN | | | | | | | |
Light-Head R-CNN | Xception* | 800/1200 | | | | | 31.5%@[0.5:0.95] | 95 fps |
Light-Head R-CNN | Xception* | 700/1100 | | | | | 30.7%@[0.5:0.95] | 102 fps |
STDN | | | 80.9 (07+12) | | | | | |
RefineDet | | | 83.8 (07+12) | | 83.5 (07++12) | | 41.8 | |
SNIP | | | | | | | 45.7 | |
Relation-Network | | | | | | | 32.5 | |
Cascade R-CNN | | | | | | | 42.8 | |
MLKP | | | 80.6 (07+12) | | 77.2 (07++12) | | 28.6 | |
Fitness-NMS | | | | | | | 41.8 | |
RFBNet | | | 82.2 (07+12) | | | | | |
CornerNet | | | | | | | 42.1 | |
PFPNet | | | 84.1 (07+12) | | 83.7 (07++12) | | 39.4 | |
Pelee | | | 70.9 (07+12) | | | | | |
HKRM | | | 78.8 (07+12) | | | | 37.8 | |
M2Det | | | | | | | 44.2 | |
SIN | | | 76.0 (07+12) | | 73.1 (07++12) | | 23.2 | |
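
In the MSCOCO column, `@0.5` means average precision at an IoU (intersection over union) threshold of 0.5, and `@[0.5-0.95]` means the average over thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below only illustrates that thresholding step. It is not a full evaluation: it assumes corner-format boxes `(x1, y1, x2, y2)`, and the helper names `iou`, `COCO_THRESHOLDS`, and `matched_thresholds` are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds 0.50, 0.55, ..., 0.95 behind "@[0.5-0.95]".
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which this prediction would count as a hit (hypothetical helper)."""
    overlap = iou(pred_box, gt_box)
    return [t for t in COCO_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    pred = (0, 0, 10, 10)
    gt = (0, 2, 10, 12)
    print(iou(pred, gt))                 # overlap of the two boxes
    print(matched_thresholds(pred, gt))  # thresholds the prediction passes
```

A box that passes at 0.5 but fails at 0.75 still counts toward the `@0.5` figure, which is why the `@0.5` numbers in the table are consistently higher than the `@[0.5-0.95]` ones.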