Hariharan, Bharath, et al. “Simultaneous detection and segmentation.” European Conference on Computer Vision. Springer International Publishing, 2014. (Citations: 234).
1 Pipeline
See Fig. The idea is very similar to R-CNN, but with segments.
1. Object proposals. Use MCG to generate 2k object proposals per image.
2. Feature extraction. We use two separate CNNs to extract features, one from the cropped bounding box and one from the region foreground.
3. Region classification. We train an SVM for each category on top of the CNN features and use it to score every region.
4. Region refinement. Because many of the scored regions overlap, we apply non-maximum suppression (NMS) afterwards.
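The NMS in step 4 can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes overlap is measured as intersection over union (IoU) between region masks, represents each mask as a set of pixel coordinates, and uses a placeholder threshold of 0.3. The names `mask_iou`, `region_nms`, and `rect_mask` are hypothetical helpers introduced only for this example.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]  # a region is the set of (row, col) pixels it covers


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def region_nms(masks: List[Mask], scores: List[float],
               iou_thresh: float = 0.3) -> List[int]:
    """Greedy NMS over regions.

    Visit regions from highest to lowest score and keep a region only if
    its mask IoU with every already-kept region is at most iou_thresh.
    The threshold is an assumed value, not one taken from the paper.
    """
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(mask_iou(masks[i], masks[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def rect_mask(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Toy helper: a rectangular mask covering rows r0..r1-1, cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two heavily overlapping regions and one separate region.
masks = [rect_mask(0, 0, 4, 4), rect_mask(0, 1, 4, 5), rect_mask(10, 10, 12, 12)]
scores = [0.9, 0.8, 0.7]
kept = region_nms(masks, scores)
print(kept)
```

Here the second region overlaps the first with IoU 0.6 and is suppressed, while the third does not overlap anything and survives. Measuring overlap on masks rather than bounding boxes matches the "R-CNN, but with segments" framing, but a box-based IoU would slot into the same loop.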