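Current instance segmentation methods fall into three categories:
- top-down, also called detect-then-segment: as the name suggests, objects are detected first and then segmented, e.g. FCIS, Mask R-CNN, PANet, Mask Scoring R-CNN;
- bottom-up, also called embedding-cluster: each instance is treated as its own class; then, following a clustering approach (maximize inter-class distance, minimize intra-class distance), an embedding is learned for every pixel and the pixels are finally grouped into separate instances. Grouping methods include learned associative embedding, a discriminative loss function, SGN, and SSAP. Bottom-up methods generally perform worse than top-down ones;
- direct methods: unlike the two categories above, these produce the instance segmentation result directly, e.g. SOLO.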
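
The clustering idea behind the bottom-up category (small intra-instance distance, large inter-instance distance) can be illustrated with a minimal NumPy sketch. It is loosely modeled on the pull/push terms of a discriminative loss; the function name `pull_push_loss` and the margins `delta_pull` and `delta_push` are assumptions made for illustration, not values taken from any of the papers listed here.

```python
import numpy as np

def pull_push_loss(embeddings, labels, delta_pull=0.5, delta_push=1.5):
    """Toy pull/push loss over pixel embeddings.

    embeddings: (P, D) array, one D-dim embedding per pixel.
    labels:     (P,) array of instance ids for the same pixels.
    The margins are illustrative assumptions.
    """
    ids = np.unique(labels)
    # One center per instance: the mean embedding of its pixels.
    centers = np.stack([embeddings[labels == i].mean(axis=0) for i in ids])

    # Pull term: pixels farther than delta_pull from their own center are penalized.
    pull = 0.0
    for center, i in zip(centers, ids):
        dist = np.linalg.norm(embeddings[labels == i] - center, axis=1)
        pull += np.mean(np.maximum(dist - delta_pull, 0.0) ** 2)
    pull /= len(ids)

    # Push term: centers closer than 2 * delta_push to each other are penalized.
    push, n_pairs = 0.0, 0
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            dist = np.linalg.norm(centers[a] - centers[b])
            push += np.maximum(2.0 * delta_push - dist, 0.0) ** 2
            n_pairs += 1
    if n_pairs > 0:
        push /= n_pairs

    return float(pull + push)
```

Once embeddings are trained with a loss of this kind, grouping reduces to clustering pixels around the instance centers; the clustering step itself is not shown here.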
Contents
- Deep Snake for Real-Time Instance Segmentation [2001]
- PointRend: Image Segmentation as Rendering [1912]
- SOLO: Segmenting Objects by Locations [1912]
- FCOS: Fully Convolutional One-Stage Object Detection [1904]
- TensorMask: A Foundation for Dense Object Segmentation [1903]
- Hybrid Task Cascade for Instance Segmentation [1901]
- Path Aggregation Network for Instance Segmentation [1803]
- Mask R-CNN [1703]
- Fully Convolutional Instance-aware Semantic Segmentation [1611]
- Deep Watershed Transform for Instance Segmentation [1611]
- InstanceCut: from Edges to Instances with MultiCut [1611]
- Instance-sensitive Fully Convolutional Networks [1603]
- SGN: Sequential Grouping Networks for Instance Segmentation [16XX]
Deep Snake for Real-Time Instance Segmentation [2001]
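Features are learned along the object contour with circular convolution and used to predict per-vertex offsets.
The paper proposes a two-stage, real-time instance segmentation method: (1) obtain an initial object contour; (2) iteratively deform the contour to reach the final, accurate object boundary.
Unlike methods such as CornerNet and ExtremeNet that directly regress points on the object boundary, Deep Snake is inspired by the classic snake algorithm and obtains the final boundary by iteratively deforming an initial contour. The paper uses circular convolution to learn structural features of the contour. It reaches 32.3 fps on 512x512 images on a 1080Ti.
Pipeline: first obtain the detection box, then build a diamond contour from it, learn offsets to get the four extreme points, and construct an octagon contour from them; the octagon is fed into deep snake to learn the boundary.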
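
Because a contour is a closed loop of vertices, a convolution along it has to wrap around from the last vertex to the first. The following is a minimal NumPy sketch of that circular convolution, assuming a single 1-D kernel shared across all feature channels (a simplification; the learned layers in the paper have per-channel weights). The name `circular_conv` is hypothetical.

```python
import numpy as np

def circular_conv(features, kernel):
    """Circular (wrap-around) 1-D convolution along a closed contour.

    features: (N, C) array, C-dim features for N contour vertices in order.
    kernel:   (K,) array with odd K, shared across channels (assumption).
    Returns an (N, C) array; each vertex aggregates its K nearest neighbors
    on the loop, so the first and last vertices see each other.
    """
    n = features.shape[0]
    r = len(kernel) // 2
    # Wrap-pad along the vertex axis so the contour behaves as a ring.
    padded = np.pad(features.astype(float), ((r, r), (0, 0)), mode="wrap")
    out = np.zeros((n, features.shape[1]))
    for k, w in enumerate(kernel):
        out += w * padded[k:k + n]
    return out
```

A useful property of this operation is that rotating the starting vertex of the contour only rotates the output, so the result does not depend on where the vertex list happens to begin.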
PointRend: Image Segmentation as Rendering [1912]
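Method: a subset of points is selected on the output coarse mask and the fine-grained features and used for learning; the proposed subdivision mask rendering algorithm is applied iteratively to obtain the mask in uncertain boundary regions.
Rendering methods: subdivision, adaptive sampling, ray-tracing
Subdivision: values are computed only in regions that differ strongly from their surroundings; all other regions are filled in directly by interpolation.
Point selection: upsample the coarse mask by 2x; select the N points whose probability p is closest to 0.5; predict these N points with an MLP; iterate until the target resolution is reached.
During training, however, the iterative procedure is not used; random sampling is used instead.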
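
One inference-time subdivision step (upsample by 2x, then pick the N most uncertain points) can be sketched in NumPy as follows. Bilinear interpolation is an assumption of this sketch, the function names are hypothetical, and the MLP that re-predicts the selected points is left out.

```python
import numpy as np

def upsample2x_bilinear(prob):
    """Upsample an (H, W) probability map to (2H, 2W) by bilinear interpolation."""
    h, w = prob.shape
    # Output pixel centers expressed in input pixel coordinates.
    ys = (np.arange(2 * h) + 0.5) / 2.0 - 0.5
    xs = (np.arange(2 * w) + 0.5) / 2.0 - 0.5
    # Separable interpolation: along x first, then along y.
    # np.interp clamps at the borders, which replicates edge values.
    tmp = np.stack([np.interp(xs, np.arange(w), row) for row in prob])
    out = np.stack([np.interp(ys, np.arange(h), col) for col in tmp.T], axis=1)
    return out

def select_uncertain_points(prob, n):
    """Return an (n, 2) array of (row, col) for the n points with p closest to 0.5."""
    dist = np.abs(prob - 0.5).ravel()
    idx = np.argsort(dist, kind="stable")[:n]
    rows, cols = np.unravel_index(idx, prob.shape)
    return np.stack([rows, cols], axis=1)

# One subdivision step on a toy 2x2 coarse mask with a vertical boundary.
coarse = np.array([[0.0, 1.0],
                   [0.0, 1.0]])
fine = upsample2x_bilinear(coarse)
points = select_uncertain_points(fine, 8)
```

In the toy mask the selected points all lie in the two columns next to the boundary, which is where a point head would spend its computation; the rest of the map keeps its interpolated values.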
SOLO: Segmenting Objects by Locations [1912]
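Existing methods fall into two categories:
- top-down, also called detect-then-segment: as the name suggests, objects are detected first and then segmented, e.g. FCIS, Mask R-CNN, PANet, Mask Scoring R-CNN, TensorMask;
- bottom-up, also called embedding-cluster: each instance is treated as its own class; then, following a clustering approach (maximize inter-class distance, minimize intra-class distance), an embedding is learned for every pixel and the pixels are finally grouped into separate instances. Grouping methods include learned associative embedding, a discriminative loss function, SGN, and SSAP. Bottom-up methods generally perform worse than top-down ones.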