TensorFlow Object Detection API source-code reading notes: Mask R-CNN

Wayne2019

Category: TensorFlow. Tags: Tensorflow, mask-r-cnn, object detection, computer vision, deep learning

Article link: https://blog.csdn.net/Wayne2019/article/details/78780944

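In this post we trace Mask R-CNN through the TensorFlow Object Detection API source code. The conclusions first: the TensorFlow Object Detection API implements ROI Align and implements the mask branch (with slight differences). At the time of writing, no pre-trained mask model is provided.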

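There are many survey posts on the detection family on blogs and on Zhihu, for example:

Object detection: the R-CNN series

Mask R-CNN: a technical walkthrough

A brief history of CNNs in image segmentation: from R-CNN to Mask R-CNN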

Here we focus mainly on the implementation details in the TensorFlow Object Detection API code.

ROI align vs ROI pooling
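Because of rounding, the features produced by ROI pooling do not correspond exactly to the ROI's coordinates in the original image (see "ROI Pooling layer explained (Caffe)"). The original ROI pooling is a special case of SPP: the ROI is first quantized to the feature-map grid and then divided into bins that are max-pooled, and, as in SPP, the bins may differ slightly in size. ROI Align performs no quantization: it divides the ROI evenly into bins and uses bilinear interpolation to compute the value at each sample point inside the ROI.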
FAQs: how to sample grid points within a cell?
• 4 regular points in 2x2 sub-cells
• other implementation could work
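
The difference described above can be sketched in NumPy. This is a minimal illustration, not the API's code: the function names `bilinear`, `roi_pool`, and `roi_align` are hypothetical, the ROI is assumed to be given in feature-map coordinates already, the quantized pooling only approximates the Caffe-style scheme, and the four samples per bin are averaged here (taking the max would also fit the description).

```python
import numpy as np


def bilinear(feat, y, x):
    """Bilinearly interpolate the 2-D array `feat` at the continuous point (y, x)."""
    h, w = feat.shape
    # Clamp to the valid range so neighbours always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0, x0] * (1 - dx) + feat[y0, x1] * dx
    bottom = feat[y1, x0] * (1 - dx) + feat[y1, x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_pool(feat, roi, out_size=2):
    """Quantized ROI max pooling: round the ROI to the grid, then max-pool each bin."""
    h, w = feat.shape
    # Quantization step: corners are rounded (half up) to integer grid cells.
    y1, x1, y2, x2 = (int(np.floor(v + 0.5)) for v in roi)
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            # Bin borders use floor/ceil, so bins may differ slightly in size.
            ys = min(max(y1 + int(np.floor(i * roi_h / out_size)), 0), h)
            ye = min(max(y1 + int(np.ceil((i + 1) * roi_h / out_size)), 0), h)
            xs = min(max(x1 + int(np.floor(j * roi_w / out_size)), 0), w)
            xe = min(max(x1 + int(np.ceil((j + 1) * roi_w / out_size)), 0), w)
            if ye > ys and xe > xs:
                out[i, j] = feat[ys:ye, xs:xe].max()
    return out


def roi_align(feat, roi, out_size=2, samples=2):
    """ROI Align: no quantization; average samples x samples bilinear points per bin."""
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            vals = []
            for sy in range(samples):
                for sx in range(samples):
                    # Regular sample points at the centres of the sub-cells.
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    vals.append(bilinear(feat, y, x))
            out[i, j] = np.mean(vals)
    return out


# Toy feature map whose value encodes position: feat[y, x] = 10 * y + x.
feat = np.arange(4)[:, None] * 10.0 + np.arange(4)[None, :]
roi = (0.5, 0.5, 2.5, 2.5)  # (y1, x1, y2, x2), deliberately off-grid
print(roi_align(feat, roi))
print(roi_pool(feat, roi))
```

Because the toy feature map is linear in position, `roi_align` returns the value at each bin's center, while `roi_pool` returns values read from the rounded grid, which makes the misalignment visible.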

The TensorFlow Object Detection API implements ROI pooling differently: "Additionally, instead of using the ROI Pooling layer and Position-sensitive ROI Pooling layers used by [31, 6], we use Tensorflow's 'crop and resize' operation which uses bilinear interpolation to resample part of an image onto a fixed sized grid." The code is:

"""
The ROI is a crop of features_to_crop (the feature map produced by the first-stage feature extractor), cut out according to the RPN's proposal_boxes.
"""
def _compute_second_stage_input_feature_maps(self, features_to_crop,
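
The listing above is cut off in the source. Independently of the API's own code, the sampling behind a "crop and resize" operation can be sketched in NumPy. This is a hypothetical re-implementation for illustration only (the real op is `tf.image.crop_and_resize`): it assumes a single-channel 2-D feature map, a box given as normalized `(y1, x1, y2, x2)`, and a box that lies inside the image. Out-of-range samples are simply clamped here, whereas the TensorFlow op fills them with its `extrapolation_value`.

```python
import numpy as np


def bilinear(feat, y, x):
    """Bilinearly interpolate the 2-D array `feat` at the continuous point (y, x)."""
    h, w = feat.shape
    y = min(max(y, 0.0), h - 1.0)  # clamping is a simplification of this sketch
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0, x0] * (1 - dx) + feat[y0, x1] * dx
    bottom = feat[y1, x0] * (1 - dx) + feat[y1, x1] * dx
    return top * (1 - dy) + bottom * dy


def sample_coords(lo, hi, size, n_out):
    """Evenly spaced sample positions; normalized 0 maps to pixel 0, 1 to pixel size - 1."""
    if n_out > 1:
        step = (hi - lo) * (size - 1) / (n_out - 1)
        return lo * (size - 1) + np.arange(n_out) * step
    # A single output cell samples the centre of the box.
    return np.array([0.5 * (lo + hi) * (size - 1)])


def crop_and_resize(feat, box, crop_size):
    """Resample the normalized box (y1, x1, y2, x2) of `feat` onto a fixed grid."""
    h, w = feat.shape
    y1, x1, y2, x2 = box
    crop_h, crop_w = crop_size
    ys = sample_coords(y1, y2, h, crop_h)
    xs = sample_coords(x1, x2, w, crop_w)
    out = np.zeros((crop_h, crop_w))
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            # One bilinear sample per output cell, with no averaging inside a bin.
            out[i, j] = bilinear(feat, y, x)
    return out


# Toy feature map whose value encodes position: feat[y, x] = 10 * y + x.
feat = np.arange(4)[:, None] * 10.0 + np.arange(4)[None, :]
print(crop_and_resize(feat, (0.0, 0.0, 1.0, 1.0), (2, 2)))
```

Unlike the ROI Align description earlier, which aggregates several sample points per bin, this resampling takes one interpolated value per output cell, with the first and last samples sitting on the box edges.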
