[detectron2] Mask R-CNN Code Notes

Latest recommended article published on 2024-10-30 20:09:25

Ah丶Weii

Category: Notes. Tags: cnn, python

Article link: https://blog.csdn.net/weixin_43823854/article/details/118083494

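This article walks through the implementation of Mask R-CNN, including the enhanced FPN backbone, the matching and sampling strategy in RPN_head, and inference and loss computation in ROI_head. In RPN_head, positive and negative samples are matched by IoU and trained at a 1:1 ratio. ROI_head filters proposals by score, performs ROI pooling, and computes the losses. Finally, it looks into the details of the mask branch, in particular how the mask loss is computed.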

Main code file paths:
Overall architecture: detectron2/detectron2/modeling/meta_arch/rcnn.py
Default config: detectron2/detectron2/config/defaults.py
RPN_head：detectron2/detectron2/modeling/proposal_generator/rpn.py
Mask_head: detectron2/detectron2/modeling/roi_heads/mask_head.py
fast_rcnn(box_loss) : detectron2/detectron2/modeling/roi_heads/fast_rcnn.py
pooler: detectron2/detectron2/modeling/poolers.py
roi_head: detectron2/detectron2/modeling/roi_heads/roi_heads.py
sampling label: detectron2/detectron2/modeling/sampling.py
match: detectron2/detectron2/modeling/matcher.py

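Summary:

The Mask/Faster R-CNN code that I put off for a year is finally more or less done. I got lost several times in the theory behind the RPN and the ROI heads, and, out of laziness, I felt that detectron2 had split the Faster R-CNN code into too many scattered pieces.
Coming back a year later to the code of this ICCV 2017 best paper, many things are no longer that hard. I had already gotten used to many of the pitfalls through the code of later papers (for example, understanding the deltas offsets and the pooler's RoI Align operation, from Sparse R-CNN; understanding the FPN backbone, from FCOS; the NMS operation; and understanding the outputs of the later box branch and mask branch). This semester I went from fumbling through environment setup just to start training, when the code felt too complex to digest, to stepping through the outputs one pdb breakpoint at a time, to now having read through three types of detectors: anchor-free, anchor-based, and end-to-end without NMS. That is a kind of growth too, and I believe that after reading complex detection code, other code will be easier to get into.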

1. FPN Backbone

The FPN adds one extra output level, p6.

    backbone = FPN(
        bottom_up=bottom_up,
        in_features=in_features,
        out_channels=out_channels,  
        norm=cfg.MODEL.FPN.NORM,
        top_block=LastLevelMaxPool(),  # adds an extra p6 level
        fuse_type=cfg.MODEL.FPN.FUSE_TYPE,
    )
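As far as I understand detectron2's `LastLevelMaxPool`, p6 is obtained by max-pooling p5 with kernel size 1 and stride 2, which amounts to keeping every second row and column. A minimal pure-Python sketch of that subsampling follows; the helper name `subsample_stride2` is hypothetical, and nested lists stand in for a feature tensor.

```python
def subsample_stride2(feature_map):
    """Mimic max pooling with kernel_size=1, stride=2 on a 2D map.

    With a 1x1 window the "max" is just the element itself, so the
    operation reduces to taking every second row and every second column.
    """
    return [row[::2] for row in feature_map[::2]]


# Toy 4x4 stand-in for a single-channel p5 feature map.
p5 = [[r * 4 + c for c in range(4)] for r in range(4)]
p6 = subsample_stride2(p5)
print(p6)
```

The point is only that p6 halves the spatial resolution of p5 without adding any learned parameters.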

2. RPN_head

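The RPN loss labels anchors by IoU as ignore, positive, or negative.
The RPN loss samples positives and negatives at a 1:1 ratio: 256 anchors per image, up to 128 of each.
The proposals passed to the ROI head are selected by sorting the objectness branch scores in descending order.
For Mask R-CNN with FPN, pairwise_iou and the Matcher determine which anchors count as positive, negative, or ignored in the RPN.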
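The IoU-based labeling can be sketched in plain Python. This is a simplified illustration, not detectron2's `Matcher`: the function names `iou` and `label_anchors` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples, the thresholds 0.3 and 0.7 are taken from the RPN config shown in these notes, and the extra handling of low-quality matches is left out.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, low=0.3, high=0.7):
    """Label each anchor 1 (positive), 0 (negative) or -1 (ignored).

    The label depends on the best IoU the anchor reaches with any
    ground-truth box: below `low` is negative, at or above `high` is
    positive, anything in between is ignored.
    """
    labels = []
    for anchor in anchors:
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        if best >= high:
            labels.append(1)
        elif best < low:
            labels.append(0)
        else:
            labels.append(-1)
    return labels


gt_boxes = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30)]
labels = label_anchors(anchors, gt_boxes)
print(labels)
```

Here the second anchor overlaps the ground-truth box with IoU 0.5, so it falls between the two thresholds and does not contribute to the loss.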

        # forward rpn
        if self.proposal_generator is not None:
            proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)

  (proposal_generator): RPN(
    (rpn_head): StandardRPNHead(
      (conv): Conv2d(
        256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)
        (activation): ReLU()
      )
      (objectness_logits): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))
      (anchor_deltas): Conv2d(256, 12, kernel_size=(1, 1), stride=(1, 1))
    )
    (anchor_generator): DefaultAnchorGenerator(
      (cell_anchors): BufferList()
    )
  )

# Important config values
_C.MODEL.RPN.IOU_THRESHOLDS = [0.3, 0.7]
_C.MODEL.RPN.IOU_LABELS = [0, -1, 1]
_C.MODEL.RPN.BATCH_SIZE_PER_IMAGE = 256
_C.MODEL.RPN.POSITIVE_FRACTION = 0.5
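How `BATCH_SIZE_PER_IMAGE` and `POSITIVE_FRACTION` interact can be sketched as follows. This is my reading of the sampling logic rather than detectron2's own code: the function name `sample_labels` and the `seed` parameter are hypothetical, and labels follow the 1 / 0 / -1 convention of `IOU_LABELS`. Positives are capped at 128, and negatives fill whatever is left of the 256 slots, so the ratio is 1:1 only when enough positives exist.

```python
import random


def sample_labels(labels, batch_size=256, positive_fraction=0.5, seed=0):
    """Pick indices of positive and negative anchors for the RPN loss.

    At most batch_size * positive_fraction positives are drawn; the
    remaining slots are filled with negatives. Ignored anchors (-1)
    are never sampled.
    """
    rng = random.Random(seed)
    positives = [i for i, label in enumerate(labels) if label == 1]
    negatives = [i for i, label in enumerate(labels) if label == 0]
    num_pos = min(len(positives), int(batch_size * positive_fraction))
    num_neg = min(len(negatives), batch_size - num_pos)
    return rng.sample(positives, num_pos), rng.sample(negatives, num_neg)


# Typical imbalance: few positives, many negatives, some ignored.
labels = [1] * 10 + [0] * 1000 + [-1] * 50
pos_idx, neg_idx = sample_labels(labels)
print(len(pos_idx), len(neg_idx))
```

With only 10 positives available, the batch ends up with 10 positives and 246 negatives.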


# NMS settings
# The following applies when only the C5 level is used

_C.MODEL.RPN.PRE_NMS_TOPK_TRAIN = 12000
_C.MODEL.RPN.PRE_NMS_TOPK_TEST = 6000
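The role of the pre-NMS and post-NMS top-k limits can be illustrated with a small greedy NMS in plain Python. This is a sketch under assumptions, not detectron2's implementation: the names `iou` and `select_proposals` are hypothetical, boxes are `(x1, y1, x2, y2)` tuples, and the 0.7 NMS threshold is an assumed value.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def select_proposals(boxes, scores, pre_nms_topk, post_nms_topk, nms_thresh=0.7):
    """Return indices of the proposals kept after top-k filtering and NMS.

    1. Sort by objectness score (descending) and keep the best pre_nms_topk.
    2. Greedy NMS: drop a box if it overlaps an already kept box by more
       than nms_thresh.
    3. Stop once post_nms_topk boxes have been kept.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_topk]
    keep = []
    for i in order:
        if len(keep) == post_nms_topk:
            break
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = select_proposals(boxes, scores, pre_nms_topk=3, post_nms_topk=2)
print(kept)
```

The second box is suppressed because its IoU with the top-scoring box is 0.9, and the fourth never reaches NMS because it falls outside the pre-NMS top 3.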

# With a 5-level FPN-style structure:
# during training, 2000 proposals before NMS and 1000 after NMS
POST_NMS_TOPK_TRAIN: 2000 # 1000
POST_NMS_TOPK_TEST:
