maskrcnn-benchmark-master (10): the box_head loss file

Contents

Preface

I. The make_roi_box_loss_evaluator() function

II. FastRCNNLossComputation

 1. The __init__() function

 2. The match_targets_to_proposals() function

 3. The prepare_targets() function

 4. The subsample() function

 5. The __call__() function


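Preface

The previous post covered the box_head inference file, so we now know how box_head filters the boxes (Proposals) at inference time to produce the final instances. This post explains how the loss is computed in the box_head stage. With the earlier walkthrough of the RPN loss file behind us, the box_head loss file is much easier to follow, because the functions it uses are largely the same as those in the RPN loss file.

I. The make_roi_box_loss_evaluator() function

The loss computation for box_head lives in your_project/maskrcnn_benchmark/modeling/roi_heads/box_head/loss.py. Let's start with the make_roi_box_loss_evaluator() function: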

def make_roi_box_loss_evaluator(cfg):
    # Matcher: assigns ground-truth (GT) matches to the Proposals that the RPN passes to ROI_heads
    matcher = Matcher(
        cfg.MODEL.ROI_HEADS.FG_IOU_THRESHOLD,
        cfg.MODEL.ROI_HEADS.BG_IOU_THRESHOLD,
        allow_low_quality_matches=False,
    )
    # Encoder/decoder for the boxes
    bbox_reg_weights = cfg.MODEL.ROI_HEADS.BBOX_REG_WEIGHTS
    box_coder = BoxCoder(weights=bbox_reg_weights)
    # Samples positive and negative Proposals (from those fed to box_head) for training
    fg_bg_sampler = BalancedPositiveNegativeSampler(
        cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE, cfg.MODEL.ROI_HEADS.POSITIVE_FRACTION
    )
    # Whether box regression is class-agnostic; it can be ignored for this walkthrough
    cls_agnostic_bbox_reg = cfg.MODEL.CLS_AGNOSTIC_BBOX_REG
    # Loss evaluator: computes the loss for the whole box_head part
    loss_evaluator = FastRCNNLossComputation(
        matcher,
        fg_bg_sampler,
        box_coder,
        cls_agnostic_bbox_reg
    )

    return loss_evaluator

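This code looks very similar to the RPN loss file; in fact it is almost identical. The RPN outputs Proposals, which serve as the input to ROI_heads. During training, however, box_head uses only a sampled subset of those Proposals as its input.

As the code shows, the function is assembled mainly from objects of three classes (plus the BoxCoder that encodes and decodes boxes):

Matcher: assigns a ground-truth (GT) match, and hence a class label, to each Proposal output by the RPN.
BalancedPositiveNegativeSampler: selects which of those Proposals are used as positive and negative samples when computing the loss.
FastRCNNLossComputation: computes the loss for the Proposals that survive this sampling (the selection mechanism differs from the one used at inference).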
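As a quick reminder of what the Matcher does with the two IoU thresholds, here is a minimal pure-Python sketch. It is not the library code: the function name match_proposals, the IoU values, and the thresholds 0.5/0.3 are made up for illustration, and the sentinel values are assumed to mirror Matcher.BELOW_LOW_THRESHOLD and Matcher.BETWEEN_THRESHOLDS.

```python
# Sentinel values, assumed to mirror Matcher.BELOW_LOW_THRESHOLD and
# Matcher.BETWEEN_THRESHOLDS in maskrcnn-benchmark.
BELOW_LOW_THRESHOLD = -1
BETWEEN_THRESHOLDS = -2


def match_proposals(iou_matrix, high_threshold, low_threshold):
    """iou_matrix[g][p] is the IoU between GT box g and proposal p.

    Returns one entry per proposal: the index of its best GT box,
    or a negative sentinel if the best IoU is too low.
    """
    num_proposals = len(iou_matrix[0])
    matches = []
    for p in range(num_proposals):
        column = [row[p] for row in iou_matrix]
        best_iou = max(column)
        best_gt = column.index(best_iou)
        if best_iou < low_threshold:
            matches.append(BELOW_LOW_THRESHOLD)   # background
        elif best_iou < high_threshold:
            matches.append(BETWEEN_THRESHOLDS)    # ignored
        else:
            matches.append(best_gt)               # foreground
    return matches


# Hypothetical IoU matrix: 2 GT boxes (rows) x 4 proposals (columns).
iou = [
    [0.80, 0.10, 0.45, 0.05],
    [0.20, 0.65, 0.30, 0.00],
]
matches = match_proposals(iou, high_threshold=0.5, low_threshold=0.3)
print(matches)  # [0, 1, -2, -1]
```

When the two thresholds are equal, no proposal can fall between them, so every proposal ends up as either foreground or background.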

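Since the Matcher and BalancedPositiveNegativeSampler classes were already covered in:

maskrcnn-benchmark-master (6): the RPN loss file

this post focuses on the FastRCNNLossComputation class.

II. FastRCNNLossComputation

The FastRCNNLossComputation class contains five functions: __init__(), match_targets_to_proposals(), prepare_targets(), subsample(), and __call__(). Their call relationships are simple: subsample() calls prepare_targets(), which in turn calls match_targets_to_proposals() for each image, and __call__() computes the loss on the Proposals that subsample() stored.

1. The __init__() function

Let's first look at the __init__() function: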

class FastRCNNLossComputation(object):
    """
    Computes the loss for Faster R-CNN.
    Also supports FPN
    """

    def __init__(
        self,
        proposal_matcher,
        fg_bg_sampler,
        box_coder,
        cls_agnostic_bbox_reg=False
    ):
        """
        Arguments:
            proposal_matcher (Matcher)
            fg_bg_sampler (BalancedPositiveNegativeSampler)
            box_coder (BoxCoder)
        """
        # Matcher used to assign GT matches to the Proposals
        self.proposal_matcher = proposal_matcher
        # Sampler used to select positive and negative samples
        self.fg_bg_sampler = fg_bg_sampler
        # Encoder/decoder for the boxes
        self.box_coder = box_coder
        self.cls_agnostic_bbox_reg = cls_agnostic_bbox_reg

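2. The match_targets_to_proposals() function

The __init__() function only stores the relevant objects as instance attributes, so there is little to explain. Next, let's look at match_targets_to_proposals():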

    def match_targets_to_proposals(self, proposal, target):
        # IoU matrix between the GT boxes and the Proposals output by the RPN
        match_quality_matrix = boxlist_iou(target, proposal)

        # For each proposal, the index of its matched GT; proposals below the low
        # threshold (background) get -1 and those between the thresholds get -2
        # e.g. matched_idxs[4] = 6 means the 5th proposal is assigned the GT with index 6
        matched_idxs = self.proposal_matcher(match_quality_matrix)
        # Fast RCNN only need "labels" field for selecting the targets
        # Keep only the class labels of the GT
        target = target.copy_with_fields("labels")
        # get the targets corresponding GT for each proposal
        # NB: need to clamp the indices because we can have a single
        # GT in the image, and matched_idxs can be -2, which goes
        # out of bounds

        # clamp(min=0) temporarily points every background and in-between proposal
        # at the first GT; overall this reorders the boxes and labels in target to
        # follow the proposal order, so that the i-th entry of matched_targets is
        # the GT assigned to the i-th proposal
        matched_targets = target[matched_idxs.clamp(min=0)]
        # Attach the match indices to the resulting BoxList
        matched_targets.add_field("matched_idxs", matched_idxs)
        return matched_targets

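3. The prepare_targets() function

So match_targets_to_proposals() returns a BoxList whose boxes are the GT boxes matched to the Proposals and whose labels are the labels of those GT boxes. Background and in-between Proposals temporarily carry the first GT's box and label; their labels are corrected afterwards using matched_idxs.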
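The clamp-then-relabel idea can be sketched in plain Python lists instead of tensors. This is only an illustration: the helper name gather_matched_labels and the sample values are invented here, and the sentinels -1 (background) and -2 (between thresholds) are assumed to match the Matcher constants.

```python
def gather_matched_labels(gt_labels, matched_idxs):
    """Pick one GT label per proposal, clamping negative indices to 0."""
    clamped = [max(idx, 0) for idx in matched_idxs]
    return [gt_labels[idx] for idx in clamped]


gt_labels = [7, 3]              # class labels of two hypothetical GT boxes
matched_idxs = [0, 1, -2, -1]   # per-proposal match results

# Step 1: after clamping, negative matches borrow the first GT's label.
labels = gather_matched_labels(gt_labels, matched_idxs)
print(labels)  # [7, 3, 7, 7]

# Step 2: the borrowed labels are overwritten using the original indices.
for i, idx in enumerate(matched_idxs):
    if idx == -1:      # below the low threshold: background
        labels[i] = 0
    elif idx == -2:    # between the thresholds: ignored by the sampler
        labels[i] = -1
print(labels)  # [7, 3, -1, 0]
```

Clamping exists only to keep the indexing in bounds; the borrowed labels never reach the loss because step 2 replaces them.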

Next, let's look at the prepare_targets() function:

    # Prepare the class labels and the box-offset regression targets
    def prepare_targets(self, proposals, targets):
        # List of class labels
        labels = []
        # List of box regression targets
        regression_targets = []
        # Process each image separately
        for proposals_per_image, targets_per_image in zip(proposals, targets):
            matched_targets = self.match_targets_to_proposals(
                proposals_per_image, targets_per_image
            )
            matched_idxs = matched_targets.get_field("matched_idxs")
            # Get the class label of the GT matched to each proposal
            labels_per_image = matched_targets.get_field("labels")
            labels_per_image = labels_per_image.to(dtype=torch.int64)

            # Label background (below the low threshold)
            bg_inds = matched_idxs == Matcher.BELOW_LOW_THRESHOLD
            labels_per_image[bg_inds] = 0

            # Label ignore proposals (between low and high thresholds)
            ignore_inds = matched_idxs == Matcher.BETWEEN_THRESHOLDS
            labels_per_image[ignore_inds] = -1  # -1 is ignored by sampler

            # compute regression targets
            # The network predicts offsets, so the targets must be offsets as well
            regression_targets_per_image = self.box_coder.encode(
                matched_targets.bbox, proposals_per_image.bbox
            )
            # Store the class labels and offset targets for this image
            labels.append(labels_per_image)
            regression_targets.append(regression_targets_per_image)

        return labels, regression_targets
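To make the "offset target" concrete, here is a minimal sketch of what an encode step of this kind computes for a single box pair, using the standard R-CNN center/size parameterization. It is an assumption-laden illustration rather than the BoxCoder source: the function name encode_box, the weights (10, 10, 5, 5), and the +1 pixel convention for widths and heights are all assumed here.

```python
import math


def encode_box(gt_box, proposal, weights=(10.0, 10.0, 5.0, 5.0)):
    """Offsets (dx, dy, dw, dh) that move `proposal` onto `gt_box`.

    Boxes are (x1, y1, x2, y2). The +1 below assumes inclusive pixel
    coordinates; the weights are illustrative defaults.
    """
    to_remove = 1
    ex_w = proposal[2] - proposal[0] + to_remove
    ex_h = proposal[3] - proposal[1] + to_remove
    ex_cx = proposal[0] + 0.5 * ex_w
    ex_cy = proposal[1] + 0.5 * ex_h

    gt_w = gt_box[2] - gt_box[0] + to_remove
    gt_h = gt_box[3] - gt_box[1] + to_remove
    gt_cx = gt_box[0] + 0.5 * gt_w
    gt_cy = gt_box[1] + 0.5 * gt_h

    wx, wy, ww, wh = weights
    dx = wx * (gt_cx - ex_cx) / ex_w      # center shift, relative to width
    dy = wy * (gt_cy - ex_cy) / ex_h      # center shift, relative to height
    dw = ww * math.log(gt_w / ex_w)       # log scale change in width
    dh = wh * math.log(gt_h / ex_h)       # log scale change in height
    return (dx, dy, dw, dh)


# A proposal that already coincides with its GT needs no correction.
same = encode_box((0, 0, 9, 9), (0, 0, 9, 9))
# A GT shifted right by half the proposal width, same size.
shifted = encode_box((5, 0, 14, 9), (0, 0, 9, 9))
print(same, shifted)
```

Note that offsets are also computed for background Proposals (against the first GT they were clamped to), but those values are never used because the regression loss only covers positives.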

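4. The subsample() function

prepare_targets() thus returns the class labels and box-offset regression targets matched to the Proposals. Next, subsample() selects the positive and negative samples; here is the relevant code: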

    def subsample(self, proposals, targets):
        """
        This method performs the positive/negative sampling, and return
        the sampled proposals.
        Note: this function keeps a state.

        Arguments:
            proposals (list[BoxList])
            targets (list[BoxList])
        """
        # Get the labels and regression targets assigned to the Proposals
        labels, regression_targets = self.prepare_targets(proposals, targets)
        # Get the masks of sampled positives and negatives,
        # as chosen by BalancedPositiveNegativeSampler
        sampled_pos_inds, sampled_neg_inds = self.fg_bg_sampler(labels)

        proposals = list(proposals)
        # add corresponding label and regression_targets information to the bounding boxes
        for labels_per_image, regression_targets_per_image, proposals_per_image in zip(
            labels, regression_targets, proposals
        ):
            # Attach the label information to the Proposals (BoxList objects)
            proposals_per_image.add_field("labels", labels_per_image)
            proposals_per_image.add_field(
                "regression_targets", regression_targets_per_image
            )

        # distributed sampled proposals, that were obtained on all feature maps
        # concatenated via the fg_bg_sampler, into individual feature map levels
        # Keep only the sampled Proposals (their attached fields are kept with them)
        for img_idx, (pos_inds_img, neg_inds_img) in enumerate(
            zip(sampled_pos_inds, sampled_neg_inds)
        ):
            img_sampled_inds = torch.nonzero(pos_inds_img | neg_inds_img).squeeze(1)
            proposals_per_image = proposals[img_idx][img_sampled_inds]
            proposals[img_idx] = proposals_per_image
        # The sampled Proposals (BoxList objects that carry their labels)
        self._proposals = proposals
        return proposals
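The sampling performed by fg_bg_sampler can be approximated with the following pure-Python sketch for one image. It is a simplified stand-in, not the BalancedPositiveNegativeSampler source: the function name sample_balanced, the label values, and the batch size and fraction used in the example are assumptions for illustration.

```python
import random


def sample_balanced(labels, batch_size_per_image, positive_fraction, rng):
    """Return the sorted indices of the proposals kept for one image.

    labels: >= 1 for foreground, 0 for background, -1 for ignored.
    """
    positive = [i for i, label in enumerate(labels) if label >= 1]
    negative = [i for i, label in enumerate(labels) if label == 0]

    # Cap positives at the requested fraction, then fill up with negatives.
    num_pos = min(len(positive), int(batch_size_per_image * positive_fraction))
    num_neg = min(len(negative), batch_size_per_image - num_pos)

    sampled_pos = rng.sample(positive, num_pos)
    sampled_neg = rng.sample(negative, num_neg)
    return sorted(sampled_pos + sampled_neg)


# Hypothetical labels for 10 proposals; index 3 is ignored (-1).
labels = [3, 0, 0, -1, 5, 0, 0, 0, 2, 0]
kept = sample_balanced(labels, batch_size_per_image=4,
                       positive_fraction=0.25, rng=random.Random(0))
print(kept)
```

With a batch of 4 and a positive fraction of 0.25, at most one positive is kept and the remaining three slots go to background Proposals; ignored Proposals are never selected.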

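5. The __call__() function

Once subsample() has selected the Proposals to be used for training (note that they are all drawn from the Proposals output by the RPN), the final step is computing the loss. Let's look at the __call__() function: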

    def __call__(self, class_logits, box_regression):
        """
        Computes the loss for Faster R-CNN.
        This requires that the subsample method has been called beforehand.

        Arguments:
            class_logits (list[Tensor])
            box_regression (list[Tensor])

        Returns:
            classification_loss (Tensor)
            box_loss (Tensor)
        """
        # Predicted class logits for the Proposals
        class_logits = cat(class_logits, dim=0)
        # Predicted box offsets for the Proposals
        box_regression = cat(box_regression, dim=0)
        device = class_logits.device

        if not hasattr(self, "_proposals"):
            raise RuntimeError("subsample needs to be called before")
        # The Proposals fed to box_head during training, with their labels attached
        proposals = self._proposals
        # Ground-truth class labels of the proposals
        labels = cat([proposal.get_field("labels") for proposal in proposals], dim=0)
        # Ground-truth box offsets of the proposals
        regression_targets = cat(
            [proposal.get_field("regression_targets") for proposal in proposals], dim=0
        )
        # Classification loss
        classification_loss = F.cross_entropy(class_logits, labels)

        # get indices that correspond to the regression targets for
        # the corresponding ground truth labels, to be used with
        # advanced indexing
        # No box regression loss is computed for negative samples,
        # so pick out the indices of the positives
        sampled_pos_inds_subset = torch.nonzero(labels > 0).squeeze(1)
        labels_pos = labels[sampled_pos_inds_subset]
        if self.cls_agnostic_bbox_reg:
            map_inds = torch.tensor([4, 5, 6, 7], device=device)
        else:
            map_inds = 4 * labels_pos[:, None] + torch.tensor(
                [0, 1, 2, 3], device=device)
        # Regression loss on the box offsets
        box_loss = smooth_l1_loss(
            box_regression[sampled_pos_inds_subset[:, None], map_inds],
            regression_targets[sampled_pos_inds_subset],
            size_average=False,
            beta=1,
        )
        box_loss = box_loss / labels.numel()

        return classification_loss, box_loss
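The least obvious part of __call__() is the column selection through map_inds: each row of box_regression holds 4 offsets per class, and only the 4 columns belonging to the proposal's ground-truth class enter the loss. The sketch below replays that logic with plain lists. The function names and the numbers are invented for illustration, and the smooth L1 formula is the usual one with a beta parameter, assumed to match the smooth_l1_loss helper called above.

```python
def smooth_l1_sum(pred, target, beta=1.0):
    """Summed smooth L1: quadratic below beta, linear above it."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total


def box_regression_loss(box_regression, labels, regression_targets,
                        cls_agnostic=False):
    """Sum the loss over positives only, then divide by ALL sampled proposals."""
    total = 0.0
    for row, label, target in zip(box_regression, labels, regression_targets):
        if label <= 0:
            continue  # background proposals contribute no regression loss
        # Columns 4*label .. 4*label+3 hold the offsets for this class;
        # in the class-agnostic case columns 4..7 are always used.
        start = 4 if cls_agnostic else 4 * label
        total += smooth_l1_sum(row[start:start + 4], target)
    return total / len(labels)


# Hypothetical setup: 3 classes (background + 2), so 12 columns per row.
box_regression = [
    [0.0] * 8 + [0.5, 0.0, 0.0, 2.0],        # proposal with label 2
    [9.0] * 12,                               # background proposal
    [0.0] * 4 + [1.0, 1.0, 1.0, 1.0] + [0.0] * 4,  # proposal with label 1
]
labels = [2, 0, 1]
regression_targets = [[0.0] * 4, [0.0] * 4, [1.0] * 4]

loss = box_regression_loss(box_regression, labels, regression_targets)
print(loss)
```

Dividing by the number of all sampled Proposals rather than by the number of positives keeps the box loss on the same per-sample scale as the cross-entropy term, which is averaged over the same set.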

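That completes the walkthrough of the box_head loss file. To summarize the whole process:

1. Match each Proposal to a class label and a GT box, and from those compute the box-offset regression targets.

2. Select positive and negative samples from the labelled Proposals (only after labels have been matched do we know which are positive and which are negative).

3. Combine the network's final classification and box-offset regression outputs for the Proposals with the matched labels to compute the loss.

That also completes the box_head part; the next post will cover mask_head. To be continued~

Writing these posts takes effort; please do not repost without permission!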
