Learning Mask R-CNN, Day 6 (the Detection Target Layer)

huhu1456

Last modified 2023-03-26 10:43:36


Tags: learning, machine learning, deep learning, python

First published 2023-03-26 10:42:56

Article link: https://blog.csdn.net/huhu1456/article/details/129776504

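The DetectionTarget layer processes the ROIs: it drops invalid (zero-padded) entries and crowd annotations, decides which ROIs are positive and negative samples, and subsamples them to a fixed positive-to-negative ratio. It computes IoU to decide the sample type, assigns a class and box offsets to each positive ROI, and generates the matching mask target. The layer is used only during training, where it builds the targets the model is optimized against.

Summary generated by CSDN using AI.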

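What the Detection Target layer does:

1. From the 2000 ROIs produced earlier, drop the entries that are only zero padding;

2. Some datasets (COCO, for example) contain "crowd" annotations, where a single box encloses several instances; exclude these from training;

3. Decide positive and negative samples from the IoU between each proposal and the ground-truth (GT) boxes: a proposal whose best IoU is at least 0.5 is a positive sample, and a proposal whose IoU with every GT box is below 0.5 (and which does not overlap a crowd box) is a negative sample;

4. Subsample so that positives make up about 33% of the ROIs, which works out to roughly two negatives per positive; the total per image is config.TRAIN_ROIS_PER_IMAGE (given as 400 by default);

5. Each positive ROI needs a class: take the class of the GT box with the highest IoU (if a proposal overlaps several GT boxes, pick the one it overlaps most);

6. Each positive ROI needs its offsets (deltas) relative to its assigned GT box;

7. Each positive ROI needs the mask that belongs to its assigned GT box (the one with the highest IoU);

8. Return all results; the offsets and masks of negative samples are filled with zeros.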
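Steps 3 and 5 can be illustrated outside TensorFlow. The following is a minimal NumPy sketch, not the layer's actual code: `iou_matrix` is a hypothetical helper standing in for `overlaps_graph`, and the boxes and class IDs are made-up values in (y1, x1, y2, x2) format.

```python
import numpy as np

def iou_matrix(boxes_a, boxes_b):
    """Pairwise IoU for boxes in (y1, x1, y2, x2) format.

    Hypothetical stand-in for overlaps_graph; returns shape [A, B].
    """
    a = boxes_a[:, None, :]  # [A, 1, 4]
    b = boxes_b[None, :, :]  # [1, B, 4]
    # Height and width of the intersection, clipped at zero when boxes are disjoint.
    inter_h = np.maximum(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0.0)
    inter_w = np.maximum(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0.0)
    inter = inter_h * inter_w
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area_a + area_b - inter)

# Made-up example data.
proposals = np.array([
    [0.0, 0.0, 1.0, 1.0],  # identical to GT box 0
    [0.0, 0.0, 1.0, 0.5],  # left half of GT box 0
    [0.0, 0.0, 0.4, 0.4],  # small corner of GT box 0
    [2.0, 2.0, 3.0, 3.0],  # identical to GT box 1
])
gt_boxes = np.array([[0.0, 0.0, 1.0, 1.0], [2.0, 2.0, 3.0, 3.0]])
gt_class_ids = np.array([3, 7])

overlaps = iou_matrix(proposals, gt_boxes)   # [num_proposals, num_gt]
roi_iou_max = overlaps.max(axis=1)           # best IoU per proposal
positive = roi_iou_max >= 0.5                # step 3: positive samples
negative = roi_iou_max < 0.5                 # step 3: negative samples (no crowds here)

# Step 5: each positive ROI takes the class of the GT box it overlaps most.
assignment = overlaps[positive].argmax(axis=1)
assigned_class_ids = gt_class_ids[assignment]

print(positive.tolist())            # [True, True, False, True]
print(assigned_class_ids.tolist())  # [3, 3, 7]
```

Note that the comparison is always proposal against GT box, never proposal against proposal.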

# Remove zero padding
proposals, _ = trim_zeros_graph(proposals, name="trim_proposals")    # drop the rows that are only zero padding
gt_boxes, non_zeros = trim_zeros_graph(gt_boxes, name="trim_gt_boxes")
gt_class_ids = tf.boolean_mask(gt_class_ids, non_zeros,
                               name="trim_gt_class_ids")
gt_masks = tf.gather(gt_masks, tf.where(non_zeros)[:, 0], axis=2,
                     name="trim_gt_masks")

# Handle COCO crowds: set the crowd annotations aside
# A crowd box in COCO is a bounding box around several instances. Exclude
# them from training. A crowd box is given a negative class ID.
crowd_ix = tf.where(gt_class_ids < 0)[:, 0]
non_crowd_ix = tf.where(gt_class_ids > 0)[:, 0]
crowd_boxes = tf.gather(gt_boxes, crowd_ix)
crowd_masks = tf.gather(gt_masks, crowd_ix, axis=2)
gt_class_ids = tf.gather(gt_class_ids, non_crowd_ix)
gt_boxes = tf.gather(gt_boxes, non_crowd_ix)
gt_masks = tf.gather(gt_masks, non_crowd_ix, axis=2)

# Compute overlaps matrix [proposals, gt_boxes], i.e. the IoU values
overlaps = overlaps_graph(proposals, gt_boxes)

# Compute overlaps with crowd boxes [proposals, crowd_boxes]
crowd_overlaps = overlaps_graph(proposals, crowd_boxes)
crowd_iou_max = tf.reduce_max(crowd_overlaps, axis=1)
no_crowd_bool = (crowd_iou_max < 0.001)

# Determine positive and negative ROIs
roi_iou_max = tf.reduce_max(overlaps, axis=1)
# 1. Positive ROIs are those with >= 0.5 IoU with a GT box
positive_roi_bool = (roi_iou_max >= 0.5)          # the 0.5 threshold selects the positive samples
positive_indices = tf.where(positive_roi_bool)[:, 0]
# 2. Negative ROIs are those with < 0.5 with every GT box. Skip crowds.
negative_indices = tf.where(tf.logical_and(roi_iou_max < 0.5, no_crowd_bool))[:, 0]

# Subsample ROIs. Aim for 33% positive
# Positive ROIs
positive_count = int(config.TRAIN_ROIS_PER_IMAGE *
                     config.ROI_POSITIVE_RATIO)
positive_indices = tf.random_shuffle(positive_indices)[:positive_count]
positive_count = tf.shape(positive_indices)[0]
# Negative ROIs. Add enough to maintain positive:negative ratio.
# With 33% positives this is roughly two negatives per positive.
r = 1.0 / config.ROI_POSITIVE_RATIO
negative_count = tf.cast(r * tf.cast(positive_count, tf.float32), tf.int32) - positive_count
negative_indices = tf.random_shuffle(negative_indices)[:negative_count]
# Gather selected ROIs
positive_rois = tf.gather(proposals, positive_indices)
negative_rois = tf.gather(proposals, negative_indices)

# Assign positive ROIs to GT boxes: each ROI takes the GT box it overlaps most.
positive_overlaps = tf.gather(overlaps, positive_indices)
roi_gt_box_assignment = tf.argmax(positive_overlaps, axis=1)
roi_gt_boxes = tf.gather(gt_boxes, roi_gt_box_assignment)
roi_gt_class_ids = tf.gather(gt_class_ids, roi_gt_box_assignment)

# Compute bbox refinement for positive ROIs
deltas = utils.box_refinement_graph(positive_rois, roi_gt_boxes)
deltas /= config.BBOX_STD_DEV

# Assign positive ROIs to GT masks
# Permute masks to [N, height, width, 1]
transposed_masks = tf.expand_dims(tf.transpose(gt_masks, [2, 0, 1]), -1)
# Pick the right mask for each ROI (the mask of its highest-IoU GT box)
roi_masks = tf.gather(transposed_masks, roi_gt_box_assignment)

# Compute mask targets
boxes = positive_rois
if config.USE_MINI_MASK:
    # Transform ROI coordinates from normalized image space
    # to normalized mini-mask space.
    y1, x1, y2, x2 = tf.split(positive_rois, 4, axis=1)
    gt_y1, gt_x1, gt_y2, gt_x2 = tf.split(roi_gt_boxes, 4, axis=1)
    gt_h = gt_y2 - gt_y1
    gt_w = gt_x2 - gt_x1
    y1 = (y1 - gt_y1) / gt_h
    x1 = (x1 - gt_x1) / gt_w
    y2 = (y2 - gt_y1) / gt_h
    x2 = (x2 - gt_x1) / gt_w
    boxes = tf.concat([y1, x1, y2, x2], 1)
box_ids = tf.range(0, tf.shape(roi_masks)[0])
masks = tf.image.crop_and_resize(tf.cast(roi_masks, tf.float32), boxes,
                                 box_ids,
                                 config.MASK_SHAPE)
# Remove the extra dimension from masks.
masks = tf.squeeze(masks, axis=3)

# Threshold mask pixels at 0.5 to have GT masks be 0 or 1 to use with
# binary cross entropy loss.
masks = tf.round(masks)
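The subsampling arithmetic in the code above is easy to misread, so here is a small plain-Python sketch of it. The function name `sample_counts` is hypothetical; the default of 400 ROIs per image is the figure quoted in this post, and 0.33 comes from the "Aim for 33% positive" comment, so treat both as assumptions about the configuration.

```python
def sample_counts(num_positive_available, num_negative_available,
                  train_rois_per_image=400, roi_positive_ratio=0.33):
    """Mirror the positive/negative count logic of the subsampling step.

    Hypothetical helper; the defaults are assumptions, not verified config values.
    """
    # Cap on positives: a fixed fraction of the per-image ROI budget.
    max_positive = int(train_rois_per_image * roi_positive_ratio)
    positive_count = min(num_positive_available, max_positive)

    # Total ROIs implied by the positives actually found, minus the positives.
    r = 1.0 / roi_positive_ratio
    negative_target = int(r * positive_count) - positive_count
    negative_count = min(num_negative_available, negative_target)
    return positive_count, negative_count

# Only 10 positives were found, with plenty of negatives to choose from.
print(sample_counts(10, 1000))  # (10, 20)
```

With 10 positives the code keeps 20 negatives, so the total is about three times the positive count while the negatives are about twice it. If fewer negatives are available than the target, all of them are kept.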
