Keras Mask R-CNN Code Walkthrough (6): Detections

Latest recommended article published on 2021-10-11 12:01:41

AI剑客

Categories: AI keras

Article link: https://blog.csdn.net/qq_43258953/article/details/103481349

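DetectionLayer takes the ROIs, their class probabilities, the bounding-box refinements, and the image metadata, and outputs object detections relative to the original image. It reads the original image information from image_meta, including the size of the padded image and the position of the original image inside it. The code flow is: get each ROI's score, refine the box coordinates, remove background regions, filter by score threshold, apply NMS, and return the top-k boxes, which prepares for generating the mask information.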

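The inputs to DetectionLayer are the ROIs, the corresponding class probabilities and box refinements, and the input image information (image_meta). The final output is a set of object detections located on the original image, and the information about the original image comes from image_meta.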

# Detections
# output is [batch, num_detections, (y1, x1, y2, x2, class_id, score)] in
# normalized coordinates
detections = DetectionLayer(config, name="mrcnn_detection")(
    [rpn_rois, mrcnn_class, mrcnn_bbox, input_image_meta])
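
To make the output layout in the comment above concrete, here is a minimal NumPy sketch that splits one image's detections into boxes, class IDs, and scores. The helper name `split_detections` and the sample values are hypothetical. It also assumes that unused rows are zero-padded, so a class_id of 0 (background) marks padding.

```python
import numpy as np

def split_detections(detections):
    """Split one image's detections into boxes, class IDs and scores.

    detections: [num_detections, (y1, x1, y2, x2, class_id, score)]
    Assumption: padding rows are all zeros, so class_id == 0 marks padding.
    """
    detections = np.asarray(detections, dtype=np.float32)
    keep = detections[:, 4] > 0              # drop zero-padded rows
    valid = detections[keep]
    boxes = valid[:, :4]                     # normalized (y1, x1, y2, x2)
    class_ids = valid[:, 4].astype(np.int32)
    scores = valid[:, 5]
    return boxes, class_ids, scores

# Hypothetical sample: two real detections followed by one padding row
sample = np.array([
    [0.10, 0.20, 0.50, 0.60, 3, 0.98],
    [0.30, 0.30, 0.90, 0.80, 1, 0.75],
    [0.00, 0.00, 0.00, 0.00, 0, 0.00],
])
boxes, class_ids, scores = split_detections(sample)
```

Each row of the batch output would be handled this way, one image at a time.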

The input_image_meta input records the original information of each image as a [batch, n] matrix. n is fixed, and its value is defined in config.py:

# Image meta data length
# See compose_image_meta() for details
self.IMAGE_META_SIZE = 1 + 3 + 3 + 4 + 1 + self.NUM_CLASSES
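
The size formula can be read as one term per field. The sketch below packs the fields into a single vector in the order suggested by the formula and by the referenced compose_image_meta(). The function name `compose_meta_sketch`, the field order as written here, and all sample values are assumptions for illustration, not the library's code.

```python
import numpy as np

NUM_CLASSES = 3  # hypothetical: background + 2 object classes

def compose_meta_sketch(image_id, original_image_shape, image_shape,
                        window, scale, active_class_ids):
    """Pack image attributes into one 1-D array (layout follows IMAGE_META_SIZE)."""
    return np.array(
        [image_id]                      # 1 value
        + list(original_image_shape)    # 3 values: (H, W, C) before resizing
        + list(image_shape)             # 3 values: (H, W, C) after resizing/padding
        + list(window)                  # 4 values: (y1, x1, y2, x2) in pixels
        + [scale]                       # 1 value: resize factor
        + list(active_class_ids),       # NUM_CLASSES values
        dtype=np.float32,
    )

meta = compose_meta_sketch(
    image_id=7,
    original_image_shape=(300, 400, 3),
    image_shape=(512, 512, 3),
    window=(106, 56, 406, 456),
    scale=1.0,
    active_class_ids=np.ones(NUM_CLASSES),
)
image_meta_size = 1 + 3 + 3 + 4 + 1 + NUM_CLASSES
```

With this layout, the window occupies indices 7 to 10 of the vector, which is the slice a parser would need to read it back.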


if mode == "square":
    # Get new height and width
    h, w = image.shape[:2]
    top_pad = (max_dim - h) // 2
    bottom_pad = max_dim - h - top_pad
    left_pad = (max_dim - w) // 2
    right_pad = max_dim - w - left_pad
    padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
    image = np.pad(image, padding, mode='constant', constant_values=0)
    window = (top_pad, left_pad, h + top_pad, w + left_pad)

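We pad the dark-blue original image (w does not have to equal h) out to the larger light-gray image that is fed to the network. "window" records the coordinates of the original image's top-left and bottom-right corners in a coordinate system whose origin is the top-left corner of the new image. Because these are pixel coordinates, the height and width of "window" equal the size of the original image, so "window" marks the only region of the input image that carries meaningful content.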
[Figure: the dark-blue original image placed inside the light-gray padded image]
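
As a quick check of the padding arithmetic, the sketch below wraps the "square" branch in a standalone function and applies it to a small test image. The function name `pad_to_square` and the 300x400 sample image are hypothetical, and any rescaling that happens before padding is left out.

```python
import numpy as np

def pad_to_square(image, max_dim):
    """Zero-pad an [H, W, C] image to [max_dim, max_dim, C]; return image and window."""
    h, w = image.shape[:2]
    top_pad = (max_dim - h) // 2
    bottom_pad = max_dim - h - top_pad
    left_pad = (max_dim - w) // 2
    right_pad = max_dim - w - left_pad
    padding = [(top_pad, bottom_pad), (left_pad, right_pad), (0, 0)]
    padded = np.pad(image, padding, mode="constant", constant_values=0)
    window = (top_pad, left_pad, h + top_pad, w + left_pad)  # (y1, x1, y2, x2)
    return padded, window

# Hypothetical 300x400 RGB image filled with ones, padded to 512x512
image = np.ones((300, 400, 3), dtype=np.uint8)
padded, window = pad_to_square(image, max_dim=512)
y1, x1, y2, x2 = window

# Cropping the padded image with the window recovers the original pixels
recovered = padded[y1:y2, x1:x2]
```

Here the window comes out as (106, 56, 406, 456): 106 rows of padding above and below, 56 columns on each side, and a 300x400 region in the middle that holds the actual image.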
The function that parses the image_meta structure is:

def parse_image_meta_graph(meta):
    """Parses a tensor that contains image attributes to its components.
    See compose_image_meta() for more details.

    meta: [batch, meta length] where meta length depends on NUM_CLASSES

    Returns a dict of the parsed tensors.
    """
    image_id = meta[:, 0]
    original_image_shape = meta