RPN-2: Anchor Generator and RPN Head

Latest recommended article published on 2022-12-31 16:45:14

comea23

Tags: python, deep learning, convolution

Article link: https://blog.csdn.net/comea23/article/details/123906781

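Once the backbone has produced its output feature maps, two steps follow: 1) generate anchors and place them at every point of the feature map; 2) feed the feature maps into the RPN Head, which predicts an objectness score and box-regression parameters for each anchor at every location. These two steps correspond to the red box in the figure.
[Figure: the Anchor Generator and RPN Head stages are marked with a red box]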

    def forward(self, image_list, feature_maps):
        # type: (ImageList, List[Tensor]) -> List[Tensor]
        # Get the (height, width) of each predicted feature map
        grid_sizes = [feature_map.shape[-2:] for feature_map in feature_maps]

        # Get the height and width of the batched input images
        image_size = image_list.tensors.shape[-2:]

        # Get the dtype and device of the feature maps
        dtype, device = feature_maps[0].dtype, feature_maps[0].device

        # One step on a feature map corresponds to a stride of n pixels in the original image
        strides = [[torch.tensor(image_size[0] // g[0], dtype=torch.int64, device=device),
                    torch.tensor(image_size[1] // g[1], dtype=torch.int64, device=device)] for g in grid_sizes]

        # Build the anchor templates from the configured sizes and aspect_ratios
        self.set_cell_anchors(dtype, device)

        # Compute (or read from the cache) the coordinates of all anchors. These are anchors
        # mapped onto the original image, not the zero-centered templates.
        # The result is a list with one entry per predicted feature map.
        anchors_over_all_feature_maps = self.cached_grid_anchors(grid_sizes, strides)

        anchors = torch.jit.annotate(List[List[torch.Tensor]], [])
        # Iterate over every image in the batch
        for i, (image_height, image_width) in enumerate(image_list.image_sizes):
            anchors_in_image = []
            # Collect the image-space anchors of every predicted feature map
            for anchors_per_feature_map in anchors_over_all_feature_maps:
                anchors_in_image.append(anchors_per_feature_map)
            anchors.append(anchors_in_image)
        # Concatenate the anchors of all predicted feature maps for each image.
        # anchors is a list; each element holds all anchors of one image.
        anchors = [torch.cat(anchors_per_image) for anchors_per_image in anchors]
        # Clear the cache to avoid memory leaks.
        self._cache.clear()
        return anchors

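For the input feature maps, the code first reads the original image size and each feature map's size and computes the downsampling factor (stride) from them. It then builds the anchor templates from the preset anchor sizes and aspect ratios.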
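The stride computation can be sketched in plain Python. The function name `compute_strides` and the image and feature-map sizes below are hypothetical, chosen only to illustrate the integer division used in `forward`:

```python
def compute_strides(image_size, grid_sizes):
    """Return one (stride_h, stride_w) pair per feature map via integer division."""
    img_h, img_w = image_size
    return [(img_h // grid_h, img_w // grid_w) for grid_h, grid_w in grid_sizes]


# Hypothetical sizes: a padded 800x1216 batch and three feature maps.
image_size = (800, 1216)
grid_sizes = [(200, 304), (100, 152), (50, 76)]

strides = compute_strides(image_size, grid_sizes)
print(strides)  # [(4, 4), (8, 8), (16, 16)]
```

Each pair says how many original-image pixels one step on that feature map covers.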

    def generate_anchors(self, scales, aspect_ratios, dtype=torch.float32, device=torch.device("cpu")):
        # type: (List[int], List[float], torch.dtype, torch.device) -> Tensor
        """
        compute anchor sizes
        Arguments:
            scales: sqrt(anchor_area)
            aspect_ratios: h/w ratios
            dtype: float32
            device: cpu/gpu
        """
        scales = torch.as_tensor(scales, dtype=dtype, device=device)
        aspect_ratios = torch.as_tensor(aspect_ratios, dtype=dtype, device=device)
        h_ratios = torch.sqrt(aspect_ratios)
        w_ratios = 1.0 / h_ratios

        # [r1, r2, r3]' * [s1, s2, s3]
        # number of elements is len(ratios)*len(scales)
        ws = (w_ratios[:, None] * scales[None, :]).view(-1)
        hs = (h_ratios[:, None] * scales[None, :]).view(-1)

        # Top-left and bottom-right coordinates relative to the anchor center (0, 0).
        # All anchor templates are centered at (0, 0); shape [len(ratios)*len(scales), 4]
        base_anchors = torch.stack([-ws, -hs, ws, hs], dim=1) / 2

        return base_anchors.round()

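The anchor templates here are centered at (0, 0). For example, suppose the preset aspect ratios are (0.5, 1.0, 2.0) and the sizes are (32, 64, 128, 256, 512). Take size 32: at ratio 1.0, the anchor's top-left and bottom-right corners relative to the center (0, 0) are (-16, -16) and (16, 16), and its area is 32×32=1024. At ratio 0.5 (h/w = 0.5), the area constraint 0.5w×w=1024 gives w=45.25 and h=1024/45.25=22.63; halving and rounding gives corners (-23, -11) and (23, 11) in (x, y) order, so this anchor is wider than it is tall. Each point on the feature map is then mapped to original-image coordinates using the downsampling factor, and the mapped coordinates are added to the anchor templates to obtain the anchors' coordinates on the original image.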
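The arithmetic above, and the step that shifts templates onto the image, can be reproduced with a minimal pure-Python sketch. The helper names `make_templates` and `place_templates`, the 2×2 grid, and the stride of 16 are assumptions for illustration; the ordering (ratio-major templates, all templates per cell) is chosen to mirror the broadcasting in `generate_anchors`, not taken from the original text:

```python
import math


def make_templates(sizes, aspect_ratios):
    """Zero-centered templates as (x1, y1, x2, y2), ordered ratio-major."""
    templates = []
    for ratio in aspect_ratios:              # ratio = h / w
        h_ratio = math.sqrt(ratio)
        w_ratio = 1.0 / h_ratio
        for size in sizes:                   # size = sqrt(anchor area)
            w, h = w_ratio * size, h_ratio * size
            templates.append(tuple(round(v / 2) for v in (-w, -h, w, h)))
    return templates


def place_templates(templates, grid_size, stride):
    """Shift every template to every feature-map cell, in image coordinates."""
    grid_h, grid_w = grid_size
    stride_h, stride_w = stride
    anchors = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            shift_x, shift_y = gx * stride_w, gy * stride_h
            for x1, y1, x2, y2 in templates:
                anchors.append((x1 + shift_x, y1 + shift_y,
                                x2 + shift_x, y2 + shift_y))
    return anchors


templates = make_templates((32,), (0.5, 1.0, 2.0))
print(templates)
# [(-23, -11, 23, 11), (-16, -16, 16, 16), (-11, -23, 11, 23)]

# Hypothetical 2x2 feature map with a stride of 16 pixels.
anchors = place_templates(templates, grid_size=(2, 2), stride=(16, 16))
print(len(anchors))  # 12 = 2 * 2 cells * 3 templates
print(anchors[3])    # (-7, -11, 39, 11): first template shifted 16 px right
```

With all five sizes and three ratios, `make_templates` returns 15 templates, one per size and ratio combination.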
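The feature maps are then fed into the RPN Head, which contains three convolution layers. A 3×3 convolution is applied to the feature map first, leaving the number of channels unchanged. Its output is passed to both Obj_conv and Reg_conv, which are 1×1 convolutions in the torchvision implementation. Both preserve the spatial size, and their output channels are N and 4N respectively, where N is the number of anchors per location. In the example above, 3×5=15 anchors were designed, so N is 15.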
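A minimal PyTorch sketch of such a head follows, assuming the layout of torchvision's `RPNHead`: one shared 3×3 convolution followed by two 1×1 sibling convolutions. The class name `TinyRPNHead`, the attribute names, the ReLU after the shared convolution, and the input size are illustrative assumptions rather than the article's code:

```python
import torch
from torch import nn


class TinyRPNHead(nn.Module):
    """Illustrative head: a shared 3x3 conv, then two 1x1 sibling convs."""

    def __init__(self, in_channels, num_anchors):
        super().__init__()
        # padding=1 keeps height and width unchanged; channels stay the same
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1)
        # One objectness score per anchor at each location
        self.obj_conv = nn.Conv2d(in_channels, num_anchors, kernel_size=1)
        # Four regression parameters per anchor at each location
        self.reg_conv = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=1)

    def forward(self, feature_map):
        hidden = torch.relu(self.conv(feature_map))
        return self.obj_conv(hidden), self.reg_conv(hidden)


head = TinyRPNHead(in_channels=256, num_anchors=15)
feature_map = torch.randn(1, 256, 25, 38)   # hypothetical backbone output
objectness, box_deltas = head(feature_map)
print(objectness.shape)  # torch.Size([1, 15, 25, 38])
print(box_deltas.shape)  # torch.Size([1, 60, 25, 38])
```

The spatial size of both outputs matches the input feature map, while the channel counts are N=15 and 4N=60.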
