anchor_size, aspect_ratio, anchors, and a walkthrough of AnchorGenerator in torchvision

RessCris

Last modified 2023-04-06 20:51:43

First published 2023-02-13 14:43:24

Article link: https://blog.csdn.net/weixin_41783424/article/details/128953564

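Basic definitions

Object detection algorithms often describe how anchors are generated. When objects of multiple sizes must be detected, anchors of different sizes are needed as proposals.

The relationship between them is:
[Figure: image not recoverable]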

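The transformed width_b and height_b satisfy the relation below. Note that torchvision's generate_anchors scales the height by sqrt(aspect_ratio) and the width by 1/sqrt(aspect_ratio), so aspect_ratio acts on height/width, not width/height, and the area is unchanged:
$height\_b / width\_b = (height / width) \cdot aspect\_ratio$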
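As a small worked example, the sketch below computes anchor widths and heights for a square base box. It assumes the convention used in torchvision's generate_anchors, where aspect_ratio is height/width and the area stays fixed. The helper name anchor_wh is hypothetical and is not part of torchvision.

```python
import math


def anchor_wh(size, aspect_ratio):
    """Return (width, height) for a box with area size**2.

    Assumes aspect_ratio == height / width, as in torchvision's
    generate_anchors. The function name is illustrative only.
    """
    height = size * math.sqrt(aspect_ratio)
    width = size / math.sqrt(aspect_ratio)
    return width, height


if __name__ == "__main__":
    for ratio in (0.5, 1.0, 2.0):
        w, h = anchor_wh(128, ratio)
        # Print the ratio, the resulting box, and the preserved area.
        print(ratio, round(w, 2), round(h, 2), round(w * h))
```

A ratio below 1 gives a wide box, a ratio above 1 gives a tall box, and all three boxes cover the same area as the 128 x 128 base.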

AnchorGenerator

torchvision.models.detection.anchor_utils.py
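The main job of this module (through its forward function) is to generate anchors in original-image coordinates, spaced at the sampling stride implied by the given feature_map. Note that the feature_map serves only as a reference for that stride.
For example, if the original image is (800, 800) and the feature_map has H/W of 25/25, the sampling stride is 800 / 25 = 32.
The x/y coordinates of the anchor centers are then 0, 32, 64, …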
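The arithmetic above can be sketched in plain Python. The helper name anchor_centers is hypothetical; it assumes one image dimension at a time and uses integer division for the stride, as the forward excerpt later in this article does.

```python
def anchor_centers(image_size, grid_size):
    """Return the stride and the anchor-center coordinates along one axis.

    image_size and grid_size are the image and feature-map extents along
    that axis. The function name is illustrative, not a torchvision API.
    """
    stride = image_size // grid_size
    centers = [i * stride for i in range(grid_size)]
    return stride, centers


if __name__ == "__main__":
    stride, xs = anchor_centers(800, 25)
    print(stride)          # 32
    print(xs[:3], xs[-1])  # [0, 32, 64] 768
```

The last center sits at 768, not 800, because the grid has 25 points starting from 0.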

First, the basic structure of the AnchorGenerator class:
[Figure: image not recoverable]

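Initialization

Input parameters:

sizes: the base sizes of the generated anchors
aspect_ratios: the height/width ratio of each anchor. In generate_anchors, h_ratios = sqrt(aspect_ratios) and w_ratios = 1 / h_ratios, so height / width = aspect_ratio.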

def __init__(
    self,
    sizes=((128, 256, 512),),
    aspect_ratios=((0.5, 1.0, 2.0),),
):
    super(AnchorGenerator, self).__init__()

    if not isinstance(sizes[0], (list, tuple)):
        # TODO change this
        sizes = tuple((s,) for s in sizes)
    if not isinstance(aspect_ratios[0], (list, tuple)):
        aspect_ratios = (aspect_ratios,) * len(sizes)

    assert len(sizes) == len(aspect_ratios)

    self.sizes = sizes
    self.aspect_ratios = aspect_ratios
    self.cell_anchors = [self.generate_anchors(size, aspect_ratio)
                         for size, aspect_ratio in zip(sizes, aspect_ratios)]
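To see what the two isinstance branches do, the sketch below copies that normalization logic into a standalone function. The name normalize_args is hypothetical; it only mirrors the excerpt above and is not a torchvision function.

```python
def normalize_args(sizes, aspect_ratios):
    """Mirror the argument normalization in the __init__ excerpt.

    Flat sizes become one tuple per feature-map level, and a flat
    aspect_ratios tuple is repeated for every level.
    """
    if not isinstance(sizes[0], (list, tuple)):
        sizes = tuple((s,) for s in sizes)
    if not isinstance(aspect_ratios[0], (list, tuple)):
        aspect_ratios = (aspect_ratios,) * len(sizes)
    assert len(sizes) == len(aspect_ratios)
    return sizes, aspect_ratios


if __name__ == "__main__":
    sizes, ratios = normalize_args((32, 64, 128), (0.5, 1.0, 2.0))
    print(sizes)   # ((32,), (64,), (128,))
    print(ratios)  # ((0.5, 1.0, 2.0), (0.5, 1.0, 2.0), (0.5, 1.0, 2.0))
    # Anchors per location on each level: len(sizes[i]) * len(ratios[i]).
    print([len(s) * len(r) for s, r in zip(sizes, ratios)])  # [3, 3, 3]
```

With the defaults, sizes and aspect_ratios are already nested, so there is a single level with 3 * 3 = 9 anchors per location.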

generate_anchors

Generates all zero-centered anchors,
that is, the anchors whose center is (0, 0).

def generate_anchors(self, scales: List[int], aspect_ratios: List[float], dtype: torch.dtype = torch.float32,
                     device: torch.device = torch.device("cpu")):
    scales = torch.as_tensor(scales, dtype=dtype, device=device)
    aspect_ratios = torch.as_tensor(aspect_ratios, dtype=dtype, device=device)
    h_ratios = torch.sqrt(aspect_ratios)
    w_ratios = 1 / h_ratios

    ws = (w_ratios[:, None] * scales[None, :]).view(-1)
    hs = (h_ratios[:, None] * scales[None, :]).view(-1)

    base_anchors = torch.stack([-ws, -hs, ws, hs], dim=1) / 2
    return base_anchors.round()
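The same computation can be traced by hand with a plain-Python mirror of the excerpt above. This is a sketch only: the function name zero_centered_anchors is hypothetical, it uses Python floats instead of float32 tensors, and it relies on Python's round, which like torch.round rounds halves to even.

```python
import math


def zero_centered_anchors(scales, aspect_ratios):
    """Plain-Python mirror of generate_anchors (illustrative only).

    Returns [x1, y1, x2, y2] boxes centered at (0, 0), ordered by
    aspect ratio first and scale second, like the broadcast in the
    torch version.
    """
    anchors = []
    for ratio in aspect_ratios:
        h_ratio = math.sqrt(ratio)
        w_ratio = 1 / h_ratio
        for scale in scales:
            w = w_ratio * scale
            h = h_ratio * scale
            anchors.append([round(-w / 2), round(-h / 2),
                            round(w / 2), round(h / 2)])
    return anchors


if __name__ == "__main__":
    for box in zero_centered_anchors((128,), (0.5, 1.0, 2.0)):
        print(box)
    # [-91, -45, 91, 45]
    # [-64, -64, 64, 64]
    # [-45, -91, 45, 91]
```

The first box (ratio 0.5) is wider than it is tall, which confirms that aspect_ratio here is height/width.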

grid_anchors

Generates anchors at every center point that needs them. In other words, it places the anchors produced by generate_anchors at each point.

# torchvision.models.detection.anchor_utils.py  --> AnchorGenerator   --> grid_anchors
# For every (base anchor, output anchor) pair,
# offset each zero-centered base anchor by the center of the output anchor.
anchors.append(
    (shifts.view(-1, 1, 4) + base_anchors.view(1, -1, 4)).reshape(-1, 4)
)
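The broadcast in that snippet adds every shift to every base anchor. A loop-based sketch of the same idea is shown below. The function name place_anchors is hypothetical, and the row-major order (y outer, x inner) is an assumption about how the shifts are laid out.

```python
def place_anchors(grid_h, grid_w, stride_h, stride_w, base_anchors):
    """Shift each zero-centered base anchor to every grid point.

    Loop-based stand-in for the (shifts + base_anchors) broadcast;
    the name and the y-outer, x-inner ordering are assumptions.
    """
    anchors = []
    for iy in range(grid_h):
        for ix in range(grid_w):
            sx, sy = ix * stride_w, iy * stride_h
            for x1, y1, x2, y2 in base_anchors:
                anchors.append([x1 + sx, y1 + sy, x2 + sx, y2 + sy])
    return anchors


if __name__ == "__main__":
    base = [[-64, -64, 64, 64]]
    for box in place_anchors(2, 2, 32, 32, base):
        print(box)
    # [-64, -64, 64, 64]
    # [-32, -64, 96, 64]
    # [-64, -32, 64, 96]
    # [-32, -32, 96, 96]
```

Anchors near the border extend outside the image (negative coordinates); nothing in this step clips them.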

forward

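Given the image and its feature_maps, compute the points at which anchors are generated.

grid_sizes is the H, W of each feature_map.
stride is the ratio between the H, W of the image and of the feature_map.
Multiplying each grid index in grid_sizes by the stride gives the point in the original image at which anchors are generated.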

def forward(self, image_list: ImageList, feature_maps: List[Tensor]) -> List[Tensor]:
    grid_sizes = [feature_map.shape[-2:] for feature_map in feature_maps]
    image_size = image_list.tensors.shape[-2:]
    ...
    strides = [[torch.tensor(image_size[0] // g[0], dtype=torch.int64, device=device),
                torch.tensor(image_size[1] // g[1], dtype=torch.int64, device=device)] for g in grid_sizes]
    ...
    anchors_over_all_feature_maps = self.grid_anchors(grid_sizes, strides)
    ...  # elided: the per-level anchors are concatenated into one tensor per image
    return anchors
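To make the bookkeeping in forward concrete, the sketch below computes the per-level strides and the total number of anchors per image. The helper name anchor_summary is hypothetical; it assumes anchors are generated at every grid point of every level and that the stride uses integer division as in the excerpt.

```python
def anchor_summary(image_size, grid_sizes, anchors_per_location):
    """Return per-level strides and the total anchor count for one image.

    image_size is (H, W); grid_sizes is a list of (H, W) per feature map;
    anchors_per_location lists how many base anchors each level has.
    Illustrative helper, not a torchvision function.
    """
    strides = [(image_size[0] // gh, image_size[1] // gw)
               for gh, gw in grid_sizes]
    total = sum(gh * gw * a
                for (gh, gw), a in zip(grid_sizes, anchors_per_location))
    return strides, total


if __name__ == "__main__":
    # Default AnchorGenerator: one level with 3 sizes x 3 ratios = 9 anchors.
    print(anchor_summary((800, 800), [(25, 25)], [9]))
    # ([(32, 32)], 5625)
```

For an 800 x 800 image and a 25 x 25 feature map, the result is 25 * 25 * 9 = 5625 anchors.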

Testing a call to AnchorGenerator

import numpy as np
import torch
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.models.detection.image_list import ImageList

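if __name__ == "__main__":
    # Generate the desired anchors for every grid point of an image.

    # Load an existing image, stored as an (H, W, C) array.
    img = np.load("/root/autodl-tmp/datasets/pannuke/hvi_format/test_sample/0.npy")
    # Keep the first three channels, move them to the front, and divide by 255.
    image = np.transpose(img[..., :3], (2, 0, 1)) / 255
    image = torch.tensor(image, dtype=torch.float32)
    image_size = tuple(image.shape[-2:])

    # ImageList expects a batched tensor and one (H, W) tuple per image.
    im = ImageList(image.unsqueeze(0), [image_size])
    # A single feature map of shape (N, C, H, W); its H, W set the anchor grid.
    feature_maps = [torch.rand(1, 1280, 25, 25)]

    anchor_gen = AnchorGenerator()
    anchors = anchor_gen(im, feature_maps)
    # One tensor per image, each of shape (25 * 25 * 9, 4).
    print(len(anchors), anchors[0].shape)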