FCOS Paper and Source Code Explained (Part 2)


In FCOS Paper and Source Code Explained (Part 1), the paper's description of the FCOS architecture was excerpted and roughly translated; this part walks through the FCOS source code.

The FCOS Project

The FCOS project repository.
The README's notes on model training are as follows:
Training
The following command line will train FCOS_imprv_R_50_FPN_1x on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):

python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --master_port=$((RANDOM + 10000)) \
    tools/train_net.py \
    --config-file configs/fcos/fcos_imprv_R_50_FPN_1x.yaml \
    DATALOADER.NUM_WORKERS 2 \
    OUTPUT_DIR training_dir/fcos_imprv_R_50_FPN_1x
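If fewer GPUs are available, --nproc_per_node can be reduced accordingly; note that in this codebase SOLVER.IMS_PER_BATCH is the global batch size across all GPUs, so it (and, by the usual linear scaling rule, the learning rate) may need a matching adjustment.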

The key step is the invocation of tools/train_net.py.
Locate this file in the repository; reading the code starts from train_net.py.

The FCOS Code

tools/train_net.py

The key line in main() points to the train() function:

model = train(cfg, args.local_rank, args.distributed)

At the beginning of train(), the build_detection_model() function is called:

model = build_detection_model(cfg)

build_detection_model() is imported from fcos_core.modeling.detector; tracing through leads to fcos_core.modeling.detector.generalized_rcnn.GeneralizedRCNN. This class inherits from torch.nn.Module and instantiates three member modules:

self.backbone = build_backbone(cfg)
self.rpn = build_rpn(cfg, self.backbone.out_channels)
self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)

build_backbone() is imported from fcos_core.modeling.backbone.
build_rpn() is imported from fcos_core.modeling.rpn.rpn.
build_roi_heads() is imported from fcos_core.modeling.roi_heads.roi_heads.
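Before examining each builder, it helps to see how the three modules fit together. The forward pass of GeneralizedRCNN is roughly as follows (a simplified sketch of generalized_rcnn.py; in FCOS the rpn slot actually holds the FCOS head returned by build_rpn(), and roi_heads is empty, so detections come straight from the head):

def forward(self, images, targets=None):
    images = to_image_list(images)              # batch images into one padded tensor
    features = self.backbone(images.tensors)    # backbone (+FPN) feature maps
    proposals, proposal_losses = self.rpn(images, features, targets)
    if self.roi_heads:
        x, result, detector_losses = self.roi_heads(features, proposals, targets)
    else:
        result = proposals                      # FCOS: the head's output is the final result
        detector_losses = {}
    if self.training:
        losses = {}
        losses.update(detector_losses)
        losses.update(proposal_losses)
        return losses
    return result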

build_backbone()

First, consider build_backbone().
Its key line:

return registry.BACKBONES[cfg.MODEL.BACKBONE.CONV_BODY](cfg)

In fcos_core/config/defaults.py one finds cfg.MODEL.BACKBONE.CONV_BODY → _C.MODEL.BACKBONE.CONV_BODY = "R-50-C4".
registry is imported from fcos_core.modeling and traces back to fcos_core.utils.registry.Registry.
The Registry class carries the following docstring:

A helper class for managing registering modules, it extends a dictionary
    and provides a register function.

    E.g. creating a registry:
        some_registry = Registry({"default": default_module})

    There are two ways of registering new modules:
    1): normal way is just calling register function:
        def foo():
            ...
        some_registry.register("foo_module", foo)
    2): used as decorator when declaring the module:
        @some_registry.register("foo_module")
        @some_registry.register("foo_module_nickname")
        def foo():
            ...

    Access of module is just like using a dictionary, e.g.:
        f = some_registry["foo_module"]
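The implementation behind this docstring is short; the following sketch captures its essence (a dict subclass whose register() works both as a direct call and as a decorator; the actual code in fcos_core/utils/registry.py is essentially the same):

def _register_generic(module_dict, module_name, module):
    assert module_name not in module_dict   # no silent overwriting
    module_dict[module_name] = module

class Registry(dict):
    def register(self, module_name, module=None):
        # used as a plain call: some_registry.register("foo_module", foo)
        if module is not None:
            _register_generic(self, module_name, module)
            return
        # used as a decorator: @some_registry.register("foo_module")
        def register_fn(fn):
            _register_generic(self, module_name, fn)
            return fn
        return register_fn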

Just above build_backbone(), one finds:

@registry.BACKBONES.register("R-50-C4")
@registry.BACKBONES.register("R-50-C5")
@registry.BACKBONES.register("R-101-C4")
@registry.BACKBONES.register("R-101-C5")
def build_resnet_backbone(cfg):
    body = resnet.ResNet(cfg)
    model = nn.Sequential(OrderedDict([("body", body)]))
    model.out_channels = cfg.MODEL.RESNETS.BACKBONE_OUT_CHANNELS
    return model

Hence build_backbone() resolves to build_resnet_backbone().
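With the default CONV_BODY = "R-50-C4", registry.BACKBONES["R-50-C4"] looks up build_resnet_backbone, which wraps the ResNet body in an nn.Sequential and records the backbone's output channel count in model.out_channels — the value passed to build_rpn() and build_roi_heads() earlier.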

resnet.ResNet

That is, fcos_core.modeling.backbone.resnet.ResNet.
Look first at ResNet's __init__() method:

class ResNet(nn.Module):
    def __init__(self, cfg):
        super(ResNet, self).__init__()

        # If we want to use the cfg in forward(), then we should make a copy
        # of it and store it for later use:
        # self.cfg = cfg.clone()

        # Translate string names to implementations
        stem_module = _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
        stage_specs = _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
        transformation_module = _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]

It first resolves stem_module, stage_specs, and transformation_module. All three lookup tables (_STEM_MODULES, _STAGE_SPECS, _TRANSFORMATION_MODULES) are Registry instances defined near the bottom of resnet.py, mapping config strings to implementations; each is examined in turn below.

stem_module
→ _STEM_MODULES[cfg.MODEL.RESNETS.STEM_FUNC]
→ StemWithFixedBatchNorm (a BaseStem subclass that passes norm_func=FrozenBatchNorm2d)
→ BaseStem

class BaseStem(nn.Module):
    def __init__(self, cfg, norm_func):
        super(BaseStem, self).__init__()

        out_channels = cfg.MODEL.RESNETS.STEM_OUT_CHANNELS # 64

        self.conv1 = Conv2d(
            3, out_channels, kernel_size=7, stride=2, padding=3, bias=False
        )
        self.bn1 = norm_func(out_channels)

        for l in [self.conv1,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu_(x)
        x = F.max_pool2d(x, kernel_size=3, stride=2, padding=1)
        return x
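In short, the stem is a stride-2 7×7 convolution, frozen batch norm, ReLU, and a stride-2 3×3 max pool: its output has STEM_OUT_CHANNELS (64) channels at 1/4 of the input resolution.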

Conv2d traces back to fcos_core.layers.misc.Conv2d:

class Conv2d(torch.nn.Conv2d):
    def forward(self, x):
        if x.numel() > 0:
            return super(Conv2d, self).forward(x)
        # get output shape

        output_shape = [
            (i + 2 * p - (di * (k - 1) + 1)) // d + 1
            for i, p, di, k, d in zip(
                x.shape[-2:], self.padding, self.dilation, self.kernel_size, self.stride
            )
        ]
        output_shape = [x.shape[0], self.weight.shape[0]] + output_shape
        return _NewEmptyTensorOp.apply(x, output_shape)
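This override matters only for inputs with zero elements (e.g. an empty set of proposals): instead of crashing, it computes the would-be output shape analytically and returns an empty tensor of that shape. The comprehension implements the standard convolution arithmetic out = (in + 2*pad - (dilation*(kernel-1) + 1)) // stride + 1 per spatial dimension. A quick check that this agrees with PyTorch's own shape rule:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1, dilation=2)
x = torch.randn(1, 3, 33, 47)
out = conv(x)
expected = [
    (i + 2 * p - (di * (k - 1) + 1)) // s + 1
    for i, p, di, k, s in zip(
        x.shape[-2:], conv.padding, conv.dilation, conv.kernel_size, conv.stride
    )
]
assert list(out.shape[-2:]) == expected  # matches PyTorch's computed shape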

FrozenBatchNorm2d traces back to fcos_core.layers.batch_norm.FrozenBatchNorm2d:

class FrozenBatchNorm2d(nn.Module):
    def __init__(self, n):
        super(FrozenBatchNorm2d, self).__init__()
        self.register_buffer("weight", torch.ones(n))
        self.register_buffer("bias", torch.zeros(n))
        self.register_buffer("running_mean", torch.zeros(n))
        self.register_buffer("running_var", torch.ones(n))

    def forward(self, x):
        scale = self.weight * self.running_var.rsqrt()
        bias = self.bias - self.running_mean * scale
        scale = scale.reshape(1, -1, 1, 1)
        bias = bias.reshape(1, -1, 1, 1)
        return x * scale + bias

register_buffer is a method of torch.nn.Module: "This is typically used to register a buffer that should not be considered a model parameter." Here weight, bias, running_mean, and running_var are thus fixed statistics rather than trainable parameters.
rsqrt() "returns a new tensor with the reciprocal of the square-root of each of the elements of input".
forward() therefore applies batch normalization with frozen statistics, folded into a single per-channel affine transform (scale and bias).
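As a sanity check, the sketch below (assuming eps can be neglected, which FrozenBatchNorm2d above indeed omits) confirms that this affine form matches nn.BatchNorm2d in eval mode:

import torch
import torch.nn as nn

n = 8
bn = nn.BatchNorm2d(n, eps=0.0).eval()  # eval(): use running stats, not batch stats
bn.running_mean.normal_()
bn.running_var.uniform_(0.5, 2.0)       # keep variances positive
bn.weight.data.normal_()
bn.bias.data.normal_()

x = torch.randn(2, n, 4, 4)
scale = bn.weight * bn.running_var.rsqrt()
bias = bn.bias - bn.running_mean * scale
out = x * scale.reshape(1, -1, 1, 1) + bias.reshape(1, -1, 1, 1)
assert torch.allclose(out, bn(x), atol=1e-5)  # same result as standard BN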

stage_specs
→ _STAGE_SPECS[cfg.MODEL.BACKBONE.CONV_BODY]
→ ResNet50StagesTo4

ResNet50StagesTo4 = tuple(
    StageSpec(index=i, block_count=c, return_features=r)
    for (i, c, r) in ((1, 3, False), (2, 4, False), (3, 6, True))
)

Just above it one finds the StageSpec definition. stage_specs thus specifies each stage's parameters: index (the stage number), block_count (the number of residual blocks in the stage), and return_features (whether the stage's final feature map is returned).

StageSpec = namedtuple(
    "StageSpec",
    [
        "index",  # Index of the stage, eg 1, 2, ..,. 5
        "block_count",  # Number of residual blocks in the stage
        "return_features",  # True => return the last feature map from this stage
    ],
)
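For the default "R-50-C4", ResNet50StagesTo4 therefore builds stages 1–3, i.e. conv2_x–conv4_x of ResNet-50 with 3, 4, and 6 residual blocks respectively; only the last of them (C4) returns its feature map, which is what the "C4" in the name refers to.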

transformation_module
→ _TRANSFORMATION_MODULES[cfg.MODEL.RESNETS.TRANS_FUNC]
→ BottleneckWithFixedBatchNorm
(with defaults num_groups=1, stride_in_1x1=True, stride=1, dilation=1, and an empty dcn_config, i.e. DCN disabled)
→ Bottleneck, with norm_func=FrozenBatchNorm2d

class Bottleneck(nn.Module):
    def __init__(
        self,
        # omit
    ):
        super(Bottleneck, self).__init__()
		
        # downsample: when the input and output channel counts differ,
        # add a 1x1 conv + norm on the shortcut path
        self.downsample = None
        if in_channels != out_channels:
            down_stride = stride if dilation == 1 else 1 # dilation = 1
            self.downsample = nn.Sequential(
                Conv2d(
                    in_channels, out_channels,
                    kernel_size=1, stride=down_stride, bias=False
                ),
                norm_func(out_channels),
            )
            for modules in [self.downsample,]:
                for l in modules.modules():
                    if isinstance(l, Conv2d):
                        nn.init.kaiming_uniform_(l.weight, a=1)

        if dilation > 1:
            stride = 1 # reset to be 1
        stride_1x1, stride_3x3 = (stride, 1) if stride_in_1x1 else (1, stride)
		
        # first conv layer (1x1 bottleneck reduction)
        self.conv1 = Conv2d(
            in_channels,
            bottleneck_channels,
            kernel_size=1,
            stride=stride_1x1,
            bias=False,
        )
        self.bn1 = norm_func(bottleneck_channels)

        # second and third conv layers
        with_dcn = dcn_config.get("stage_with_dcn", False)
        if with_dcn:
            ...  # deformable-convolution (DCN) branch omitted
        else:
            self.conv2 = Conv2d(
                bottleneck_channels,
                bottleneck_channels,
                kernel_size=3,
                stride=stride_3x3,
                padding=dilation,
                bias=False,
                groups=num_groups,
                dilation=dilation
            )
            nn.init.kaiming_uniform_(self.conv2.weight, a=1)

        self.bn2 = norm_func(bottleneck_channels)

        self.conv3 = Conv2d(
            bottleneck_channels, out_channels, kernel_size=1, bias=False
        )
        self.bn3 = norm_func(out_channels)

        for l in [self.conv1, self.conv3,]:
            nn.init.kaiming_uniform_(l.weight, a=1)

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu_(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = F.relu_(out)

        out0 = self.conv3(out)
        out = self.bn3(out0)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity  # skip (residual) connection
        out = F.relu_(out)

        return out
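To make the data flow concrete, here is a hypothetical usage sketch (it assumes the FCOS repo is on PYTHONPATH; the channel sizes are chosen for illustration to match the first block of ResNet stage 2, and dcn_config={} disables DCN):

import torch
from fcos_core.modeling.backbone.resnet import BottleneckWithFixedBatchNorm

# in_channels != out_channels, so the 1x1 downsample path is created
block = BottleneckWithFixedBatchNorm(
    in_channels=64, bottleneck_channels=64, out_channels=256,
    stride=1, dcn_config={},
)
x = torch.randn(2, 64, 56, 56)
out = block(x)
assert out.shape == (2, 256, 56, 56)  # stride=1: spatial size preserved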