Yolov3 (Mxnet): Modifying the Detection Layers

Latest recommended article published 2023-05-16 10:14:22

snow6666667

Tags: Yolov3, Gluoncv, Mxnet

Article link: https://blog.csdn.net/bingpoyinhui/article/details/102814989

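We previously showed how to use the classification models in Gluoncv's model_zoo (Mxnet) as feature extractors, so that the base network of Yolov3 can be swapped quickly. Here we continue with how to modify Yolov3's detection layers, feature transition layers, and output layers.

Yolov3 has been around for a while. Its clean network design is a pleasure to read, but that simplicity also means it uses few deep learning tricks (it does use residual connections and FPN). Meanwhile, all kinds of operations keep appearing in deep learning: Inception structures, SE blocks, dilated convolution (Dilated Conv), deformable convolution (Deformable Conv), group convolution (Group Conv), depthwise separable convolution (DW Conv), Channel Shuffle, and so on. Some of these already existed when Yolov3 was proposed, but they were left out of the model for the sake of efficiency and simplicity.

Adding these operations will not necessarily improve mAP, and some of them may conflict with Yolov3's design philosophy. Take Inception blocks with several receptive fields: Yolov3 is designed so that different feature layers detect objects of different sizes, so a single feature layer that extracts fairly complete features for both large and small objects is not necessarily good for the model. Still, speaking for ourselves, hacking Yolov3's plain convolutions and then training on VOC for fun remains an interesting exercise.

First, here is a figure from an article introducing the Yolov3 network: https://blog.csdn.net/qq_37541097/article/details/81214953

As the figure shows, unless you want to accidentally design a Yolov4, there are not many convolution layers left to change. There are four places: the Conv Set (1-3-1-3-1) after each feature map, the 3x3 convolution right after it, the 1x1 output layer, and the 1x1 convolution before upsample. Let's first locate these four places in Gluoncv.

Gluoncv model source code: https://github.com/dmlc/gluon-cv

The Yolov3 model is in gluoncv/model_zoo/yolo/yolo3.py.

1. Output layer 1x1 convolution

This is the only operation with learnable parameters in the YOLOOutputV3 class, and there is not much room to change it: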

self.prediction = nn.Conv2D(all_pred, kernel_size=1, padding=0, strides=1)
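For reference, the channel count all_pred follows from how many anchors and classes each output layer handles. The sketch below assumes the usual Yolov3 layout of one objectness score, four box values, and one score per class for every anchor. The helper name yolo_output_channels is made up for illustration and is not part of Gluoncv.

```python
def yolo_output_channels(num_classes, num_anchors=3):
    """Channels of the 1x1 prediction conv: (objectness + 4 box values + classes) per anchor."""
    # Assumed layout: 1 objectness + 4 box coordinates + one score per class
    num_pred = 1 + 4 + num_classes
    return num_pred * num_anchors


print(yolo_output_channels(20))  # VOC has 20 classes
print(yolo_output_channels(80))  # COCO has 80 classes
```

Since this width is fixed by the dataset and the anchors, the layer offers little freedom beyond its kernel size.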

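2. Conv Set (1-3-1-3-1) and the 3x3 convolution right after it

This is probably where there is the most room for modification, especially the three 3x3 convolution layers. The layers are added in YOLODetectionBlockV3, partly in a loop: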

for _ in range(2):
    # 1x1 reduce
    self.body.add(_conv2d(channel, 1, 0, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
    # 3x3 expand
    self.body.add(_conv2d(channel * 2, 3, 1, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
# final 1x1 reduce, completing the 1-3-1-3-1 Conv Set
self.body.add(_conv2d(channel, 1, 0, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))
# the 3x3 convolution right after the Conv Set
self.tip = _conv2d(channel * 2, 3, 1, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs)
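To see why the 3x3 layers are the main target, it helps to count weights. The sketch below walks the 1-3-1-3-1 Conv Set plus the 3x3 tip and sums the bias-free convolution weights by kernel size. The values in_channels=1024 and channel=512 are assumed for illustration (roughly the deepest stage of a Darknet-53 based model), and the helper names are hypothetical.

```python
def conv_weights(in_channels, out_channels, kernel):
    """Weight count of a bias-free convolution."""
    return kernel * kernel * in_channels * out_channels


def detection_block_weights(in_channels, channel):
    """Return (weights in 1x1 layers, weights in 3x3 layers) for Conv Set + tip."""
    # (out_channels, kernel) in order: 1-3-1-3-1, then the 3x3 tip
    layers = [(channel, 1), (channel * 2, 3)] * 2 + [(channel, 1), (channel * 2, 3)]
    totals = {1: 0, 3: 0}
    for out_channels, kernel in layers:
        totals[kernel] += conv_weights(in_channels, out_channels, kernel)
        in_channels = out_channels  # the next layer consumes this layer's output
    return totals[1], totals[3]


w1, w3 = detection_block_weights(in_channels=1024, channel=512)
print(w1, w3, w3 / (w1 + w3))
```

Under these assumptions the three 3x3 layers hold about nine tenths of the block's convolution weights, so changes there have the largest effect on model size.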

3. The 1x1 convolution before upsample

In the YOLOV3 class:

self.transitions = nn.HybridSequential()
# ... later, inside the loop over stages (i is the stage index):
if i > 0:
    self.transitions.add(_conv2d(channel, 1, 0, 1, norm_layer=norm_layer, norm_kwargs=norm_kwargs))

The _conv2d used in all of the places above comes from darknet in model_zoo. It is defined as follows, as Conv+BN+LeakyReLU:

def _conv2d(channel, kernel, padding, stride, norm_layer=BatchNorm, norm_kwargs=None):
    """A common conv-bn-leakyrelu cell"""
    cell = nn.HybridSequential(prefix='')
    cell.add(nn.Conv2D(channel, kernel_size=kernel, strides=stride, padding=padding, use_bias=False))
    cell.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
    cell.add(nn.LeakyReLU(0.1))
    return cell

We can wrap different _conv2d_xxx variants as we like and call them in place of the plain _conv2d. Let's start modifying:
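One way to organize such swaps is to pass the cell builder in as an argument instead of editing every call site. The sketch below is only an illustration of that idea: plain_cell and deformable_cell are hypothetical stand-ins that return description tuples rather than Gluon blocks, and build_detection_block is not a Gluoncv function.

```python
def plain_cell(channel, kernel, padding, stride):
    """Hypothetical stand-in for _conv2d; returns a description, not a Gluon block."""
    return ("conv", channel, kernel)


def deformable_cell(channel, kernel, padding, stride):
    """Hypothetical stand-in for a _conv2d_xxx variant."""
    return ("deformable", channel, kernel)


def build_detection_block(channel, conv1x1=plain_cell, conv3x3=plain_cell):
    """Mirror the 1-3-1-3-1 Conv Set plus 3x3 tip, with pluggable cell builders."""
    body = []
    for _ in range(2):
        body.append(conv1x1(channel, 1, 0, 1))      # 1x1 reduce
        body.append(conv3x3(channel * 2, 3, 1, 1))  # 3x3 expand
    body.append(conv1x1(channel, 1, 0, 1))          # final 1x1 reduce
    tip = conv3x3(channel * 2, 3, 1, 1)             # 3x3 after the Conv Set
    return body, tip


body, tip = build_detection_block(256, conv3x3=deformable_cell)
print([kind for kind, _, _ in body], tip)
```

In the real model the builders would return HybridSequential cells with the same (channel, kernel, padding, stride) signature as _conv2d, so only the 3x3 positions change while the 1x1 layers stay plain.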

1. Deformable Conv

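Mxnet's contrib provides Deformable Conv in a v1 and a v2 version. We use v1 as the example below.

The v2 version is already in the mxnet source and documentation, but it seems that the pip-installed mxnet cannot import it yet. If it is available, it is used like this: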

from mxnet.gluon.contrib.cnn import ModulatedDeformableConvolution

https://github.com/msracver/Deformable-ConvNets/tree/master/DCNv2_op

from mxnet.gluon.contrib.cnn import DeformableConvolution

def _conv2d_deformable(channel, kernel, padding, stride, norm_layer=BatchNorm, norm_kwargs=None):
    """A deformable-conv-bn-leakyrelu cell (Deformable Conv v1)"""
    cell = nn.HybridSequential(prefix='')
    # the deformable convolution takes the place of nn.Conv2D in _conv2d
    cell.add(DeformableConvolution(channel, kernel_size=kernel,
                                   strides=stride, padding=padding))
    cell.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
    cell.add(nn.LeakyReLU(0.1))
    return cell

This lets us replace the plain _conv2d convolution in selected layers with _conv2d_deformable, the Deformable Conv v1 version. Note that deformable conv is relatively slow.
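As a rough check on what a deformable layer adds, the sketch below counts weights under the assumption that the layer follows the v1 design: an extra offset convolution with the same kernel size predicts 2 values (x and y) per kernel position per deformable group. The helper name and the example sizes are made up for illustration.

```python
def deformable_conv_weights(in_channels, out_channels, kernel, num_deformable_group=1):
    """Return (offset channels, offset conv weights, main conv weights), biases ignored."""
    # Assumed v1 layout: one (dx, dy) pair per kernel position per deformable group
    offset_channels = 2 * kernel * kernel * num_deformable_group
    offset_weights = kernel * kernel * in_channels * offset_channels
    main_weights = kernel * kernel * in_channels * out_channels
    return offset_channels, offset_weights, main_weights


offset_channels, offset_weights, main_weights = deformable_conv_weights(512, 1024, 3)
print(offset_channels, offset_weights, main_weights, offset_weights / main_weights)
```

Under these assumptions the offset branch adds less than 2% more weights to a 512-to-1024 3x3 layer, which suggests the slowdown comes mostly from sampling the input at irregular positions rather than from parameter count.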

2. SE Block

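We can follow the implementation in mobilenetv3 from model_zoo, which uses HardSigmoid in place of Sigmoid and HardSwish as the activation function. These two are fast approximations of the Sigmoid and Swish functions respectively: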

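from mxnet import gluon
from mxnet.gluon import nn
from gluoncv.nn import ReLU6, HardSigmoid, HardSwish

class SE_block(gluon.HybridBlock):
    """Squeeze-and-excitation block with an extra residual connection"""
    def __init__(self, channel, **kwargs):
        super(SE_block, self).__init__(**kwargs)
        self.se = nn.HybridSequential(prefix='')
        # squeeze: global average pooling down to shape (N, C, 1, 1)
        self.se.add(nn.GlobalAvgPool2D())
        # excitation: reduce to channel // 4, then restore to channel
        self.se.add(nn.Conv2D(channel // 4, kernel_size=1, use_bias=True))
        self.se.add(HardSwish())
        self.se.add(nn.Conv2D(channel, kernel_size=1, use_bias=True))
        self.se.add(HardSigmoid())

    # pylint: disable=unused-argument
    def hybrid_forward(self, F, x, *args):
        residual = x
        # per-channel weights produced by the HardSigmoid gate
        w = self.se(x)
        x = F.broadcast_mul(x, w)
        # unlike a standard SE block, the unscaled input is added back
        return x + residual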
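To make the "fast approximation" concrete, the sketch below compares the hard variants with the exact functions in plain Python. It assumes the MobileNetV3-style definitions, hard_sigmoid(x) = ReLU6(x + 3) / 6 and hard_swish(x) = x * hard_sigmoid(x); the function names here are illustrative and are not the Gluoncv classes.

```python
import math


def relu6(x):
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x):
    """Piecewise-linear approximation of sigmoid (assumed MobileNetV3 form)."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x):
    """Piecewise approximation of swish, x * sigmoid(x)."""
    return x * hard_sigmoid(x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def swish(x):
    return x * sigmoid(x)


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, round(hard_sigmoid(x), 4), round(sigmoid(x), 4),
          round(hard_swish(x), 4), round(swish(x), 4))
```

The hard versions need only additions, clamps, and multiplications, with no exponential. Since the gate stays in [0, 1], the SE_block above, with its residual, scales each channel by a factor between 1 and 2.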

Now we can add the SE block to _conv2d:

def _conv2d_se(channel, kernel, padding, stride, norm_layer=BatchNorm, norm_kwargs=None):
    """A conv-bn-leakyrelu cell followed by an SE block"""
    cell = nn.HybridSequential(prefix='')
    cell.add(nn.Conv2D(channel, kernel_size=kernel,
                       strides=stride, padding=padding, use_bias=False))
    cell.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
    cell.add(nn.LeakyReLU(0.1))
    cell.add(SE_block(channel))
    return cell

Of course, we can also put both Deformable Conv and SE in:

def _conv2d_deformable_se(channel, kernel, padding, stride, use_se=False, norm_layer=BatchNorm, norm_kwargs=None):
    """A deformable-conv-bn-leakyrelu cell with an optional SE block"""
    cell = nn.HybridSequential(prefix='')
    cell.add(DeformableConvolution(channel, kernel_size=kernel,
                                   strides=stride, padding=padding))
    cell.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
    cell.add(nn.LeakyReLU(0.1))
    # the SE block is appended only when requested
    if use_se:
        cell.add(SE_block(channel))
    return cell

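3. Depthwise separable convolution (DW)

We can directly use the mobilenet modules from model_zoo. DW convolution effectively reduces the number of model parameters, but some 1x1 convolutions in Yolo may become redundant with the 1x1 convolution built into DW, so keep or drop them as needed. Also, mobilenetv2 introduced the inverted residual structure, which expands channels with a 1x1 convolution, extracts features with a 3x3 convolution, and then reduces channels with another 1x1 convolution. This is the opposite of the bottleneck in Resnet, which reduces channels first, so pay attention to how channel counts change when using DW convolution. If you use plain v1-style DW convolution, just remember that the 1x1 convolution handles channel transformation and the 3x3 convolution handles independent per-channel feature extraction. If you are currently using the mobilenet version of Yolov3, most of the model's parameters sit in the detection layers after the backbone, so DW convolution can be used to compress the model: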

def _add_conv(out, channels=1, kernel=1, stride=1, pad=0,
              num_group=1, active=True, relu6=False, norm_layer=BatchNorm, norm_kwargs=None):
    """Append conv + BN (+ optional ReLU/ReLU6) to the container `out`"""
    out.add(nn.Conv2D(channels, kernel, stride, pad, groups=num_group, use_bias=False))
    out.add(norm_layer(scale=True, **({} if norm_kwargs is None else norm_kwargs)))
    if active:
        out.add(ReLU6() if relu6 else nn.Activation('relu'))

def _conv2d_dw(out, dw_channels, channels, stride, relu6=False,
               norm_layer=BatchNorm, norm_kwargs=None):
    """Append a depthwise separable convolution to `out`; the input must have dw_channels channels"""
    # 3x3 depthwise convolution: one filter per channel (groups == channels)
    _add_conv(out, channels=dw_channels, kernel=3, stride=stride,
              pad=1, num_group=dw_channels, relu6=relu6,
              norm_layer=norm_layer, norm_kwargs=norm_kwargs)
    # 1x1 pointwise convolution: mixes channels and sets the output width
    _add_conv(out, channels=channels, relu6=relu6,
              norm_layer=norm_layer, norm_kwargs=norm_kwargs)

When using it, you can no longer write net.add(_conv2d_dw). As you can see, _conv2d_dw takes net as an argument, so call _conv2d_dw(net, ...) directly.
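The size of the saving can be estimated with a short calculation. The sketch below compares the weights of a standard 3x3 convolution with those of a 3x3 depthwise plus 1x1 pointwise pair, ignoring BN parameters. The 512-to-1024 layer size is an assumed example and the helper names are hypothetical.

```python
def standard_conv_weights(in_channels, out_channels, kernel=3):
    """Weights of an ordinary bias-free convolution."""
    return kernel * kernel * in_channels * out_channels


def depthwise_separable_weights(in_channels, out_channels, kernel=3):
    """Weights of a depthwise conv (one filter per channel) plus a 1x1 pointwise conv."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


standard = standard_conv_weights(512, 1024)
separable = depthwise_separable_weights(512, 1024)
print(standard, separable, standard / separable)
```

For this layer size the separable version needs roughly one ninth of the weights, and almost all of what remains is in the 1x1 pointwise convolution.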

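4. Inception (ASPP)

The ASPP used in the DeepLab segmentation series feels very similar to the original Inception, differing only in the receptive field sizes. So we can take the module from the DeepLab series in model_zoo and modify it slightly: a 1x1 convolution and several 3x3 convolutions with different dilation rates replace the large kernels and stacked 3x3 convolutions of Inception. Each branch's out_channel is half of in_channel, which keeps the channel count from growing too large after concat. To keep the model small, we also use an approximate depthwise separable convolution by setting groups=out_channels: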

from gluoncv.nn import ReLU6, HardSigmoid, HardSwish

def _ASPPConv(in_channels, out_channels, atrous_rate, norm_layer, norm_kwargs):
    """3x3 dilated grouped conv + BN + HardSwish; padding == dilation keeps the spatial size"""
    block = nn.HybridSequential()
    with block.name_scope():
        block.add(nn.Conv2D(in_channels=in_channels, channels=out_channels, kernel_size=3,
                            padding=atrous_rate, dilation=atrous_rate, use_bias=False,
                            groups=out_channels))
        block.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
        block.add(HardSwish())
    return block

class _ASPP(nn.HybridBlock):
    """Four parallel branches (one 1x1, three dilated 3x3) concatenated along the channel axis"""
    def __init__(self, in_channels, atrous_rates=(1, 2, 3), norm_layer=BatchNorm, norm_kwargs=None, **kwargs):
        super(_ASPP, self).__init__(**kwargs)
        # each branch outputs half of in_channels (in_channels should be even)
        out_channels = in_channels // 2
        b0 = nn.HybridSequential()
        with b0.name_scope():
            b0.add(nn.Conv2D(in_channels=in_channels, channels=out_channels, kernel_size=1, use_bias=False, groups=out_channels))
            b0.add(norm_layer(epsilon=1e-5, momentum=0.9, **({} if norm_kwargs is None else norm_kwargs)))
            b0.add(HardSwish())
        # exactly three dilation rates are expected
        rate1, rate2, rate3 = tuple(atrous_rates)
        b1 = _ASPPConv(in_channels, out_channels, rate1, norm_layer, norm_kwargs)
        b2 = _ASPPConv(in_channels, out_channels, rate2, norm_layer, norm_kwargs)
        b3 = _ASPPConv(in_channels, out_channels, rate3, norm_layer, norm_kwargs)
        # run all branches on the same input and concat on axis 1 (channels)
        self.concurrent = gluon.contrib.nn.HybridConcurrent(axis=1)
        with self.concurrent.name_scope():
            self.concurrent.add(b0)
            self.concurrent.add(b1)
            self.concurrent.add(b2)
            self.concurrent.add(b3)

    def hybrid_forward(self, F, x):
        return self.concurrent(x)

Then _ASPP can be used in place of _conv2d:

net.add(_ASPP(in_channels=channel, atrous_rates=[1, 2, 3], norm_layer=norm_layer, norm_kwargs=norm_kwargs))
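The shapes involved can be checked with a little arithmetic. The sketch below uses the standard convolution output-size formula to show that padding equal to the dilation rate keeps the spatial size for a 3x3 kernel, lists the effective kernel size of each branch, and computes the channel count after concat. The helper names and the 13x13 feature map are assumptions for illustration.

```python
def conv_out_size(size, kernel, padding, dilation, stride=1):
    """Spatial output size of a convolution along one axis."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def aspp_summary(in_channels, atrous_rates=(1, 2, 3), size=13):
    """Return (channels after concat, per-branch output sizes, per-branch effective kernels)."""
    branch_channels = in_channels // 2
    # branch 0 is the 1x1 conv; the others are 3x3 convs with padding == dilation
    sizes = [conv_out_size(size, 1, 0, 1)]
    sizes += [conv_out_size(size, 3, rate, rate) for rate in atrous_rates]
    effective_kernels = [1] + [rate * 2 + 1 for rate in atrous_rates]
    out_channels = branch_channels * (1 + len(atrous_rates))
    return out_channels, sizes, effective_kernels


print(aspp_summary(512))
```

With four branches of in_channels // 2 each, the block outputs twice its input channels, the same width as the 3x3 expand layer _conv2d(channel * 2, 3, 1, 1), which makes those 3x3 positions the natural place for the swap.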

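In my own rough tests, applying deformable conv on top of a mobilenet1.0 base network did raise mAP on VOC by nearly one point, but GPU memory use went up and speed went down, so it is not quite worth it. ASPP, thanks to its DW convolution, effectively compresses the model size while mAP stays almost unchanged. I have not tried pure DW convolution yet, but it is worth a try.

After the modifications are done, as described before, rename the gluoncv folder to gluoncv_new, and then import get_model to load your own modified model: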

from gluoncv_new.model_zoo import get_model

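Remember that get_model has two parameters: pretrained_base=True means the backbone feature extraction network uses the imagenet pretrained model, and pretrained=True means using a model pretrained on VOC or COCO. Since we have changed the network, only pretrained_base=True can be used.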
