Rebuilding the CRNN Network with Depthwise Separable Convolution

Depthwise separable convolution can sharply reduce the number of parameters in a network. This post rewrites the CNN backbone of CRNN with it and compares the parameter counts before and after.
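Before touching the code, the expected saving can be estimated by hand: a standard K×K convolution mapping C_in to C_out channels has K²·C_in·C_out weights, while the depthwise-separable replacement has only K²·C_in (depthwise) plus C_in·C_out (pointwise). A minimal sketch of the arithmetic (not from the original post; the 64→128 layer is chosen to match conv1 below):

def conv_params(k, c_in, c_out, bias=True):
    # standard KxK convolution: every output channel sees every input channel
    return k * k * c_in * c_out + (c_out if bias else 0)

def separable_params(k, c_in, c_out, bias=True):
    # depthwise: one KxK kernel per input channel (groups=c_in)
    depthwise = k * k * c_in + (c_in if bias else 0)
    # pointwise: a 1x1 convolution that mixes the channels
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise

print(conv_params(3, 64, 128))       # 73856 -> Conv2d-4 in the first summary below
print(separable_params(3, 64, 128))  # 8960  -> 640 + 8320 in the second summary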

The CNN part of the original CRNN network:

import torch.nn as nn

class CNN0(nn.Module):

    def __init__(self,imageHeight,nChannel):
        super(CNN0,self).__init__()
        assert imageHeight % 32 == 0,'image Height has to be a multiple of 32'

        self.conv0 = nn.Conv2d(in_channels=nChannel,out_channels=64,kernel_size=3,stride=1,padding=1)
        self.relu0 = nn.ReLU(inplace=True)
        self.pool0 = nn.MaxPool2d(kernel_size=2,stride=2)

        self.conv1 = nn.Conv2d(in_channels=64,out_channels=128,kernel_size=3,stride=1,padding=1)
        self.relu1 = nn.ReLU(inplace=True)
        self.pool1 = nn.MaxPool2d(kernel_size=2,stride=2)

        self.conv2 = nn.Conv2d(in_channels=128,out_channels=256,kernel_size=3,stride=1,padding=1)
        self.batchNorm2 = nn.BatchNorm2d(256)
        self.relu2 = nn.ReLU(inplace=True)

        self.conv3 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.relu3 = nn.ReLU(inplace=True)
        self.pool3 = nn.MaxPool2d(kernel_size=(2,2),stride=(2,1),padding=(0,1))

        self.conv4 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.batchNorm4 = nn.BatchNorm2d(512)
        self.relu4 = nn.ReLU(inplace=True)

        self.conv5 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.relu5 = nn.ReLU(inplace=True)
        self.pool5 = nn.MaxPool2d(kernel_size=(2,2),stride=(2,1),padding=(0,1))

        self.conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=2, stride=1, padding=0)
        self.batchNorm6 = nn.BatchNorm2d(512)
        self.relu6 = nn.ReLU(inplace=True)

    def forward(self,input):
        conv0 = self.conv0(input)
        relu0 = self.relu0(conv0)
        pool0 = self.pool0(relu0)
        print(pool0.size())  # debug: trace the feature-map size after each stage

        conv1 = self.conv1(pool0)
        relu1 = self.relu1(conv1)
        pool1 = self.pool1(relu1)
        print(pool1.size())

        conv2 = self.conv2(pool1)
        batchNormal2 = self.batchNorm2(conv2)
        relu2 = self.relu2(batchNormal2)
        print(relu2.size())

        conv3 = self.conv3(relu2)
        relu3 = self.relu3(conv3)
        pool3 = self.pool3(relu3)
        print(pool3.size())

        conv4 = self.conv4(pool3)
        batchNormal4 = self.batchNorm4(conv4)
        relu4 = self.relu4(batchNormal4)
        print(relu4.size())

        conv5 = self.conv5(relu4)
        relu5 = self.relu5(conv5)
        pool5 = self.pool5(relu5)
        print(pool5.size())

        conv6 = self.conv6(pool5)
        batchNormal6 = self.batchNorm6(conv6)
        relu6 = self.relu6(batchNormal6)
        print(relu6.size())

        return relu6

Print the network structure and parameter counts with torchsummary:

import torch
from torchsummary import summary

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = CNN0(32, 1).to(device)
summary(net, input_size=(1, 32, 320))  # summary() prints the table itself

Output:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [-1, 64, 32, 320]             640
              ReLU-2          [-1, 64, 32, 320]               0
         MaxPool2d-3          [-1, 64, 16, 160]               0
            Conv2d-4         [-1, 128, 16, 160]          73,856
              ReLU-5         [-1, 128, 16, 160]               0
         MaxPool2d-6           [-1, 128, 8, 80]               0
            Conv2d-7           [-1, 256, 8, 80]         295,168
       BatchNorm2d-8           [-1, 256, 8, 80]             512
              ReLU-9           [-1, 256, 8, 80]               0
           Conv2d-10           [-1, 256, 8, 80]         590,080
             ReLU-11           [-1, 256, 8, 80]               0
        MaxPool2d-12           [-1, 256, 4, 81]               0
           Conv2d-13           [-1, 512, 4, 81]       1,180,160
      BatchNorm2d-14           [-1, 512, 4, 81]           1,024
             ReLU-15           [-1, 512, 4, 81]               0
           Conv2d-16           [-1, 512, 4, 81]       2,359,808
             ReLU-17           [-1, 512, 4, 81]               0
        MaxPool2d-18           [-1, 512, 2, 82]               0
           Conv2d-19           [-1, 512, 1, 81]       1,049,088
      BatchNorm2d-20           [-1, 512, 1, 81]           1,024
             ReLU-21           [-1, 512, 1, 81]               0
================================================================
Total params: 5,551,360
Trainable params: 5,551,360
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.04
Forward/backward pass size (MB): 31.68
Params size (MB): 21.18
Estimated Total Size (MB): 52.89
----------------------------------------------------------------
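The total can also be reproduced without torchsummary as a cross-check (a quick sketch, assuming the CNN0 class above is in scope):

total = sum(p.numel() for p in CNN0(32, 1).parameters())
print(total)  # 5551360, matching the summary above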

The CNN part of the CRNN network rebuilt with depthwise separable convolutions: each standard convolution becomes a depthwise convolution (groups equal to the channel count) followed by a 1×1 pointwise convolution:

class CNN(nn.Module):

    def __init__(self,imageHeight,nChannel):
        super(CNN,self).__init__()
        assert imageHeight % 32 == 0,'image Height has to be a multiple of 32'

        self.depth_conv0 = nn.Conv2d(in_channels=nChannel,out_channels=nChannel,kernel_size=3,stride=1,padding=1,groups=nChannel)
        self.point_conv0 = nn.Conv2d(in_channels=nChannel,out_channels=64,kernel_size=1,stride=1,padding=0,groups=1)
        self.relu0 = nn.ReLU(inplace=True)
        self.pool0 = nn.MaxPool2d(kernel_size=2,stride=2)

        self.depth_conv1 = nn.Conv2d(in_channels=64,out_channels=64,kernel_size=3,stride=1,padding=1,groups=64)
        self.point_conv1 = nn.Conv2d(in_channels=64,out_channels=128,kernel_size=1,stride=1,padding=0,groups=1)
        self.relu1 = nn.ReLU(inplace=True)
        self.pool1 = nn.MaxPool2d(kernel_size=2,stride=2)

        self.depth_conv2 = nn.Conv2d(in_channels=128,out_channels=128,kernel_size=3,stride=1,padding=1,groups=128)
        self.point_conv2 = nn.Conv2d(in_channels=128,out_channels=256,kernel_size=1,stride=1,padding=0,groups=1)
        self.batchNorm2 = nn.BatchNorm2d(256)
        self.relu2 = nn.ReLU(inplace=True)

        self.depth_conv3 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1, groups=256)
        self.point_conv3 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=1, stride=1, padding=0, groups=1)
        self.relu3 = nn.ReLU(inplace=True)
        self.pool3 = nn.MaxPool2d(kernel_size=(2,2),stride=(2,1),padding=(0,1))

        self.depth_conv4 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, stride=1, padding=1, groups=256)
        self.point_conv4 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=1, stride=1, padding=0, groups=1)
        self.batchNorm4 = nn.BatchNorm2d(512)
        self.relu4 = nn.ReLU(inplace=True)

        self.depth_conv5 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=3, stride=1, padding=1, groups=512)
        self.point_conv5 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=1, stride=1, padding=0, groups=1)
        self.relu5 = nn.ReLU(inplace=True)
        self.pool5 = nn.MaxPool2d(kernel_size=(2,2),stride=(2,1),padding=(0,1))

        # replaced: self.conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=2, stride=1, padding=0)
        self.depth_conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=2, stride=1, padding=0, groups=512)
        self.point_conv6 = nn.Conv2d(in_channels=512, out_channels=512, kernel_size=1, stride=1, padding=0, groups=1)
        self.batchNorm6 = nn.BatchNorm2d(512)
        self.relu6 = nn.ReLU(inplace=True)

    def forward(self,input):
        depth0 = self.depth_conv0(input)
        point0 = self.point_conv0(depth0)
        relu0 = self.relu0(point0)
        pool0 = self.pool0(relu0)

        depth1 = self.depth_conv1(pool0)
        point1 = self.point_conv1(depth1)
        relu1 = self.relu1(point1)
        pool1 = self.pool1(relu1)

        depth2 = self.depth_conv2(pool1)
        point2 = self.point_conv2(depth2)
        batchNormal2 = self.batchNorm2(point2)
        relu2 = self.relu2(batchNormal2)

        depth3 = self.depth_conv3(relu2)
        point3 = self.point_conv3(depth3)
        relu3 = self.relu3(point3)
        pool3 = self.pool3(relu3)

        depth4 = self.depth_conv4(pool3)
        point4 = self.point_conv4(depth4)
        batchNormal4 = self.batchNorm4(point4)
        relu4 = self.relu4(batchNormal4)

        depth5 = self.depth_conv5(relu4)
        point5 = self.point_conv5(depth5)
        relu5 = self.relu5(point5)
        pool5 = self.pool5(relu5)

        depth6 = self.depth_conv6(pool5)
        point6 = self.point_conv6(depth6)
        batchNormal6 = self.batchNorm6(point6)
        relu6 = self.relu6(batchNormal6)

        return relu6
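Every depthwise/pointwise pair above follows the same pattern, so it could be factored into a small reusable module. This refactoring is not in the original code; a sketch:

class DepthwiseSeparableConv(nn.Module):
    """A depthwise KxK convolution followed by a 1x1 pointwise convolution."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        # groups=in_channels convolves each channel with its own kernel
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   stride=stride, padding=padding, groups=in_channels)
        # the 1x1 convolution mixes the per-channel outputs
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

With this, a pair such as depth_conv1/point_conv1 collapses to self.sep_conv1 = DepthwiseSeparableConv(64, 128).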

The printed summary:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 1, 32, 320]              10
            Conv2d-2          [-1, 64, 32, 320]             128
              ReLU-3          [-1, 64, 32, 320]               0
         MaxPool2d-4          [-1, 64, 16, 160]               0
            Conv2d-5          [-1, 64, 16, 160]             640
            Conv2d-6         [-1, 128, 16, 160]           8,320
              ReLU-7         [-1, 128, 16, 160]               0
         MaxPool2d-8           [-1, 128, 8, 80]               0
            Conv2d-9           [-1, 128, 8, 80]           1,280
           Conv2d-10           [-1, 256, 8, 80]          33,024
      BatchNorm2d-11           [-1, 256, 8, 80]             512
             ReLU-12           [-1, 256, 8, 80]               0
           Conv2d-13           [-1, 256, 8, 80]           2,560
           Conv2d-14           [-1, 256, 8, 80]          65,792
             ReLU-15           [-1, 256, 8, 80]               0
        MaxPool2d-16           [-1, 256, 4, 81]               0
           Conv2d-17           [-1, 256, 4, 81]           2,560
           Conv2d-18           [-1, 512, 4, 81]         131,584
      BatchNorm2d-19           [-1, 512, 4, 81]           1,024
             ReLU-20           [-1, 512, 4, 81]               0
           Conv2d-21           [-1, 512, 4, 81]           5,120
           Conv2d-22           [-1, 512, 4, 81]         262,656
             ReLU-23           [-1, 512, 4, 81]               0
        MaxPool2d-24           [-1, 512, 2, 82]               0
           Conv2d-25           [-1, 512, 1, 81]           2,560
           Conv2d-26           [-1, 512, 1, 81]         262,656
      BatchNorm2d-27           [-1, 512, 1, 81]           1,024
             ReLU-28           [-1, 512, 1, 81]               0
================================================================
Total params: 781,450
Trainable params: 781,450
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.04
Forward/backward pass size (MB): 37.09
Params size (MB): 2.98
Estimated Total Size (MB): 40.11
----------------------------------------------------------------
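The same cross-check for the separable version (assuming the CNN class above is in scope):

print(sum(p.numel() for p in CNN(32, 1).parameters()))  # 781450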

As the two summaries show, the parameter count drops by roughly a factor of 7 (from 5,551,360 to 781,450; the theoretical limit for 3×3 kernels is about 9×). How this affects recognition accuracy has not yet been tested.
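Accuracy aside, a quick smoke test at least confirms that the new backbone is drop-in compatible, producing the same output shape for the same input (a sketch, assuming both classes above are in scope):

import torch

x = torch.randn(1, 1, 32, 320)   # one grayscale 32x320 text-line image
y0 = CNN0(32, 1)(x)              # original backbone (also prints stage sizes)
y1 = CNN(32, 1)(x)               # depthwise-separable backbone
print(y0.shape, y1.shape)        # both: torch.Size([1, 512, 1, 81])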

### Answer 1:

Depthwise separable convolution splits the convolution operation into two steps: a depthwise convolution and a pointwise convolution. The depthwise convolution convolves each input channel separately, and the pointwise convolution then combines the per-channel results as a learned linear mix. This reduces the parameter count, speeds up computation, and compresses the model while largely preserving accuracy.

### Answer 2:

Depthwise separable convolution is a lightweight convolution operation that substantially reduces a model's parameter count and computation, making both training and inference more efficient.

Compared with a standard convolution, it consists of two steps: a depthwise convolution and a pointwise convolution. Each input channel is first convolved independently (the depthwise step), and the per-channel features are then fused by a pointwise convolution to produce the final output.

For a convolutional network used for recognition, the main appeal is the sharp drop in parameters and computation. Because the depthwise step handles each channel separately, it greatly cuts compute time, while the pointwise step controls the number of output channels, reducing parameters and memory use.

As an example, for an H×W×C input that a standard convolution would process with K×K×C×S kernels (S output channels), the separable version first applies C separate K×K kernels (the depthwise step) and then a 1×1 pointwise convolution with S output channels. Ignoring biases, the weight count falls from K²CS to K²C + CS, a ratio of 1/S + 1/K², so the same output size is produced with far fewer parameters and operations, speeding up training and inference.

In short, depthwise separable convolution is a lightweight operation that compresses parameters and computation; in areas such as object recognition it is a powerful tool for designing more efficient convolutional networks.

### Answer 3:

Depthwise separable convolution is a structure used in convolutional neural networks (CNNs) to reduce the parameter count and computation. It was proposed by researchers at Google and is used extensively in MobileNet.

An ordinary CNN is built from convolution, pooling, and fully connected layers, with the convolution layers consuming most of the time and memory. Depthwise separable convolution splits each convolution into two parts: a depthwise convolution and a pointwise convolution.

First, the depthwise convolution operates on each input channel separately rather than on all channels at once, which reduces the number of kernels. Second, the pointwise convolution applies 1×1 kernels to mix the channels, keeping cross-channel interaction to a minimal cost.

Thanks to this factorization, depthwise separable convolution markedly lowers computation and parameter counts, yielding a smaller, faster model at comparable accuracy; it is more efficient than an ordinary convolution.

It suits resource-constrained environments such as mobile devices and wireless networks, and is widely applied in modern machine learning, including computer vision (image recognition, object detection) and speech processing (speech recognition).
