Computing Convolution and Pooling Layer Output Sizes in a CNN

张小猪的家


Article link: https://blog.csdn.net/weixin_39574469/article/details/117672677


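When working with a convolutional neural network, you need to know how a convolutional layer's output size relates to its input size. For an input of height H_in and width W_in, a kernel of size K, padding P, and stride S, the output size is:

H_out = (H_in - K + 2 * P) / S + 1

W_out = (W_in - K + 2 * P) / S + 1

D_out = number of output channels

When the division is not exact, PyTorch rounds the result down.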
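The size relationship can be sketched as a small helper. This is a minimal illustration in plain Python, not part of the original post; the function name conv_output_size is made up for this example, and it assumes a square kernel with the same stride and padding on both axes.

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Output length along one spatial axis: (in - K + 2P) // S + 1."""
    # Integer division rounds down when the stride does not divide evenly.
    return (in_size - kernel + 2 * padding) // stride + 1


print(conv_output_size(32, 5))            # 28: stride 1, no padding
print(conv_output_size(32, 5, stride=2))  # 14: 27 / 2 is rounded down to 13
```

Apply the same function once for the height and once for the width; the depth of the output is simply the number of output channels.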

That is abstract, so here is an example: the handwritten-digit recognition network from the official PyTorch tutorial:
https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py

[Figure: network architecture diagram]
The corresponding code:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 5*5 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
print(net)

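First, the input is an image of size (32,32,1), written here as (H, W, D). PyTorch itself stores tensors as (N, C, H, W), so the actual input tensor has shape (1, 1, 32, 32). The first layer is self.conv1 = nn.Conv2d(1, 6, 5), where the first argument is the number of input channels, the second is the number of output channels, and the third is the kernel size.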

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
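Of the parameters in this signature, only kernel_size, stride, padding, and dilation change the spatial output size; groups, bias, and padding_mode do not. The sketch below follows the output-shape formula given in the PyTorch Conv2d documentation. The helper name conv2d_out is hypothetical, and it handles one axis with integer arguments only.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a Conv2d layer along one axis."""
    # A dilated kernel covers dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


print(conv2d_out(32, 5))              # 28: the defaults used by nn.Conv2d(1, 6, 5)
print(conv2d_out(32, 5, padding=2))   # 32: padding 2 keeps a 5x5 kernel size-preserving
print(conv2d_out(32, 3, dilation=2))  # 28: a dilated 3x3 kernel spans 5 positions
```

With dilation=1 this reduces to (size - kernel_size + 2 * padding) / stride + 1, which is the form used in the calculations in this post.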

Plugging these values into the formula (kernel size 5, padding 0, stride 1):

H = (32 - 5 + 2 * 0) / 1 + 1 = 28

W = (32 - 5 + 2 * 0) / 1 + 1 = 28

D = 6

After the convolution, the size becomes (28,28,6).

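Next comes pooling. x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2)) applies a (2,2) max pool: each non-overlapping 2x2 region is reduced to a single value, its maximum. Because the stride defaults to the window size, H and W are both halved while the number of channels stays the same.
[Figure: max pooling over 2x2 regions]
The size is now (14,14,6).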
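To make the pooling step concrete, here is a minimal pure-Python sketch of 2x2 max pooling on a single channel. It is an illustration only, not how PyTorch implements F.max_pool2d; the function name max_pool_2x2 is made up, and it assumes the height and width are even.

```python
def max_pool_2x2(grid):
    """Max-pool a 2D list with a 2x2 window and stride 2 (even H and W assumed)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            # Take the largest of the four values in each non-overlapping 2x2 block.
            max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 8], [14, 16]]
```

The 4x4 input becomes 2x2, which is the same halving that turns each 28x28 channel into 14x14.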

Next is the second convolution. Its input is the (14,14,6) output of the first pooling step, and the layer is self.conv2 = nn.Conv2d(6, 16, 5). Plugging into the formula:

H = (14 - 5 + 2 * 0) / 1 + 1 = 10

W = (14 - 5 + 2 * 0) / 1 + 1 = 10

D = 16

After the convolution, the size becomes (10,10,16).

Next is the second pooling step, x = F.max_pool2d(F.relu(self.conv2(x)), 2), which is again a (2,2) max pool. After pooling, the size is (5,5,16).

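This is the input to the fully connected layer that follows: flattening gives 16 * 5 * 5 = 400 values, which is why the layer is defined as self.fc1 = nn.Linear(16 * 5 * 5, 120).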
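The whole walk-through can be replayed in a few lines. This is a sketch in plain Python that only tracks shapes in the post's (H, W, D) notation; the helpers conv_shape and pool_shape are hypothetical names, and pooling is assumed to use a stride equal to its window.

```python
def conv_shape(shape, out_channels, kernel, stride=1, padding=0):
    """Shape after a convolution, in (H, W, D) order."""
    h, w, _ = shape
    out_h = (h - kernel + 2 * padding) // stride + 1
    out_w = (w - kernel + 2 * padding) // stride + 1
    return (out_h, out_w, out_channels)


def pool_shape(shape, window=2):
    """Shape after max pooling whose stride equals the window size."""
    h, w, d = shape
    return (h // window, w // window, d)


shapes = [(32, 32, 1)]                                            # input image
shapes.append(conv_shape(shapes[-1], out_channels=6, kernel=5))   # conv1
shapes.append(pool_shape(shapes[-1]))                             # first pooling
shapes.append(conv_shape(shapes[-1], out_channels=16, kernel=5))  # conv2
shapes.append(pool_shape(shapes[-1]))                             # second pooling

h, w, d = shapes[-1]
flat_features = h * w * d
print(shapes)
print(flat_features)  # 400, the in_features of fc1
```

If the input image size or a kernel size changes, rerunning a trace like this shows what the first argument of nn.Linear has to become.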
