[PyTorch Study Notes] Beginner-Level, Heavily Annotated Notes for Personal Use - 6. Convolution Layers (CSDN blog)

Article link: https://blog.csdn.net/kanbarakun/article/details/132962997

Convolution Layers

nn.Conv1d: 1D convolution; nn.Conv2d: 2D convolution; nn.Conv3d: 3D convolution. These notes use Conv2d as the example.

nn.Conv2d

torch.nn is a higher-level wrapper around torch.nn.functional, so for everyday use it is usually enough to learn torch.nn.

Official documentation: https://pytorch.org/docs/stable/nn.html?highlight=torch+nn#module-torch.nn

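(1) Parameters

in_channels (int): number of channels in the input image

out_channels (int): number of channels produced by the convolution

weight: the weights, i.e. the convolution kernel. This is a learnable attribute of the layer, not a constructor argument.

kernel_size (int or tuple): size of the convolution kernel

bias (bool): whether a learnable bias is added to the convolution result. Default: True

stride: step size, i.e. how far the kernel moves at each step. Pass a single number to use the same step vertically and horizontally, or a tuple (sH, sW) to use different vertical and horizontal steps. Default: 1

padding: padding added around the input image; either a single number or a tuple. Default: 0

dilation: spacing between kernel elements (dilated, or atrous, convolution). With dilation=2, for example, there is a one-element gap between neighbouring kernel elements. Default: 1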
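As a minimal sketch of how these parameters fit together, the snippet below builds two layers and inspects their shapes. The channel counts and image size are arbitrary values chosen for illustration, not values from these notes.

```python
import torch
from torch import nn

# Illustrative values only: 3 input channels (e.g. RGB), 6 output channels, 3x3 kernel.
conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3,
                 stride=1, padding=0, dilation=1, bias=True)

# weight has shape (out_channels, in_channels, kernel_h, kernel_w)
weight_shape = tuple(conv.weight.shape)
# bias holds one value per output channel
bias_shape = tuple(conv.bias.shape)

x = torch.randn(1, 3, 32, 32)  # (batch, channels, height, width)
out_shape = tuple(conv(x).shape)

# Different vertical/horizontal settings are passed as (height, width) tuples.
conv_tuple = nn.Conv2d(3, 6, kernel_size=3, stride=(2, 1), padding=(1, 0))
out_tuple_shape = tuple(conv_tuple(x).shape)

print(weight_shape)     # (6, 3, 3, 3)
print(bias_shape)       # (6,)
print(out_shape)        # (1, 6, 30, 30)
print(out_tuple_shape)  # (1, 6, 16, 30)
```

Note that weight and bias are created by the layer itself; they only show up as attributes after construction.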

(2) How convolution works

Computation:

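Multiply corresponding elements and add up the products. Taking the first step as an example, the output is 1*1 + 2*2 + 0*1 + 0*0 + 1*1 + 2*0 + 1*2 + 2*1 + 1*0 = 10.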
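The multiply-and-sum for the first step can be checked directly. This is a small sketch that takes the top-left 3x3 patch of the 5x5 input used in these notes and the 3x3 kernel, multiplies them element-wise, and sums the result:

```python
import torch

# Top-left 3x3 patch of the 5x5 input
patch = torch.tensor([[1, 2, 0],
                      [0, 1, 2],
                      [1, 2, 1]])

# The 3x3 convolution kernel
kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])

# Element-wise product, then sum over all nine positions
first_output = int((patch * kernel).sum())
print(first_output)  # 10
```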

How far the kernel moves at each step is determined by stride.

With padding=1, the padded border is filled with zeros.
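Stride and padding together determine the output size. Along each spatial dimension the PyTorch documentation gives floor((in + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1. The sketch below wraps that formula in a helper (conv_out_size is a hypothetical name introduced here for illustration, not a PyTorch function) and compares it with what F.conv2d actually returns for a 5x5 input and a 3x3 kernel:

```python
import torch
import torch.nn.functional as F

def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Hypothetical helper: output length along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

x = torch.zeros(1, 1, 5, 5)  # values do not matter, only shapes
w = torch.zeros(1, 1, 3, 3)

results = []
for stride, padding in [(1, 0), (2, 0), (1, 1)]:
    predicted = conv_out_size(5, 3, stride=stride, padding=padding)
    actual = F.conv2d(x, w, stride=stride, padding=padding).shape[-1]
    results.append((stride, padding, predicted, actual))
    print(stride, padding, predicted, actual)
```

For these three settings the formula gives 3, 2, and 5, matching the shapes returned by F.conv2d.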

A simple convolution implementation:

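# Reproduce the first computation from (2)
import torch
import torch.nn.functional as F     # the functional API is conventionally imported as F

# conv2d needs tensors; two levels of brackets make a 2D matrix
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]])

kernel = torch.tensor([[1, 2, 1],
                       [0, 1, 0],
                       [2, 1, 0]])

# F.conv2d expects both input and weight to be 4D, but the tensors above only have two dimensions (height and width),
# so reshape them with torch.reshape() to match the shapes described in the documentation

# input: one plane, i.e. 1 channel, with a batch size of 1 -> (batch, channels, H, W)
input = torch.reshape(input, (1, 1, 5, 5))
# kernel: (out_channels, in_channels, kH, kW)
kernel = torch.reshape(kernel, (1, 1, 3, 3))

output = F.conv2d(input, kernel, stride=1)
# padding is covered later; it defaults to 0 here
print(output)     # 3x3 result: [[10, 12, 12], [18, 16, 16], [13, 9, 3]]

# Try stride=2
output2 = F.conv2d(input, kernel, stride=2)
print(output2)    # 2x2 result: [[10, 12], [13, 3]]

# With padding
output3 = F.conv2d(input, kernel, stride=1, padding=1)
print(output3)    # 5x5 result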
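To make the padding=1 case more concrete, here is a sketch that pads the same 5x5 input with zeros by hand using F.pad and then convolves without padding. Under the assumption that conv2d's default padding is plain zero padding, both routes should give the same 5x5 result. Float tensors are used here only to keep the example independent of integer-dtype support.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1, 2, 0, 3, 1],
                  [0, 1, 2, 3, 1],
                  [1, 2, 1, 0, 0],
                  [5, 2, 3, 1, 1],
                  [2, 1, 0, 1, 1]], dtype=torch.float32).reshape(1, 1, 5, 5)
k = torch.tensor([[1, 2, 1],
                  [0, 1, 0],
                  [2, 1, 0]], dtype=torch.float32).reshape(1, 1, 3, 3)

# Built-in zero padding of one pixel on every side
out_builtin = F.conv2d(x, k, stride=1, padding=1)

# The same thing by hand: pad (left, right, top, bottom) with zeros, then convolve unpadded
x_padded = F.pad(x, (1, 1, 1, 1), mode="constant", value=0)
out_manual = F.conv2d(x_padded, k, stride=1, padding=0)

print(out_builtin.shape)                     # torch.Size([1, 1, 5, 5])
print(torch.equal(out_builtin, out_manual))  # True
```

The inner 3x3 region of the padded result is the same as the unpadded stride=1 output, since those positions never touch the zero border.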

Next, build a small convolutional network on the CIFAR10 dataset, which also serves as a review of earlier material:

import torch
import torchvision
from torch import nn
from torch.nn import Conv2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# Prepare the dataset; ToTensor() converts the images to tensors so they can be fed to the network directly
dataset = torchvision.datasets.CIFAR10("./data", train=False,
                                       transform=torchvision.transforms.ToTensor(),
                                       download=True)
dataloader = DataLoader(dataset, batch_size=64)

class MyNet(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 input channels, 6 output channels, kernel_size of 3
        self.conv1 = Conv2d(3, 6, 3, stride=1, padding=0)

    def forward(self, x):
        x = self.conv1(x)
        return x

mynet = MyNet()

writer = SummaryWriter("logs")    # TensorBoard gives a more visual way to inspect the results
step = 0

for data in dataloader:
    imgs, targets = data
    output = mynet(imgs)
    # Compare the shapes of the original images and the output
    print(imgs.shape)      # torch.Size([64, 3, 32, 32])
    print(output.shape)    # torch.Size([64, 6, 30, 30])
    writer.add_images("input", imgs, step)

    # Passing the 6-channel output straight to add_images raises an error, because it expects 3 channels by default.
    # Below is a not-very-rigorous workaround. When you do not know what a dimension should be in reshape,
    # write -1 and it is inferred from the other dimensions.
    output = torch.reshape(output, (-1, 3, 30, 30))
    writer.add_images("output", output, step)

    step += 1

writer.close()
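To see what the reshape workaround actually does, here is a small sketch using a random tensor as a stand-in for one full batch of conv output (no dataset download needed). The -1 is inferred as 128, and each 6-channel sample is split into two consecutive 3-channel "images":

```python
import torch

# Stand-in for the conv output of one full batch: (batch=64, channels=6, 30, 30)
output = torch.randn(64, 6, 30, 30)

reshaped = torch.reshape(output, (-1, 3, 30, 30))
print(reshaped.shape)  # torch.Size([128, 3, 30, 30])

# The first two reshaped images come from channels 0-2 and 3-5 of sample 0
first_matches = torch.equal(reshaped[0], output[0, :3])
second_matches = torch.equal(reshaped[1], output[0, 3:])
print(first_matches, second_matches)  # True True
```

This is why the approach is only a rough visualization: the three feature maps shown together as "RGB" have no real color meaning.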

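(3) Other questions

Difference between nn.Conv2d and F.conv2d

nn.Conv2d is a class-style interface, while F.conv2d is a function-style interface. (As a rule of thumb, capitalized names are classes and lowercase names are functions.)

nn.Conv2d is a [2D convolution layer] that holds its own weight and bias, while F.conv2d is a [2D convolution operation] to which you pass the weight yourself.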
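One way to see the relationship is to feed the layer's own parameters to the function. This is a minimal sketch with arbitrary illustrative sizes; it checks that calling the nn.Conv2d module gives the same result as calling F.conv2d with that module's weight and bias:

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)

# Class-style: the layer creates and stores weight and bias itself
layer = nn.Conv2d(3, 6, kernel_size=3, stride=1, padding=0)
x = torch.randn(2, 3, 8, 8)
out_module = layer(x)

# Function-style: weight and bias are passed in explicitly
out_functional = F.conv2d(x, layer.weight, layer.bias, stride=1, padding=0)

same = torch.allclose(out_module, out_functional)
print(same)  # True
```

In practice the class-style layer is the convenient choice inside an nn.Module, because its parameters are registered for training automatically.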

(1) nn.Conv2d

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)