1 Preface
While following the PyTorch edition of 李沐's 《动手学深度学习》 (Dive into Deep Learning) teaching videos on B站, I found many PyTorch operations I could not follow: I was only "getting hands-on" without engaging my brain. So I plan to use the winter break to review and organize PyTorch operations properly, in order to keep up with the course.
Learning resources:
- Bilibili uploader 我是土堆's video: PyTorch深度学习快速入门教程(绝对通俗易懂!)【小土堆】
- PyTorch中文手册 (pytorch handbook)
- Datawhale open-source content: 深入浅出PyTorch (thorough-pytorch)
2 Pooling layers
2.1 Max pooling layer: torch.nn.MaxPool2d
- kernel_size – the size of the pooling window
- stride – the step the pooling window moves by; defaults to kernel_size
- padding – implicit zero padding added on both sides
- dilation – the spacing between window elements (the same parameter as in dilated convolution)
- return_indices – if True, will return the max indices along with the outputs; useful for torch.nn.MaxUnpool2d later
- ceil_mode – whether to keep a window that runs past the input border: when True, uses ceil instead of floor to compute the output shape (see the shape sketch right after this list)
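To make the shape rules concrete, here is a minimal sketch that computes the output size by hand, following the output-size formula in the PyTorch docs, and checks it against an actual MaxPool2d (the helper name maxpool2d_out_size is mine, not from the tutorial):

import math
import torch
from torch.nn import MaxPool2d

def maxpool2d_out_size(h_in, kernel_size, stride=None, padding=0, dilation=1, ceil_mode=False):
    # Output-size formula from the PyTorch docs; stride defaults to kernel_size.
    stride = stride if stride is not None else kernel_size
    rounding = math.ceil if ceil_mode else math.floor
    return rounding((h_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

x = torch.randn(1, 1, 5, 5)
print(maxpool2d_out_size(5, kernel_size=3, ceil_mode=True))   # 2
print(maxpool2d_out_size(5, kernel_size=3, ceil_mode=False))  # 1
print(MaxPool2d(kernel_size=3, ceil_mode=True)(x).shape)      # torch.Size([1, 1, 2, 2])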
2.1.1 The purpose of max pooling
Shrink the image and save computation.
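To put a number on "saves computation" (a minimal sketch of my own, not from the tutorial): a 3×3 max pool shrinks a CIFAR-10-sized image from 32×32 to 10×10, roughly a 10× reduction in elements.

import torch
from torch.nn import MaxPool2d

img = torch.randn(1, 3, 32, 32)        # a CIFAR-10-sized input
out = MaxPool2d(kernel_size=3)(img)    # stride defaults to kernel_size
print(img.shape, '->', out.shape)      # (1, 3, 32, 32) -> (1, 3, 10, 10)
print(img.numel() / out.numel())       # 10.24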
2.1.2 Dilated convolution
Dilation inserts gaps between the positions the kernel samples, so the same number of weights covers a larger receptive field; in MaxPool2d the dilation parameter spreads the pooling window out in the same way.
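A minimal sketch of the effect (my own toy example, not from the tutorial): a 2×2 window with dilation=2 samples four points spread over a 3×3 region.

import torch
from torch.nn import MaxPool2d

x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)
# kernel_size=2 with dilation=2: each window samples a 2x2 grid spread over a 3x3 area.
pool = MaxPool2d(kernel_size=2, dilation=2, stride=1)
print(pool(x).shape)  # torch.Size([1, 1, 3, 3])
print(pool(x))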
2.1.3 ceil_mode: ceil vs. floor rounding
When a window would run past the input border, ceil_mode=True keeps the partial window and rounds the output size up; ceil_mode=False discards it and rounds down. The code below tries both.
2.2 The max pooling operation
2.3 Code walkthrough
import torch
import torchvision
from torch import nn
from torch.nn import MaxPool2d
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# A small 5x5 input; float32 because older PyTorch versions do not
# implement max pooling for integer tensors.
input = torch.tensor([[1, 2, 0, 3, 1],
                      [0, 1, 2, 3, 1],
                      [1, 2, 1, 0, 0],
                      [5, 2, 3, 1, 1],
                      [2, 1, 0, 1, 1]], dtype=torch.float32)

# Reshape to (N, C, H, W), the layout nn layers expect; -1 infers the batch dimension.
input = torch.reshape(input, (-1, 1, 5, 5))
print(input.shape)
torch.Size([1, 1, 5, 5])
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        # stride defaults to kernel_size, so the 3x3 window moves in steps of 3
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=True)

    def forward(self, input):
        output = self.maxpool1(input)
        return output
tudui = Tudui()
output = tudui(input)
output
tensor([[[[2., 3.],
          [5., 1.]]]])
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.maxpool1 = MaxPool2d(kernel_size=3, ceil_mode=False)  # try ceil_mode=False

    def forward(self, input):
        output = self.maxpool1(input)
        return output
tudui = Tudui()
output = tudui(input)
output
tensor([[[[2.]]]])
With a 3×3 window moving in steps of 3 over a 5×5 input, ceil_mode=True keeps the partial windows at the right and bottom borders, giving a ceil(5/3) = 2×2 output; ceil_mode=False drops them, so only the single complete 3×3 window survives (floor(5/3) = 1) and its maximum is 2.
2.3.1 Trying a real dataset
dataset = torchvision.datasets.CIFAR10("../dataset", train=False, download=True,
                                       transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset, batch_size=64)
Files already downloaded and verified
tudui = Tudui()
writer = SummaryWriter('logs_maxpool')
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images("input", imgs, step)
    # max pooling keeps the channel count, so the output can be logged directly
    output = tudui(imgs)
    writer.add_images("output", output, step)
    step += 1
writer.close()
Comparing input and output in TensorBoard, the pooled images are noticeably blurrier (mosaic-like) yet still retain part of the features, at a much lower computational cost.
3 Non-linear Activations
- The main purpose of the non-linear transform is to introduce non-linearity into the network, which improves the model's ability to generalize (see the sketch below).
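A quick way to see why the non-linearity matters (a minimal sketch of my own, not from the tutorial): without an activation in between, two stacked Linear layers are exactly equivalent to a single Linear layer, so depth alone adds no expressive power.

import torch
from torch import nn

# Two linear layers with no activation in between.
f = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 3))

# They collapse into one linear layer: W = W2 @ W1, b = W2 @ b1 + b2.
g = nn.Linear(4, 3)
with torch.no_grad():
    g.weight.copy_(f[1].weight @ f[0].weight)
    g.bias.copy_(f[1].weight @ f[0].bias + f[1].bias)

x = torch.randn(5, 4)
print(torch.allclose(f(x), g(x), atol=1e-6))  # True: the stack is still linear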
3.1 torch.nn.ReLU
$\mathrm{ReLU}(x) = (x)^+ = \max(0, x)$
3.1.1 Code
- About the inplace parameter: inplace=True overwrites the input tensor directly (saving a little memory, but the original values are lost), while the default inplace=False returns a new tensor and leaves the input unchanged, as the sketch below shows.
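A minimal sketch of the difference (toy values of my own):

import torch
from torch.nn import ReLU

x = torch.tensor([1.0, -0.5])
out = ReLU(inplace=False)(x)   # default: the result goes to a new tensor
print(x)                       # tensor([ 1.0000, -0.5000]) -- x is untouched

y = torch.tensor([1.0, -0.5])
ReLU(inplace=True)(y)          # the activation overwrites y itself
print(y)                       # tensor([1., 0.])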
from torch import nn
from torch.nn import ReLU
from torch.nn import Sigmoid
input = torch.tensor([[1, -0.5],
                      [-1, 3]])
input = torch.reshape(input, (-1, 1, 2, 2))
print(input.shape)
torch.Size([1, 1, 2, 2])
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.relu1 = ReLU()

    def forward(self, input):
        output = self.relu1(input)
        return output
tudui = Tudui()
output = tudui(input)
print(output)
tensor([[[[1., 0.],
          [0., 3.]]]])
3.2 torch.nn.Sigmoid
$\mathrm{Sigmoid}(x) = \sigma(x) = \frac{1}{1 + \exp(-x)}$
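A quick numeric check of the formula against the built-in (using the same 2×2 values as the input above):

import torch

x = torch.tensor([[1.0, -0.5],
                  [-1.0, 3.0]])
print(torch.sigmoid(x))         # built-in
print(1 / (1 + torch.exp(-x)))  # matches the formula above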
3.2.1 Hands-on with the dataset
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.relu1 = ReLU()        # kept from the previous section; unused here
        self.sigmoid1 = Sigmoid()

    def forward(self, input):
        output = self.sigmoid1(input)
        return output

tudui = Tudui()
output = tudui(input)
print(output)
tensor([[[[0.7311, 0.3775],
          [0.2689, 0.9526]]]])
tudui = Tudui()
writer = SummaryWriter('logs_sigmoid')
step = 0
for data in dataloader:
    imgs, targets = data
    writer.add_images("input", imgs, step)
    output = tudui(imgs)
    writer.add_images("output", output, step)
    step += 1
writer.close()
We can see in TensorBoard that the output images look washed out and gray: ToTensor pixels lie in [0, 1], and the sigmoid maps that range to roughly [0.5, 0.73], compressing the contrast.