Chapter 4: Basic Building Blocks of Neural Networks

Latest recommended article published 2025-10-11 22:16:11

大白学深度学习

License: CC 4.0 BY-SA

Category column: 【小土堆】Pytorch入门. Article tags: neural networks, deep learning, pytorch, learning

Article link: https://blog.csdn.net/MYX_309/article/details/153055299


nn.Module

The basic skeleton of a neural network: every network you define must inherit from this class.

Official documentation

https://docs.pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module

Template

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

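forward() is the forward-propagation function: the input is passed through forward(), and its return value is the output.

Example

Create a minimal network that returns its input plus 1.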

import torch
from torch import nn

class Myx(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self,input):
        output=input +1
        return output
myx=Myx()  # instantiate the network
x=torch.tensor(1.0)  # create a tensor x
res=myx(x)  # calling the instance runs forward(): nn.Module defines __call__, which invokes forward()
print(res)

tensor(2.)
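To make the dispatch concrete, here is a deliberately simplified sketch in plain Python of how a callable object can route `instance(x)` to `forward(x)`. `MiniModule` and `AddOne` are hypothetical names used only for illustration; the real `nn.Module.__call__` does considerably more (running hooks, for instance), so treat this as a mental model rather than PyTorch's implementation.

```python
class MiniModule:
    """Toy stand-in for nn.Module (hypothetical, illustration only)."""

    def __call__(self, *args, **kwargs):
        # Calling the instance forwards the arguments to forward().
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError("subclasses must implement forward()")


class AddOne(MiniModule):
    def forward(self, x):
        return x + 1


add_one = AddOne()
result = add_one(1.0)  # equivalent to add_one.forward(1.0)
print(result)
```

This is why the example above writes `myx(x)` rather than `myx.forward(x)`: calling the instance is the intended entry point.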

Convolution

The convolution operation

Official documentation

https://docs.pytorch.org/docs/stable/generated/torch.nn.functional.conv2d.html#torch.nn.functional.conv2d

Parameters

torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)

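input must have shape (minibatch, in_channels, iH, iW); minibatch is the number of samples fetched at a time.

weight must have shape (out_channels, in_channels/groups, kH, kW); groups is usually 1.

channels is the number of channels, H is the height, and W is the width.

stride is the step size. It can be a single number or a tuple (sH, sW) giving the steps along the height and the width separately.

Example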

import torch
import torch.nn.functional as F
input=torch.tensor([[1,2,0,3,1],
                    [0,1,2,3,1],
                    [1,2,1,0,0],
                    [5,2,3,1,1],
                    [2,1,0,1,1]])
kernel=torch.tensor([[1,2,1],
                     [0,1,0],
                     [2,1,0]])
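input=torch.reshape(input,(1,1,5,5))  # reshape to the required (minibatch, in_channels, iH, iW) layout
kernel=torch.reshape(kernel,(1,1,3,3))  # reshape to the required (out_channels, in_channels/groups, kH, kW) layout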
output=F.conv2d(input,kernel,stride=1)
print(output)

tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])

Changing the stride

output=F.conv2d(input,kernel,stride=2)
print(output)

tensor([[[[10, 12],
          [13,  3]]]])

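Changing the padding: padding=1 adds one row or column of zeros on every side (top, bottom, left, right). The default, padding=0, adds none.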

output1=F.conv2d(input,kernel,stride=1)
print(output1)
output2=F.conv2d(input,kernel,stride=1,padding=1)
print(output2)

tensor([[[[10, 12, 12],
          [18, 16, 16],
          [13,  9,  3]]]])
tensor([[[[ 1,  3,  4, 10,  8],
          [ 5, 10, 12, 12,  6],
          [ 7, 18, 16, 16,  8],
          [11, 13,  9,  3,  4],
          [14, 13,  9,  7,  4]]]])
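The three results above can be reproduced by hand. The following is a minimal plain-Python sketch of the sliding-window computation (strictly a cross-correlation, since the kernel is not flipped). `conv2d_manual` is a hypothetical helper written for this illustration, not a PyTorch function, and it handles only a single 2-D image and a single 2-D kernel.

```python
def conv2d_manual(image, kernel, stride=1, padding=0):
    """Slide `kernel` over `image` (both 2-D lists) and sum the products."""
    # Zero-pad the image on all four sides.
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]
    for row in image:
        padded.append([0] * padding + list(row) + [0] * padding)
    padded += [[0] * width for _ in range(padding)]

    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(padded) - k_h) // stride + 1
    out_w = (width - k_w) // stride + 1

    output = []
    for i in range(out_h):
        out_row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Multiply the current window by the kernel elementwise and sum.
            total = sum(
                padded[top + a][left + b] * kernel[a][b]
                for a in range(k_h)
                for b in range(k_w)
            )
            out_row.append(total)
        output.append(out_row)
    return output


image = [[1, 2, 0, 3, 1],
         [0, 1, 2, 3, 1],
         [1, 2, 1, 0, 0],
         [5, 2, 3, 1, 1],
         [2, 1, 0, 1, 1]]
kernel = [[1, 2, 1],
          [0, 1, 0],
          [2, 1, 0]]

out_stride1 = conv2d_manual(image, kernel)
out_stride2 = conv2d_manual(image, kernel, stride=2)
out_padded = conv2d_manual(image, kernel, padding=1)
print(out_stride1)
print(out_stride2)
print(out_padded)
```

For example, the top-left value 10 comes from 1·1 + 2·2 + 0·1 + 0·0 + 1·1 + 2·0 + 1·2 + 2·1 + 1·0.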

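Convolution layer

Official documentation

https://docs.pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d

Parameters

(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)

The most important are the first two: the number of input channels and the number of output channels.

The kernel weights are learnable and are adjusted continuously during training.

out_channels can be thought of as the number of kernels: each kernel produces one output feature map, so the number of kernels equals the number of output channels.

Example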

from torch import nn
import torchvision
import torch
from torch.utils.data import DataLoader

datasets=torchvision.datasets.CIFAR10(root="./dataset",train=False,transform=torchvision.transforms.ToTensor())
dataloader=DataLoader(datasets,batch_size=64)

class Myx(nn.Module):
    def __init__(self):
        super(Myx, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3, stride=1, padding=0)  # define the convolution layer
    def forward(self,x):  # define the forward pass
        x=self.conv1(x)
        return x
myx=Myx()
print(myx)

Myx(
  (conv1): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
)

The printout above shows the structure of the network. Now let's pass the images through it and look at the resulting shapes.

for data in dataloader:
    images,targets=data
    output=myx(images)
    print(images.shape)
    print(output.shape)

torch.Size([64, 3, 32, 32])
torch.Size([64, 6, 30, 30])
torch.Size([64, 3, 32, 32])
torch.Size([64, 6, 30, 30])
...

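In the images shape, 64 is the number of images in the batch, 3 is the number of channels, and 32x32 is the image size.

After the convolution, the number of images in output is unchanged, the number of channels becomes 6 (the number of kernels that was configured), and the image size shrinks to 30x30.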
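The shrinkage from 32 to 30 follows from the output-size formula given in the Conv2d documentation, floor((size + 2·padding − dilation·(kernel_size − 1) − 1) / stride) + 1. Below is a small sketch of that formula; `conv_output_size` is a hypothetical helper name, not part of PyTorch.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


print(conv_output_size(32, kernel_size=3))            # CIFAR10 example above
print(conv_output_size(5, kernel_size=3, stride=2))   # 5x5 input, stride 2
print(conv_output_size(5, kernel_size=3, padding=1))  # 5x5 input, padding 1
```

The same formula accounts for the 5x5 examples in the previous section: stride 2 gives a 2x2 result, and padding 1 keeps the size at 5x5.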

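Purpose

The convolution layer is the core component of a convolutional neural network (CNN). Its main job is to extract local features from the input through the convolution operation. These features include edges, textures, and shapes, and they help the network understand the content of an image.

Pooling

Official documentation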

https://docs.pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d

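The pooling operation

Max pooling slides a window (the pooling kernel) of a given size over the input and takes the maximum value inside each window.

Parameters

kernel_size

stride=None (the stride defaults to the size of the pooling kernel)

padding=0

dilation=1

return_indices=False (usually left unchanged)

ceil_mode=False (when True, incomplete windows at the right and bottom edges are kept and their maximum is taken; when False, they are discarded)

Pooling layer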

from torch import nn
import torch

input=torch.tensor([[1,2,0,3,1],
                    [0,1,2,3,1],
                    [1,2,1,0,0],
                    [5,2,3,1,1],
                    [2,1,0,1,1]],dtype=torch.float)  # max_pool2d is not implemented for 'Long' tensors, so the dtype is set to float
input=input.reshape(-1,1,5,5)  # the input must have shape (N,C,H,W), where N is the batch size
print(input.shape)

class Myx(nn.Module):
    def __init__(self):
        super(Myx, self).__init__()
        self.maxpool1=nn.MaxPool2d(kernel_size=3,ceil_mode=True)
    def forward(self,input):
        output=self.maxpool1(input)
        return output
myx=Myx()
res=myx(input)
print(res)

With ceil_mode set to False, the incomplete windows along the right and bottom edges are discarded, so the output is smaller:

class Myx(nn.Module):
    def __init__(self):
        super(Myx, self).__init__()
        self.maxpool1=nn.MaxPool2d(kernel_size=3,ceil_mode=False)
    def forward(self,input):
        output=self.maxpool1(input)
        return output
myx=Myx()
res=myx(input)
print(res)

tensor([[[[2.]]]])
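The effect of ceil_mode can be traced by hand. The sketch below re-implements max pooling on a plain 2-D list, assuming no padding, no dilation, and an input at least as large as the kernel. `max_pool2d_manual` is a hypothetical helper for illustration, not a PyTorch function.

```python
import math


def max_pool2d_manual(image, kernel_size, stride=None, ceil_mode=False):
    """Max pooling over a 2-D list (no padding, no dilation)."""
    stride = kernel_size if stride is None else stride
    height, width = len(image), len(image[0])

    def out_size(n):
        steps = (n - kernel_size) / stride
        size = (math.ceil(steps) if ceil_mode else math.floor(steps)) + 1
        # In ceil mode the last window must still start inside the input.
        if ceil_mode and (size - 1) * stride >= n:
            size -= 1
        return size

    output = []
    for i in range(out_size(height)):
        out_row = []
        for j in range(out_size(width)):
            top, left = i * stride, j * stride
            # Windows that run past the edge are clipped to the input.
            window = [
                image[r][c]
                for r in range(top, min(top + kernel_size, height))
                for c in range(left, min(left + kernel_size, width))
            ]
            out_row.append(max(window))
        output.append(out_row)
    return output


image = [[1, 2, 0, 3, 1],
         [0, 1, 2, 3, 1],
         [1, 2, 1, 0, 0],
         [5, 2, 3, 1, 1],
         [2, 1, 0, 1, 1]]

pooled_ceil = max_pool2d_manual(image, kernel_size=3, ceil_mode=True)
pooled_floor = max_pool2d_manual(image, kernel_size=3, ceil_mode=False)
print(pooled_ceil)
print(pooled_floor)
```

With a 3x3 kernel and the default stride of 3, only one complete window fits in a 5x5 input. ceil_mode=True additionally keeps the three partial windows on the right and bottom edges, which is why it yields a 2x2 result instead of a single value.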

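Purpose

Pooling downsamples the feature maps while keeping their most prominent features, which reduces the amount of computation and speeds up training.

Non-linear activations (activation layers)

Activation functions such as ReLU and Sigmoid

Official documentation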

https://docs.pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU

ReLU

Inputs less than 0 produce an output of 0; inputs greater than 0 pass through unchanged.

from torch import nn
from torch.nn import ReLU
import torch

input=torch.tensor([[1,-0.5],
                    [-1,3]])
print(input.shape)
input=input.reshape(-1,1,2,2)  # reshape to (batch_size, channel, H, W)
print(input.shape)

class Myx(nn.Module):
    def __init__(self):
        super().__init__()
        self.relu1=ReLU()  # the inplace argument controls whether the input tensor is overwritten with the result; it defaults to False
    def forward(self,input):
        output=self.relu1(input)
        return output

myx=Myx()
res=myx(input)
print(res)

torch.Size([2, 2])
torch.Size([1, 1, 2, 2])
tensor([[[[1., 0.],
          [0., 3.]]]])

Sigmoid

The sigmoid function maps its input into the range 0 to 1.

import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torch.nn import Sigmoid
dataset=torchvision.datasets.CIFAR10(r"D:\myx\learn_pytorch\.dataset",train=False,transform=torchvision.transforms.ToTensor(),download=True)
dataloader=DataLoader(dataset,batch_size=64)

class Myx(nn.Module):
    def __init__(self):
        super().__init__()
        self.sigmoid1=Sigmoid()
    def forward(self,input):
        output=self.sigmoid1(input)
        return output
myx=Myx()

writer=SummaryWriter("../logs")
step=0

for data in dataloader:
    images,targets=data
    writer.add_images("input",images,step)  # add_images accepts a whole (N,C,H,W) batch
    output=myx(images)
    writer.add_images("output",output,step)
    step+=1
writer.close()
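As a quick check of what Sigmoid computes for each element, here is the formula 1 / (1 + e^(−x)) written in plain Python. This is only a scalar sketch of the math, not how PyTorch implements the layer.

```python
import math


def sigmoid(x):
    """Logistic sigmoid for a single float."""
    return 1.0 / (1.0 + math.exp(-x))


values = [-4.0, 0.0, 1.0, 4.0]
mapped = [sigmoid(v) for v in values]
print(mapped)
```

Because ToTensor scales pixel values into the range 0 to 1, the images written under "output" only contain values between sigmoid(0) = 0.5 and sigmoid(1), roughly 0.73, so they look washed out compared with the inputs.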

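Purpose

The main purpose of a non-linear activation function is to introduce non-linearity into the network, which lets it learn and approximate complex data patterns and functions. Without it, any stack of linear layers collapses into a single linear transformation, so the network is no more expressive than a linear model.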
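The collapse of stacked linear layers can be verified numerically. The sketch below assumes two small `nn.Linear` layers with arbitrary sizes chosen for illustration, and checks that applying them back to back equals one linear map with weight W2·W1 and bias W2·b1 + b2.

```python
import torch
from torch import nn

torch.manual_seed(0)
first = nn.Linear(4, 3)   # sizes are arbitrary, for illustration only
second = nn.Linear(3, 2)
x = torch.randn(5, 4)

with torch.no_grad():
    # Two linear layers applied one after the other, no activation between.
    stacked = second(first(x))

    # The equivalent single linear map: W = W2 W1, b = W2 b1 + b2.
    weight = second.weight @ first.weight
    bias = second.weight @ first.bias + second.bias
    collapsed = x @ weight.T + bias

print(torch.allclose(stacked, collapsed, atol=1e-5))
```

Placing an activation such as ReLU between the two layers breaks this equivalence, which is what gives deeper networks their extra expressive power.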

Linear layer (fully connected layer)

Official documentation

https://docs.pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear

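Purpose

Applies a linear transformation to the data, mapping it from in_features dimensions to out_features dimensions. Every output is connected to every input (hence "fully connected"), and the layer accepts batched and multi-dimensional input, transforming only the last dimension.

Parameters

(in_features, out_features, bias=True, device=None, dtype=None)

Example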

import torch
from torch import nn
import torchvision.datasets
from torch.nn import Linear
from torch.utils.data import DataLoader

dataset=torchvision.datasets.CIFAR10(r"D:\myx\learn_pytorch\.dataset",train=False,transform=torchvision.transforms.ToTensor(),download=True)
dataloader=DataLoader(dataset,batch_size=64,drop_last=True)  # drop the final 16-image batch, which would not match Linear(196608, 10)

class Myx(nn.Module):
    def __init__(self):
        super(Myx, self).__init__()
        self.linear1=Linear(196608,10)

    def forward(self,input):
        output=self.linear1(input)
        return output
myx=Myx()

for data in dataloader:
    images,targets=data
    print(images.shape)
    images=torch.reshape(images,(1,1,1,-1))
    print(images.shape)
    res=myx(images)
    print(res.shape)

torch.Size([64, 3, 32, 32])
torch.Size([1, 1, 1, 196608])
torch.Size([1, 1, 1, 10])

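torch.flatten can be used to flatten multi-dimensional data into one dimension. It serves the same purpose as reshape(1,1,1,-1) above, except that the result has shape [196608] instead of [1, 1, 1, 196608].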

for data in dataloader:
    images,targets=data
    print(images.shape)
    images=torch.flatten(images)
    print(images.shape)
    res=myx(images)
    print(res.shape)

torch.Size([64, 3, 32, 32])
torch.Size([196608])
torch.Size([10])
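Both variants above merge the whole batch into one long vector. A different and more common arrangement, shown here as an alternative sketch rather than the approach used in this post, keeps the batch dimension and flattens only each image, so the layer produces one row of 10 scores per image. The random tensor stands in for a CIFAR10 batch, and the variable names are assumptions made for this example.

```python
import torch
from torch import nn

images = torch.randn(64, 3, 32, 32)  # stand-in for one CIFAR10 batch

# Flatten everything except the batch dimension: (64, 3, 32, 32) -> (64, 3072).
flat = torch.flatten(images, start_dim=1)

classifier = nn.Linear(3 * 32 * 32, 10)
scores = classifier(flat)

print(flat.shape)
print(scores.shape)
```

With this layout, in_features depends only on the image size (3 x 32 x 32 = 3072), not on the batch size, so a smaller final batch no longer causes a shape mismatch.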

Other layers

Normalization layer

Official documentation

https://docs.pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html#torch.nn.BatchNorm2d

Parameters

(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)

num_features is the number of channels of the input.
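A small sketch of what the layer does in training mode: each channel is normalized across the batch and spatial dimensions, so its mean becomes roughly 0 and its variance roughly 1. The input shape and scaling below are arbitrary choices made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch_norm = nn.BatchNorm2d(num_features=3)  # 3 matches the channel dimension
batch_norm.train()  # use batch statistics rather than running averages

# Random input with a non-zero mean and a large spread.
x = torch.randn(8, 3, 4, 4) * 5 + 2
y = batch_norm(x)

# Statistics per channel, taken over batch, height and width.
channel_mean = y.mean(dim=(0, 2, 3))
channel_var = y.var(dim=(0, 2, 3), unbiased=False)
print(channel_mean)
print(channel_var)
```

The output keeps the input's shape. With the default affine=True, the learnable scale and shift start at 1 and 0, so a freshly created layer performs plain normalization.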

Recurrent layers

Include RNN (recurrent neural network) and LSTM (long short-term memory) layers, commonly used for sequence data.

Transformer layers

Widely used in NLP.

Dropout layers

Help prevent overfitting.
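As a brief illustration of the behaviour, `nn.Dropout` randomly zeroes elements during training and scales the survivors by 1 / (1 − p), while in evaluation mode it passes the input through unchanged. The probability and tensor size below are arbitrary choices made for this sketch.

```python
import torch
from torch import nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

dropout.train()
train_out = dropout(x)  # each element is either 0 or 1 / (1 - 0.5) = 2

dropout.eval()
eval_out = dropout(x)   # identity in evaluation mode

print(train_out.unique())
print(torch.equal(eval_out, x))
```

This is why a model should be switched with `model.train()` and `model.eval()` at the appropriate times: the same layer behaves differently in the two modes.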

Sparse layers

Handle sparse data, reducing computation and memory usage.
