PyTorch Learning (3): Model Layers

Suppose-dilemma

Last modified: 2023-04-19 14:44:57

First published: 2022-10-23 11:07:55

Article link: https://blog.csdn.net/ifhuke/article/details/127470608

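This article explains in detail how to define custom model layers in PyTorch, how to use pretrained models to improve results, and how to build models in different ways with Sequential, ModuleList, and ModuleDict. It covers examples of subclassing nn.Module, using add_module, the Sequential container, and the more advanced ModuleList and ModuleDict containers.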

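Once the data pipeline can deliver data one batch at a time, the next step is to build the model. The model's architecture strongly affects how well learning works. All of PyTorch's model layers live in the torch.nn module.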

1. Custom model layers

To look up the API of each model layer and what it does, see the official documentation: https://pytorch.org/docs/stable/nn.html#shuffle-layers

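To define your own model layer, subclass nn.Module. The initializer __init__ must call the parent class's initializer first and then define the layers, and the class must implement forward(input), which connects the layers together.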

An example follows:

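from torch import nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()  # call the parent class initializer first
        # stacked LSTM; num_layers=5 gives 5 layers
        self.lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=5, batch_first=True)
        self.linear = nn.Linear(3, 3)
        # Block is assumed to be a user-defined nn.Module (not shown here)
        # whose forward takes (x, x_input)
        self.block = Block()

    # override the forward method
    def forward(self, x_input):
        # define how data flows through the network;
        # self.lstm returns (output, (h_n, c_n)), and we keep the last time step
        x = self.lstm(x_input)[0][:, -1, :]
        x = self.linear(x)
        y = self.block(x, x_input)
        return y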
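
To run the class above end to end, a minimal sketch might look like the following. `Block` is not defined in the post, so the version here is a hypothetical stand-in that simply adds the linear output to the last time step of the input. The input shape (batch of 4, sequence length 10, 3 features) is also an assumption.

```python
import torch
from torch import nn


class Block(nn.Module):
    """Hypothetical stand-in for the Block that the post does not define."""

    def forward(self, x, x_input):
        # Residual-style sum with the last time step of the raw input.
        return x + x_input[:, -1, :]


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=3, hidden_size=3, num_layers=5, batch_first=True)
        self.linear = nn.Linear(3, 3)
        self.block = Block()

    def forward(self, x_input):
        output, _ = self.lstm(x_input)      # output: (batch, seq_len, hidden_size)
        x = self.linear(output[:, -1, :])   # keep only the last time step
        return self.block(x, x_input)


net = Net()
x_input = torch.randn(4, 10, 3)  # (batch, seq_len, features), assumed shape
y = net(x_input)
print(y.shape)
```

Calling `net(x_input)` rather than `net.forward(x_input)` is the usual convention, since it also runs any hooks registered on the module.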

2. Using pretrained models

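Thanks to transfer learning, a pretrained model often gives better results. In PyTorch, many pretrained models are collected in the models module of torchvision, and the supported models are listed on the official site at https://pytorch.org/vision/stable/models.html. Using one is simple. For example, loading a residual network (ResNet) only takes the following statements: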

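from torchvision import models
# Note: pretrained=True is deprecated since torchvision 0.13; newer versions use
# models.resnet152(weights=models.ResNet152_Weights.DEFAULT) instead.
model = models.resnet152(pretrained=True)

Here pretrained=True downloads the pretrained weights (if they are not already cached) and loads them into the model. On the author's Windows machine the default download path is C:\Users\ASUS\.cache\torch\hub\checkpoints, and all downloaded models are stored there.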

3. Model-building styles

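PyTorch offers several styles for adding layers when building a model. The first is the one shown above: subclass nn.Module and define a custom class. Several other styles follow.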

3.1 Using the `add_module` method

add_module adds a layer to a model under a given name. For example:

net = nn.Sequential()
net.add_module("conv1", nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3))
net.add_module("pool1", nn.MaxPool2d(kernel_size=2, stride=2))
net.add_module("conv2", nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5))
net.add_module("pool2", nn.MaxPool2d(kernel_size=2, stride=2))
net.add_module("dropout", nn.Dropout2d(p=0.1))
net.add_module("adaptive_pool", nn.AdaptiveMaxPool2d((1, 1)))
net.add_module("flatten", nn.Flatten())
net.add_module("linear1", nn.Linear(64, 32))
net.add_module("relu", nn.ReLU())
net.add_module("linear2", nn.Linear(32, 1))
net.add_module("sigmoid", nn.Sigmoid())

print(net)
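
As a small illustration of what the names buy you, the sketch below builds a shortened version of the network above and then looks layers up by name. The input shape (4 RGB images of 32x32) is an assumption made for the example.

```python
import torch
from torch import nn

net = nn.Sequential()
net.add_module("conv1", nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3))
net.add_module("pool1", nn.MaxPool2d(kernel_size=2, stride=2))
net.add_module("flatten", nn.Flatten())

# Names passed to add_module show up in named_children() ...
names = [name for name, _ in net.named_children()]
print(names)

# ... and are also available as attributes on the container.
print(net.conv1)

x = torch.randn(4, 3, 32, 32)  # assumed input: a batch of 4 RGB 32x32 images
out = net(x)
print(out.shape)  # 32 channels * 15 * 15 after conv1 (30x30) and pool1 (15x15)
```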

3.2 Passing layers to `Sequential`

net = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Dropout2d(p=0.1),
    nn.AdaptiveMaxPool2d((1, 1)),
    nn.Flatten(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),
)

print(net)

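When layers are passed as positional arguments like this, they cannot be given individual names; each one is registered under its integer index (0, 1, 2, ...).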
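
If names are wanted without calling add_module repeatedly, nn.Sequential also accepts a single OrderedDict. The sketch below contrasts the two forms on a shortened stack of layers; the layer names are chosen for illustration.

```python
from collections import OrderedDict

from torch import nn

# Positional arguments: layers are registered under their index.
unnamed = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
unnamed_names = [name for name, _ in unnamed.named_children()]
print(unnamed_names)  # ['0', '1', '2']

# An OrderedDict: layers are registered under the given keys.
named = nn.Sequential(OrderedDict([
    ("linear1", nn.Linear(64, 32)),
    ("relu", nn.ReLU()),
    ("linear2", nn.Linear(32, 1)),
]))
named_names = [name for name, _ in named.named_children()]
print(named_names)  # ['linear1', 'relu', 'linear2']
```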

3.3 Sequential as a model container

class Net(nn.Module):

    def __init__(self):
        super().__init__()
        # convolutional feature extractor
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Dropout2d(p=0.1),
            nn.AdaptiveMaxPool2d((1, 1)),
        )
        # fully connected classifier head
        self.dense = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = self.conv(x)
        y = self.dense(x)
        return y
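
One visible effect of grouping layers into Sequential containers is how the parameters are named. The sketch below rebuilds the same network (with positional arguments for brevity), lists its state_dict keys, and counts its parameters. The 32x32 input size is an assumption.

```python
import torch
from torch import nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 64, kernel_size=5),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Dropout2d(p=0.1),
            nn.AdaptiveMaxPool2d((1, 1)),
        )
        self.dense = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dense(self.conv(x))


net = Net()

# Keys follow the pattern <container>.<index>.<parameter>, e.g. "conv.0.weight".
keys = list(net.state_dict().keys())
print(keys)

# Only the two Conv2d and two Linear layers carry parameters.
num_params = sum(p.numel() for p in net.parameters())
print(num_params)  # 896 + 51264 + 2080 + 33 = 54273

y = net(torch.randn(4, 3, 32, 32))  # assumed input size
print(y.shape)
```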

3.4 ModuleList as a model container

class Net(nn.Module):

    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Dropout2d(p=0.1),
            nn.AdaptiveMaxPool2d((1, 1)),
            nn.Flatten(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Sigmoid(),
        ])

    def forward(self, x):
        # ModuleList has no forward of its own, so apply the layers in order
        for layer in self.layers:
            x = layer(x)
        return x

net = Net()
print(net)
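
One reason to use nn.ModuleList instead of a plain Python list is parameter registration: modules held in a plain list are not registered with the parent module, so their parameters are missing from parameters() and state_dict(). The sketch below illustrates this with two hypothetical classes.

```python
from torch import nn


class PlainListNet(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain list is NOT registered: these parameters are invisible
        # to parameters(), state_dict(), .to(device), and optimizers.
        self.layers = [nn.Linear(64, 32), nn.Linear(32, 1)]


class ModuleListNet(nn.Module):
    def __init__(self):
        super().__init__()
        # ModuleList registers every contained module.
        self.layers = nn.ModuleList([nn.Linear(64, 32), nn.Linear(32, 1)])


plain_count = len(list(PlainListNet().parameters()))
registered_count = len(list(ModuleListNet().parameters()))
print(plain_count, registered_count)  # 0 4
```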

3.5 ModuleDict as a model container

class Net(nn.Module):

    def __init__(self):
        super().__init__()
        self.layers_dict = nn.ModuleDict({
            "conv1": nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3),
            "pool": nn.MaxPool2d(kernel_size=2, stride=2),
            "conv2": nn.Conv2d(in_channels=32, out_channels=64, kernel_size=5),
            "dropout": nn.Dropout2d(p=0.1),
            "adaptive": nn.AdaptiveMaxPool2d((1, 1)),
            "flatten": nn.Flatten(),
            "linear1": nn.Linear(64, 32),
            "relu": nn.ReLU(),
            "linear2": nn.Linear(32, 1),
            "sigmoid": nn.Sigmoid(),
        })

    def forward(self, x):
        # "pool" has no parameters, so the same module is reused after each conv
        layers = ["conv1", "pool", "conv2", "pool", "dropout", "adaptive",
                  "flatten", "linear1", "relu", "linear2", "sigmoid"]
        for layer in layers:
            x = self.layers_dict[layer](x)
        return x

net = Net()
print(net)
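
Because nn.ModuleDict is indexed by string keys, it also suits cases where a layer is chosen by name at call time. The sketch below is a hypothetical example of that pattern; the class, layer sizes, and the `act` argument are all assumptions for illustration.

```python
import torch
from torch import nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 4)
        self.activations = nn.ModuleDict({
            "relu": nn.ReLU(),
            "sigmoid": nn.Sigmoid(),
        })

    def forward(self, x, act):
        # Look up the activation module by its key.
        return self.activations[act](self.linear(x))


net = Net()
x = torch.randn(2, 8)
relu_out = net(x, "relu")
sigmoid_out = net(x, "sigmoid")

activation_names = list(net.activations.keys())
print(activation_names)  # ['relu', 'sigmoid']
print(relu_out.shape, sigmoid_out.shape)
```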

4. Saving the model

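PyTorch provides built-in functions for saving and loading a network. One important detail: what gets saved is the model's parameters, not the whole model. For example, if we have a 3-layer multilayer perceptron, the architecture has to be specified separately. Because a model can contain arbitrary code, the model itself is hard to serialize. So to restore a model, we build the architecture in code and then load the parameters from disk.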

For example, the following saves the model's parameters to a file named mlp.params:

torch.save(net.state_dict(), 'mlp.params')

To restore the model, instantiate a fresh copy of the original model class and load the saved parameters into it:

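clone = Net()  # instantiate the model class (Net), not the instance net
clone.load_state_dict(torch.load('mlp.params'))
clone.eval()  # switch to evaluation mode (affects layers such as Dropout)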
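
The full save-and-restore round trip can be sketched as follows. The `MLP` class and its layer sizes are assumptions made for illustration, and the file is written to a temporary directory so the example cleans up after itself.

```python
import os
import tempfile

import torch
from torch import nn


class MLP(nn.Module):
    """A small multilayer perceptron; the sizes are illustrative."""

    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(20, 256)
        self.output = nn.Linear(256, 10)

    def forward(self, x):
        return self.output(torch.relu(self.hidden(x)))


net = MLP()
x = torch.randn(2, 20)
y = net(x)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mlp.params")
    torch.save(net.state_dict(), path)       # parameters only, not the class
    clone = MLP()                            # rebuild the architecture in code
    clone.load_state_dict(torch.load(path))  # then load the saved parameters

clone.eval()
y_clone = clone(x)
print(torch.allclose(y, y_clone))
```

The clone starts with different random weights, so matching outputs confirm that the parameters really were restored from disk.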