PyTorch Model Definition | Model Containers | Model Blocks | Modifying Models | Loading and Saving Models

幼稚的人呐

Last modified: 2022-08-22 18:01:12


First published: 2022-08-18 12:00:33

Article link: https://blog.csdn.net/liujiesxs/article/details/126396802



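I. PyTorch Model Containers

Building on nn.Module, a PyTorch model can be defined with three kinds of container: Sequential, ModuleList, and ModuleDict.

1.Sequential

nn.Sequential: wraps a group of network layers in a fixed order.

Its properties can be summarized as follows:

Ordered: the layers are chained strictly in the order in which they are added.

Built-in forward(): the built-in forward uses a for loop to run the layers one after another.

Below, Sequential is used to wrap a LeNet model.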

import torch.nn as nn


class LeNetSequential(nn.Module):
    def __init__(self, classes):
        super(LeNetSequential, self).__init__()
        # Wrap the convolution and pooling layers into the `features` sub-network
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

        # Wrap the fully connected layers into the `classifier` sub-network
        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),
        )

    # Forward pass
    def forward(self, x):
        x = self.features(x)
        # Flatten to (batch_size, 16*5*5) before the fully connected layers
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x
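The 16*5*5 input size of the first fully connected layer only works for a particular input resolution. The sketch below rebuilds just the features part of the model above and traces the shapes, assuming a batch of four 32x32 RGB images (the batch size and resolution are assumptions for illustration, not stated in the original).

```python
import torch
import torch.nn as nn

# Same feature extractor as in LeNetSequential above.
features = nn.Sequential(
    nn.Conv2d(3, 6, 5),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, 5),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

x = torch.randn(4, 3, 32, 32)       # assumed input: 4 RGB images of 32x32
feat = features(x)                  # 32 -> 28 -> 14 -> 10 -> 5 along each side
flat = feat.view(feat.size(0), -1)  # flatten everything except the batch axis

print(feat.shape)  # torch.Size([4, 16, 5, 5])
print(flat.shape)  # torch.Size([4, 400])
```

With a different input resolution the flattened size changes, and the in_features of the first nn.Linear would have to change with it.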

The source code of Sequential is:

	class Sequential(Module):
	    r"""A sequential container.
	    Modules will be added to it in the order they are passed in the constructor.
	    Alternatively, an ordered dict of modules can also be passed in.
	
	    To make it easier to understand, here is a small example::
	
	        # Example of using Sequential
	        model = nn.Sequential(
	                  nn.Conv2d(1,20,5),
	                  nn.ReLU(),
	                  nn.Conv2d(20,64,5),
	                  nn.ReLU()
	                )
	
	        # Example of using Sequential with OrderedDict
	        model = nn.Sequential(OrderedDict([
	                  ('conv1', nn.Conv2d(1,20,5)),
	                  ('relu1', nn.ReLU()),
	                  ('conv2', nn.Conv2d(20,64,5)),
	                  ('relu2', nn.ReLU())
	                ]))
	    """
	
	    def __init__(self, *args):
	        super(Sequential, self).__init__()
	        # Add the modules that were passed in to the Sequential.
	        # First check whether the argument is a single OrderedDict.
	        if len(args) == 1 and isinstance(args[0], OrderedDict):
	            for key, module in args[0].items():
	                self.add_module(key, module)
	        else:
	            # Otherwise, add the modules one by one, using their position as the name
	            for idx, module in enumerate(args):
	                self.add_module(str(idx), module)
	
	    def _get_item_by_idx(self, iterator, idx):
	        """Get the idx-th item of the iterator"""
	        size = len(self)
	        idx = operator.index(idx)
	        if not -size <= idx < size:
	            raise IndexError('index {} is out of range'.format(idx))
	        idx %= size
	        return next(islice(iterator, idx, None))
	
	    def __getitem__(self, idx):
	        if isinstance(idx, slice):
	            return self.__class__(OrderedDict(list(self._modules.items())[idx]))
	        else:
	            return self._get_item_by_idx(self._modules.values(), idx)
	
	    def __setitem__(self, idx, module):
	        key = self._get_item_by_idx(self._modules.keys(), idx)
	        return setattr(self, key, module)
	
	    def __delitem__(self, idx):
	        if isinstance(idx, slice):
	            for key in list(self._modules.keys())[idx]:
	                delattr(self, key)
	        else:
	            key = self._get_item_by_idx(self._modules.keys(), idx)
	            delattr(self, key)
	
	    def __len__(self):
	        return len(self._modules)
	
	    def __dir__(self):
	        keys = super(Sequential, self).__dir__()
	        keys = [key for key in keys if not key.isdigit()]
	        return keys
	
	    def forward(self, input):
	        # Loop over the modules stored in the Sequential
	        for module in self._modules.values():
	            input = module(input)
	        return input
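The __len__ and __getitem__ methods in the source above are what make a Sequential behave like a list of layers. A minimal sketch, reusing the small model from the docstring:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
)

n_layers = len(model)  # __len__: number of registered modules
first = model[0]       # __getitem__ with an int returns the layer itself
head = model[:2]       # __getitem__ with a slice returns a new Sequential
names = [name for name, _ in model.named_children()]  # auto-generated names

print(n_layers, type(first).__name__, type(head).__name__, names)
```

Because no OrderedDict was passed in, the layers are registered under the string names "0", "1", "2", and "3", which matches the enumerate branch of __init__.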

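The source shows that Sequential inherits from Module, so it also carries the ordered dictionaries that Module uses for bookkeeping, such as _modules, _parameters, and _buffers (eight in total in the PyTorch version quoted here). Printing an instance of the LeNetSequential model defined earlier, built with classes=2, gives: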

LeNetSequential(
  (features): Sequential(
    (0): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
    (4): ReLU()
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=400, out_features=120, bias=True)
    (1): ReLU()
    (2): Linear(in_features=120, out_features=84, bias=True)
    (3): ReLU()
    (4): Linear(in_features=84, out_features=2, bias=True)
  )
)

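In the printout above, the layers have no names and are identified only by index. In a deep network, tracking every layer by index quickly becomes impractical. Below, an ordered dictionary is used to give each layer a name.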

# OrderedDict is used to give each layer a name
from collections import OrderedDict

import torch.nn as nn


class LeNetSequentialOrderDict(nn.Module):
    def __init__(self, classes):
        super(LeNetSequentialOrderDict, self).__init__()

        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 6, 5),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=2, stride=2),

            'conv2': nn.Conv2d(6, 16, 5),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=2, stride=2),
        }))

        self.classifier = nn.Sequential(OrderedDict({
            'fc1': nn.Linear(16*5*5, 120),
            'relu3': nn.ReLU(),

            'fc2': nn.Linear(120, 84),
            'relu4': nn.ReLU(inplace=True),

            'fc3': nn.Linear(84, classes),
        }))

    def forward(self, x):
        x = self.features(x)
        # Flatten to (batch_size, 16*5*5) before the fully connected layers
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

The layers are now named, so each one can be accessed by name. Printing the model (again built with classes=2) gives:

	LeNetSequentialOrderDict(
	  (features): Sequential(
	    (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
	    (relu1): ReLU(inplace=True)
	    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
	    (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
	    (relu2): ReLU(inplace=True)
	    (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
	  )
	  (classifier): Sequential(
	    (fc1): Linear(in_features=400, out_features=120, bias=True)
	    (relu3): ReLU()
	    (fc2): Linear(in_features=120, out_features=84, bias=True)
	    (relu4): ReLU(inplace=True)
	    (fc3): Linear(in_features=84, out_features=2, bias=True)
	  )
	)
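Accessing a named layer can be sketched as follows. The snippet rebuilds only the first three layers of features from the model above, passing the OrderedDict a list of pairs as in the Sequential docstring; the variable names are chosen here for illustration.

```python
from collections import OrderedDict

import torch.nn as nn

features = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(3, 6, 5)),
    ('relu1', nn.ReLU(inplace=True)),
    ('pool1', nn.MaxPool2d(kernel_size=2, stride=2)),
]))

conv_by_name = features.conv1  # attribute access by layer name
conv_by_index = features[0]    # positional indexing still works
names = [name for name, _ in features.named_children()]

print(conv_by_name is conv_by_index)  # True
print(names)                          # ['conv1', 'relu1', 'pool1']
```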

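As shown above, defining a model with Sequential is simple and readable, and a block defined purely as a Sequential needs no hand-written forward, because the execution order is fixed by the order of the layers. The cost is flexibility: Sequential is a poor fit when, for example, an external input has to be injected in the middle of the model. Choose according to the actual requirements.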
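To illustrate the limitation, here is a minimal sketch of a model that takes a second input halfway through. The class name TwoInputNet, the layer sizes, and the concatenation step are all hypothetical choices made for this example; the point is only that the extra argument forces a hand-written forward, with Sequential still usable for the fixed parts before and after.

```python
import torch
import torch.nn as nn


class TwoInputNet(nn.Module):  # hypothetical example model
    def __init__(self):
        super().__init__()
        # Fixed sub-sequences can still be wrapped in Sequential
        self.stem = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(16 + 4, 8), nn.ReLU(), nn.Linear(8, 2))

    def forward(self, x, extra):
        h = self.stem(x)
        # The external input joins here, which a single Sequential cannot express
        h = torch.cat([h, extra], dim=1)
        return self.head(h)


net = TwoInputNet()
out = net(torch.randn(5, 10), torch.randn(5, 4))
print(out.shape)  # torch.Size([5, 2])
```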

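2.ModuleList

ModuleList takes a list of submodules (layers, each an instance of nn.Module) as input and supports append and extend, much like a Python list. The parameters of those submodules are registered with the network automatically. It wraps several layers the way a Python list does and can be iterated in the same way, so the layers can be called in a loop. Its main methods are:

append(): adds a layer to the end of the ModuleList

extend(): appends all the layers from another ModuleList or any other iterable of modules

insert(): inserts a layer at a given position in the ModuleList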
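The three methods can be tried out on a small list. This is a sketch only; the particular layers are arbitrary and chosen for illustration.

```python
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10)])
layers.append(nn.ReLU())                         # add one layer at the end
layers.extend([nn.Linear(10, 5), nn.Sigmoid()])  # add several layers at the end
layers.insert(0, nn.Flatten())                   # put a layer at position 0

order = [type(m).__name__ for m in layers]
print(order)  # ['Flatten', 'Linear', 'ReLU', 'Linear', 'Sigmoid']
```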

Below, nn.ModuleList is used to build a fully connected network with 20 layers.

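import torch.nn as nn


class ModuleList(nn.Module):
    def __init__(self):
        super(ModuleList, self).__init__()
        # 20 identical fully connected layers, created in one line
        self.linears = nn.ModuleList([nn.Linear(10, 10) for _ in range(20)])

    def forward(self, x):
        # The call order is decided here, not by the container itself
        for linear in self.linears:
            x = linear(x)
        return x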

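Note in particular that nn.ModuleList does not define a network; it only stores modules together. The order of the elements in a ModuleList does not by itself determine the order in which they run: the model is fully defined only once forward specifies how the layers are called. In practice, a for loop is usually enough.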
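What ModuleList adds over a plain Python list is registration: only layers held in a ModuleList show up in parameters(). The sketch below compares the two; the class names PlainListNet and ModuleListNet are hypothetical and exist only for this comparison.

```python
import torch.nn as nn


class PlainListNet(nn.Module):  # hypothetical: layers kept in a plain list
    def __init__(self):
        super().__init__()
        self.linears = [nn.Linear(10, 10) for _ in range(3)]


class ModuleListNet(nn.Module):  # hypothetical: layers kept in a ModuleList
    def __init__(self):
        super().__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])


n_plain = len(list(PlainListNet().parameters()))
n_registered = len(list(ModuleListNet().parameters()))
print(n_plain, n_registered)  # 0 6
```

With the plain list, an optimizer built from parameters() would see no weights at all, so the layers would never be trained.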

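3.ModuleDict

ModuleDict plays a similar role to ModuleList, but makes it easier to give the layers names. It wraps several layers the way a Python dict does, and layers are looked up by key. Its main methods are:

clear(): removes all entries from the ModuleDict

items(): returns an iterable of key-value pairs

keys(): returns an iterable of the keys

values(): returns an iterable of the values (the modules)

pop(): removes the given key from the ModuleDict and returns the corresponding module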
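A short sketch of these methods on a small dictionary of activation functions follows. The keys and layers are arbitrary; results are compared in sorted order here because the iteration order of a ModuleDict built from a plain dict has differed between PyTorch versions.

```python
import torch.nn as nn

acts = nn.ModuleDict({'relu': nn.ReLU(), 'prelu': nn.PReLU()})

keys_before = sorted(acts.keys())
removed = acts.pop('prelu')  # removes the key and returns its module
acts['tanh'] = nn.Tanh()     # item assignment adds a new named layer
pairs = sorted((k, type(m).__name__) for k, m in acts.items())
n_values = len(list(acts.values()))

acts.clear()                 # empties the container
n_after_clear = len(acts)

print(keys_before, type(removed).__name__, pairs, n_values, n_after_clear)
```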

import torch
import torch.nn as nn


class ModuleDict(nn.Module):
    def __init__(self):
        super(ModuleDict, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })

        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    # `choice` and `act` are keys that select which layers to run
    def forward(self, x, choice, act):
        x = self.choices[choice](x)
        x = self.activations[act](x)
        return x


net = ModuleDict()
fake_img = torch.randn((4, 10, 32, 32))
output = net(fake_img, 'conv', 'relu')

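4.Summary of the three containers

nn.Sequential: ordered; the layers run strictly in sequence; often used to build blocks
nn.ModuleList: iterable; often used to build many repeated layers with a for loop
nn.ModuleDict: indexed by key; often used when the layers to run are selectable

Sequential suits quick experiments: once the layers are known, they can simply be listed, with no need to write both __init__ and forward.

ModuleList and ModuleDict are convenient when an identical layer has to appear many times: a single line, such as a list comprehension, can stand in for many lines.

When information from earlier layers is needed, as in the residual connections of ResNets where the output of the current layer is merged with the output of an earlier one, ModuleList or ModuleDict is usually the more convenient choice.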
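The residual case can be sketched with a ModuleList and a hand-written loop. This is a simplified illustration, not an actual ResNet: the class name TinyResidualNet, the use of linear layers, and the width and depth values are all assumptions made for this example.

```python
import torch
import torch.nn as nn


class TinyResidualNet(nn.Module):  # hypothetical, simplified residual model
    def __init__(self, width=10, depth=3):
        super().__init__()
        # One line creates `depth` identical layers
        self.blocks = nn.ModuleList([nn.Linear(width, width) for _ in range(depth)])
        self.act = nn.ReLU()

    def forward(self, x):
        for block in self.blocks:
            # Skip connection: the block's input is added back to its output
            x = x + self.act(block(x))
        return x


net = TinyResidualNet()
out = net(torch.randn(4, 10))
print(out.shape)  # torch.Size([4, 10])
```

Because forward is written by hand, the input of each block stays available for the addition, which a plain Sequential over the same layers would not allow.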