Building Networks and Initializing Parameters with PyTorch

xLyons

Category: PyTorch study notes. Tags: deep learning, pytorch, neural networks

Article link: https://blog.csdn.net/qq_38124658/article/details/109692353

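Preface: About torch.nn

torch.nn contains a large number of functions and classes, such as nn.Linear() and nn.ReLU(). For a full list of the building blocks it provides, see the torch.nn documentation.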

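1. Building models with nn.Module

Method 1: Subclass torch.nn.Module directly

torch.nn provides the model-construction class nn.Module, which is the base class of every neural network module in PyTorch. You build your own layers and networks by subclassing it. The class exposes dozens of methods (the author counted 48; the exact number varies by PyTorch version). For details, see the torch.nn.Module documentation and the related Zhihu article.

A subclass must implement forward(); calling a module that lacks it raises NotImplementedError. In practice a subclass also overrides __init__() (calling the parent __init__() first) to define its layers, while forward() wires those layers together. Because nn.Module defines __call__(), passing data to the finished network, as in net(X), runs forward() automatically.

A simple example: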

import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(784, 256)
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)
    
    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)
        
X = torch.rand(2, 784)
net = MLP()
print(net)
net(X)
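
The call mechanism described above can be checked directly. The following is a minimal sketch that is not part of the original post; the class names `TinyMLP` and `NoForward` are hypothetical and exist only for illustration.

```python
import torch
from torch import nn

class TinyMLP(nn.Module):  # hypothetical name, mirrors the MLP above
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)

    def forward(self, x):
        return self.output(self.act(self.hidden(x)))

class NoForward(nn.Module):  # hypothetical: deliberately omits forward()
    pass

X = torch.rand(2, 784)
net = TinyMLP()
out_call = net(X)             # goes through nn.Module.__call__
out_forward = net.forward(X)  # direct call, skips any registered hooks
same = torch.equal(out_call, out_forward)
print(out_call.shape, same)

# A module without forward() can be constructed but not called.
try:
    NoForward()(X)
    missing_forward_raises = False
except NotImplementedError:
    missing_forward_raises = True
print(missing_forward_raises)
```

Calling `net(X)` is the recommended form, because `__call__` also runs any registered hooks before and after `forward()`.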

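Method 2: Add layers with nn.Module.add_module

The previous example builds the network by assigning each layer as an attribute in __init__(). The nn.Module base class also provides add_module(), which registers a layer under a given name; registered layers are stored in self._modules. Note how forward() is written here: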

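"""
    Purpose: build a container that holds modules.
    Syntax notes:
        1. OrderedDict is a dictionary that remembers insertion order (plain dicts also do since Python 3.7);
        2. enumerate() wraps an iterable (such as a list, tuple, or string) and yields (index, item) pairs.
"""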
from collections import OrderedDict

class MySequential(nn.Module):

    def __init__(self, *args):
        super(MySequential, self).__init__()
        
        # If a single OrderedDict is passed in,
        # register each module under its key
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        # Otherwise register the positional arguments under their indices
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)
        
    def forward(self, input):
        for module in self._modules.values():
            # Each layer takes an input and returns an output;
            # the output of the current layer is the input of the next
            input = module(input)
        return input

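The two construction styles below produce the same network structure (the numeric outputs differ, because each network is randomly initialized on its own):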

X = torch.rand(2, 784)
# Pass an OrderedDict
Dict = OrderedDict([
    ('0', nn.Linear(784, 256)), 
    ('1', nn.ReLU()), 
    ('2', nn.Linear(256, 10))
])
net1 = MySequential(Dict)
print(net1)
net1(X)

print('----------------------------------------')
# Pass the modules as positional arguments
net2 = MySequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10), 
        )
print(net2)
net2(X)
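
To see what add_module() actually registers, the sketch below inspects the child names and parameter names of a small container. It is an illustration under assumptions, not the author's code: the class name `Stack` and the `layer{idx}` naming scheme are hypothetical.

```python
import torch
from torch import nn

class Stack(nn.Module):  # hypothetical name
    def __init__(self, *layers):
        super().__init__()
        for idx, layer in enumerate(layers):
            # register each layer under a readable name
            self.add_module(f"layer{idx}", layer)

    def forward(self, x):
        # children() yields the registered modules in registration order
        for layer in self.children():
            x = layer(x)
        return x

net = Stack(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
child_names = [name for name, _ in net.named_children()]
param_names = [name for name, _ in net.named_parameters()]
print(child_names)
print(param_names)
print(net.layer0)  # a registered module is also reachable as an attribute
out = net(torch.rand(2, 784))
```

Using children() in forward() avoids touching the private `_modules` attribute while giving the same iteration order.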

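Method 3: Add layers in order with nn.Sequential

nn.Sequential is also a container for layers, but it runs them in the order given, so the output size of each layer must match the input size expected by the next. For example: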

import torch
import torch.nn as nn
import torch.nn.functional as F
class net_seq(nn.Module):
    def __init__(self):
        super(net_seq, self).__init__()
        self.seq = nn.Sequential(
                        nn.Conv2d(1,20,5),
                        nn.ReLU(),
                        nn.Conv2d(20,64,5),
                        nn.ReLU()
                        )      
    def forward(self, x):
        return self.seq(x)
    
net = net_seq()
print(net)

You can also use an OrderedDict to give each module a name:

from collections import OrderedDict

class net_seq(nn.Module):
    def __init__(self):
        super(net_seq, self).__init__()
        self.seq = nn.Sequential(OrderedDict([
                            ('conv1', nn.Conv2d(1,20,5)),
                            ('relu1', nn.ReLU()),
                            ('conv2', nn.Conv2d(20,64,5)),
                            ('relu2', nn.ReLU())
                            ]))
    def forward(self, x):
        return self.seq(x)
    
net = net_seq()
print(net)
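
The size-matching requirement can be made concrete by tracing tensor shapes through the layers. This is a rough sketch; the 28x28 single-channel input is an assumption chosen for illustration, since the original post does not state an input size.

```python
from collections import OrderedDict
import torch
from torch import nn

seq = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU()),
]))

x = torch.rand(1, 1, 28, 28)  # assumed input: one 28x28 single-channel image
shapes = {}
for name, layer in seq.named_children():
    x = layer(x)              # output of this layer feeds the next one
    shapes[name] = tuple(x.shape)
    print(name, shapes[name])

first_conv = seq.conv1  # access by name
same_layer = seq[0]     # access by position
```

Each 5x5 convolution without padding shrinks the spatial size by 4, and the 20 output channels of conv1 match the 20 input channels that conv2 expects.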

Method 4: Build a network with nn.ModuleList

ModuleList takes a list of submodules as input and supports append and extend, like a Python list:

net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()])
net.append(nn.Linear(256, 10))  # append, like a Python list

print(net[-1])  # index access, like a Python list
print(net)
# net(torch.zeros(1, 784))  # raises NotImplementedError

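ModuleList is only a list that stores modules; it does not define how they are connected or in what order they are executed.
ModuleList does not implement forward(), so feeding input to net directly raises an error. You have to connect the layers yourself, which makes it more customizable than Sequential.
Unlike a plain Python list, ModuleList registers the parameters of its modules with the network automatically.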
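
The difference between a plain Python list and ModuleList shows up when counting registered parameters. The sketch below is illustrative only; the class names `WithPlainList` and `WithModuleList` are hypothetical.

```python
import torch
from torch import nn

class WithPlainList(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        # a plain list: the layers are NOT registered as submodules
        self.layers = [nn.Linear(784, 256), nn.Linear(256, 10)]

class WithModuleList(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(784, 256), nn.Linear(256, 10)])

    def forward(self, x):
        # the connection between the layers is written by hand
        x = torch.relu(self.layers[0](x))
        return self.layers[1](x)

n_plain = len(list(WithPlainList().parameters()))
n_registered = len(list(WithModuleList().parameters()))
print(n_plain, n_registered)
out = WithModuleList()(torch.rand(2, 784))
```

With the plain list, an optimizer built from parameters() would receive nothing to train, which is the practical reason to prefer ModuleList.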

nn.Sequential vs. nn.ModuleList

nn.Sequential implements forward(), as the examples above show. With ModuleList you must implement forward() yourself inside the class:
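```
def forward(self, x):
    # self.modlist is an nn.ModuleList defined in __init__()
    for m in self.modlist:
        x = m(x)
    return x  # return only after every module has been applied
```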
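Building a network entirely out of nn.Sequential works, but you give up some flexibility because the forward pass cannot be customized.
nn.Sequential can name its layers through an OrderedDict.
When a network contains many similar or repeated layers, it is common to create them in a for loop instead of writing them out line by line: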
```
linears = [nn.Linear(10, 10) for i in range(5)]
```
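Note that each nn.Linear(10, 10) call here creates a separate layer with its own independently initialized parameters; layers share parameters only when the same instance is reused, as in [layer] * 5. The actual pitfall is that a plain Python list does not register its layers with the enclosing module, so wrap the list in nn.ModuleList.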
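
Whether layers created in a loop share parameters can be checked directly. The following sketch compares a list comprehension with a repeated single instance; the variable names are hypothetical and the comparison is an illustration, not part of the original post.

```python
from torch import nn

# Five separate constructor calls: five independent sets of parameters.
separate = nn.ModuleList([nn.Linear(10, 10) for _ in range(5)])

# One instance repeated five times: every entry is the same object.
one_layer = nn.Linear(10, 10)
shared = nn.ModuleList([one_layer] * 5)

distinct_objects = separate[0].weight is not separate[1].weight
shared_objects = shared[0].weight is shared[1].weight

# parameters() skips duplicates, so the shared list reports
# a single weight and a single bias.
n_separate = len(list(separate.parameters()))
n_shared = len(list(shared.parameters()))
print(distinct_objects, shared_objects, n_separate, n_shared)
```

Repeating one instance is a deliberate way to tie weights; a comprehension that calls the constructor each time does not tie them.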

See also: Zhihu

Method 5: Build a network with ModuleDict

As far as I can tell, the main difference from ModuleList is that ModuleDict lets you name each layer yourself:

net = nn.ModuleDict({
    'linear': nn.Linear(784, 256),
    'act': nn.ReLU(),
})
net['output'] = nn.Linear(256, 10)  # add a layer
print(net['linear'])  # access by key
print(net.output)     # access as an attribute
print(net)
# net(torch.zeros(1, 784))  # raises NotImplementedError
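
One place where named access is convenient is choosing a layer by key inside forward(). The sketch below shows that pattern under assumptions: the class name `Switchable` and the `act` argument are hypothetical and do not come from the original post.

```python
import torch
from torch import nn

class Switchable(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)
        # activations stored by name; all of them are registered submodules
        self.acts = nn.ModuleDict({'relu': nn.ReLU(), 'tanh': nn.Tanh()})
        self.output = nn.Linear(256, 10)

    def forward(self, x, act='relu'):
        # pick the activation by key at call time
        return self.output(self.acts[act](self.hidden(x)))

net = Switchable()
X = torch.rand(2, 784)
out_relu = net(X, act='relu')
out_tanh = net(X, act='tanh')
keys = list(net.acts.keys())
print(keys, out_relu.shape, out_tanh.shape)
```

As with ModuleList, the container itself has no forward(); the enclosing class decides how the stored layers are used.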

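Additional notes

If requires_grad is False for a parameter of some layer, that parameter receives no gradient and is not updated during training.
torch.nn.ReLU() and torch.nn.functional.relu() are essentially the same; the source code shows that nn.ReLU() is implemented by calling nn.functional.relu().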
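
Both notes can be verified with a short experiment. This is a minimal sketch with an assumed toy network, a squared-output loss, and plain SGD, all chosen only for illustration.

```python
import torch
from torch import nn
import torch.nn.functional as F

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
for p in net[0].parameters():
    p.requires_grad = False  # freeze the first layer

frozen_before = net[0].weight.clone()
bias_before = net[2].bias.clone()

# hand only the trainable parameters to the optimizer
trainable = [p for p in net.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)
loss = net(torch.rand(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(net[0].weight, frozen_before)
frozen_has_no_grad = net[0].weight.grad is None
trainable_changed = not torch.equal(net[2].bias, bias_before)

# The module form and the functional form compute the same result.
x = torch.randn(5)
relu_same = torch.equal(nn.ReLU()(x), F.relu(x))
print(frozen_unchanged, frozen_has_no_grad, trainable_changed, relu_same)
```

The module form nn.ReLU() is convenient inside containers such as nn.Sequential, while the functional form is handy inside a hand-written forward().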

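2. Initializing model parameters

2.1 Accessing parameters

Method 1: model.state_dict()

model.state_dict() returns an ordered dictionary that maps each parameter name to its tensor (registered buffers are included as well).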

# Once the layers of a network are defined, you can iterate over its parameters:
for params, value in net.state_dict().items(): 
    print(f'params:{params} \n value.size:{value.size()}')
    
    
Output:
params:conv1.weight 
 value.size:torch.Size([32, 3, 3, 3])
params:conv1.bias 
 value.size:torch.Size([32])
params:conv2.weight 
 value.size:torch.Size([32, 3, 3, 3])
params:conv2.bias 
 value.size:torch.Size([32])
params:dense1.weight 
 value.size:torch.Size([128, 288])
params:dense1.bias 
 value.size:torch.Size([128])
params:dense2.weight 
 value.size:torch.Size([10, 128])
params:dense2.bias 
 value.size:torch.Size([10])
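
Because state_dict() is a plain name-to-tensor mapping, it is also the usual way to copy weights between two networks with the same architecture. The sketch below assumes a small toy network for illustration; it is not the network that produced the output above.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
state = net.state_dict()
keys = list(state.keys())
print(keys)

# Copy the weights into a second network with the same architecture.
clone = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
clone.load_state_dict(state)

x = torch.rand(2, 4)
same_output = torch.equal(net(x), clone(x))
print(same_output)
```

The ReLU layer contributes no entries, because it has neither parameters nor buffers.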

Method 2: model.named_parameters()

model.named_parameters() yields each parameter tensor together with its name. Unlike model.state_dict(), which returns a dictionary, it returns a generator:

print(type(net.named_parameters()))
for name, param in net.named_parameters():
    print(name, param.size())
    
Output:
<class 'generator'>
0.weight torch.Size([3, 4])
0.bias torch.Size([3])
2.weight torch.Size([1, 3])
2.bias torch.Size([1])
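
Another difference between the two accessors is that state_dict() also contains buffers, which named_parameters() leaves out. The sketch below uses a BatchNorm1d layer, chosen here only because it carries buffers.

```python
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.BatchNorm1d(3))

param_names = [name for name, _ in net.named_parameters()]
state_names = list(net.state_dict().keys())

# entries that live in state_dict() but are not trainable parameters
buffers_only = [n for n in state_names if n not in param_names]
print(param_names)
print(buffers_only)
```

The running statistics of batch normalization are saved with the model but are not passed to the optimizer.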

Method 3: model.parameters()

model.parameters() also returns an iterator, but it yields only the parameter tensors, without their names. It is implemented on top of model.named_parameters(), as its definition shows:

def parameters(self, recurse: bool = True) -> Iterator[Parameter]:
    r"""Returns an iterator over module parameters.

        This is typically passed to an optimizer.

        Args:
            recurse (bool): if True, then yields parameters of this module
                and all submodules. Otherwise, yields only parameters that
                are direct members of this module.

        Yields:
            Parameter: module parameter

        Example::

            >>> for param in model.parameters():
            >>>     print(type(param), param.size())
            <class 'torch.Tensor'> (20L,)
            <class 'torch.Tensor'> (20L, 1L, 5L, 5L)

        """
    for name, param in self.named_parameters(recurse=recurse):
        yield param
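
Two common uses of parameters() are counting trainable values and handing them to an optimizer. The following sketch assumes the 784-256-10 network used earlier and an SGD optimizer with an arbitrary learning rate.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

n_tensors = len(list(net.parameters()))
n_values = sum(p.numel() for p in net.parameters())
print(n_tensors, n_values)

# parameters() is what an optimizer typically receives
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)
n_in_optimizer = len(optimizer.param_groups[0]['params'])

# recurse=False yields only parameters owned directly by this module;
# the Sequential container itself owns none.
direct_only = list(net.parameters(recurse=False))
print(n_in_optimizer, len(direct_only))
```

The total is 784*256 + 256 + 256*10 + 10 = 203530 values spread over four tensors.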

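2.2 Initializing model parameters

PyTorch's nn.init module provides many preset initialization schemes; see the nn.init documentation. The example below draws every weight from a normal distribution with mean 0 and standard deviation 0.01: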

from torch.nn import init

for name, param in net.named_parameters():
    if 'weight' in name:
        init.normal_(param, mean=0, std=0.01)  # in-place; biases are left untouched
        print(name, param.data)

Fill with a constant using nn.init.constant_(tensor, val):

w = torch.empty(3, 5)
nn.init.constant_(w, 0.3)
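
Initialization can also be applied per layer type with nn.Module.apply(), which calls a function on every submodule. The sketch below is one possible arrangement, not the author's method: the helper name `init_weights` is hypothetical, and Xavier uniform initialization is swapped in for illustration.

```python
import torch
from torch import nn
from torch.nn import init

def init_weights(module):  # hypothetical helper
    # only touch Linear layers; other modules are left as they are
    if isinstance(module, nn.Linear):
        init.xavier_uniform_(module.weight)
        init.zeros_(module.bias)

net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
net.apply(init_weights)  # visits every submodule, then net itself

bias_all_zero = all(
    bool((m.bias == 0).all()) for m in net if isinstance(m, nn.Linear)
)
# Xavier uniform samples from [-b, b] with b = sqrt(6 / (fan_in + fan_out))
bound = (6.0 / (784 + 256)) ** 0.5
within_bound = bool(net[0].weight.abs().max() <= bound + 1e-6)
print(bias_all_zero, within_bound)
```

Selecting by module type instead of by parameter name keeps the initialization logic independent of how the layers happen to be named.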
