Dive into Deep Learning (3): Deep Learning Computation

泡泡鱼ovo

Last modified 2024-09-12 22:45:02

Category: Deep Learning. Tags: deep learning, artificial intelligence

First published 2024-09-12 22:39:17

Article link: https://blog.csdn.net/Willowii/article/details/142184408

I. Model Construction

1. Constructing a model by subclassing the `Module` class

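import torch
from torch import nn

class MLP(nn.Module):
    # Declare the layers that hold model parameters; here, two fully connected layers
    def __init__(self, **kwargs):
        # Call the constructor of the parent class nn.Module to perform the
        # necessary initialization; any extra keyword arguments are forwarded to it
        super(MLP, self).__init__(**kwargs)
        self.hidden = nn.Linear(784, 256)  # hidden layer
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)  # output layer

    # Define the forward computation: how to compute the model output from the input x
    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)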
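
As a usage illustration, the sketch below instantiates such an MLP and runs one forward pass. The input `X` (a random batch of 2 flattened 28x28 inputs) is an assumption made for this example only. Calling `net(X)` goes through `nn.Module.__call__`, which in turn invokes `forward`.

```python
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(784, 256)  # hidden layer
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)   # output layer

    def forward(self, x):
        return self.output(self.act(self.hidden(x)))

X = torch.rand(2, 784)  # hypothetical batch: 2 samples, 784 features each
net = MLP()
out = net(X)            # calling the module runs forward(X)
print(out.shape)        # torch.Size([2, 10])
```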

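2. The `Sequential` class, a subclass of `Module`

The Sequential class takes an ordered series of Module subclass instances (or an OrderedDict of them) and chains them together; instances can also be added one at a time with add_module. The model's forward computation applies these instances one by one in the order they were added.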

X = torch.rand(2, 784)  # sample batch of 2 flattened inputs
net = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
        )
print(net)
net(X)
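
To illustrate what such a container does internally, here is a minimal sketch of a hand-written `MySequential` class. It is an assumption about how a class of that name could be implemented, not the actual source of `nn.Sequential`: submodules are registered with `add_module`, and `forward` applies them in insertion order.

```python
from collections import OrderedDict

import torch
from torch import nn

class MySequential(nn.Module):
    def __init__(self, *args):
        super().__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            # An OrderedDict supplies explicit names for the submodules
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            # Otherwise name the submodules "0", "1", "2", ...
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)

    def forward(self, x):
        # self._modules preserves the order in which modules were added
        for module in self._modules.values():
            x = module(x)
        return x

net = MySequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
out = net(torch.rand(2, 784))  # hypothetical input batch
print(out.shape)               # torch.Size([2, 10])
```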

3. The `ModuleList` class

① Definition

ModuleList is a container class in PyTorch's torch.nn module, designed to hold multiple submodules (that is, network layers).

net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()])
net.append(nn.Linear(256, 10))  # append, as with a Python list
print(net[-1])  # index access, as with a Python list
print(net)

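② Differences between `ModuleList` and a plain Python list

Module registration: every submodule in a ModuleList is registered as part of the model, so PyTorch automatically includes its parameters when training and saving the model. Modules placed in a plain Python list are not registered.
Parameter tracking: with a ModuleList, model.parameters() returns the parameters of every module in the list. With a plain list, the parameters of those layers are not managed automatically.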

(1) `ModuleList`

class Module_ModuleList(nn.Module):
    def __init__(self):
        super(Module_ModuleList, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10)])

(2) Python list

class Module_List(nn.Module):
    def __init__(self):
        super(Module_List, self).__init__()
        self.linears = [nn.Linear(10, 10)]

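Comparing the parameters of the two models shows that nn.ModuleList([nn.Linear(10, 10)]) automatically registers the module and tracks its parameters, whereas the parameters of a layer held in the plain list [nn.Linear(10, 10)] are not managed automatically.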
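
The comparison can be reproduced with a short sketch that lists the registered parameter names of both classes. The variable names `net1`, `net2`, `names1`, and `names2` are introduced here for illustration only.

```python
import torch
from torch import nn

class Module_ModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10)])

class Module_List(nn.Module):
    def __init__(self):
        super().__init__()
        self.linears = [nn.Linear(10, 10)]  # plain list: not registered

net1 = Module_ModuleList()
net2 = Module_List()

names1 = [name for name, _ in net1.named_parameters()]
names2 = [name for name, _ in net2.named_parameters()]
print(names1)  # ['linears.0.weight', 'linears.0.bias']
print(names2)  # []
```

An optimizer built from `net2.parameters()` would therefore receive no parameters at all, which is the practical consequence of using a plain list.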

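4. The `ModuleDict` class

ModuleDict is a container class in PyTorch's torch.nn module that holds multiple submodules and organizes them as a dictionary. Unlike a plain Python dict, the submodules in a ModuleDict are automatically registered as part of the model, so PyTorch can track, save, and load these modules and their parameters.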

net = nn.ModuleDict({
    'linear': nn.Linear(784, 256),
    'act': nn.ReLU(),
})
net['output'] = nn.Linear(256, 10)  # add an entry
print(net['linear'])  # access by key
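
A ModuleDict only stores modules; it does not define a forward computation of its own. One possible way to use it, sketched below under the assumption of a wrapper class with the hypothetical name `DictMLP`, is to look the layers up by key inside `forward`.

```python
import torch
from torch import nn

class DictMLP(nn.Module):  # hypothetical wrapper, for illustration only
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleDict({
            'linear': nn.Linear(784, 256),
            'act': nn.ReLU(),
            'output': nn.Linear(256, 10),
        })

    def forward(self, x):
        # Look up each layer by its key and apply it in the desired order
        x = self.layers['linear'](x)
        x = self.layers['act'](x)
        return self.layers['output'](x)

net = DictMLP()
out = net(torch.rand(2, 784))  # hypothetical input batch
print(out.shape)               # torch.Size([2, 10])
print([name for name, _ in net.named_parameters()])
```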

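II. Accessing, Initializing, and Sharing Model Parameters

The init module (torch.nn.init) contains a variety of methods for initializing model parameters.

1. Accessing model parameters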

①`net.named_parameters()`

net.named_parameters(): a PyTorch method that returns an iterator over the names and values of all parameters registered in the model (weights and biases).

print(type(net.named_parameters()))
for name, param in net.named_parameters():
    print(name, param.size())
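
For a concrete picture of the output, the sketch below runs the same loop on a small `nn.Sequential` network; the layer sizes are assumptions chosen for this example. Parameter names are prefixed with the index of the layer, and the parameter-free `ReLU` layer contributes nothing.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

print(type(net.named_parameters()))  # <class 'generator'>
for name, param in net.named_parameters():
    print(name, param.size())
# 0.weight torch.Size([3, 4])
# 0.bias torch.Size([3])
# 2.weight torch.Size([1, 3])
# 2.bias torch.Size([1])

names = [name for name, _ in net.named_parameters()]
```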

② `nn.Parameter`

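nn.Parameter: defines a parameter that can be optimized (that is, trained by gradient descent or a similar algorithm). When an nn.Parameter is assigned as an attribute of an nn.Module, it is automatically registered in the module's parameter list, so it is included when the model's parameters are optimized. A plain tensor attribute, such as weight2 below, is not registered.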

class MyModel(nn.Module):
    def __init__(self, **kwargs):
        super(MyModel, self).__init__(**kwargs)
        self.weight1 = nn.Parameter(torch.rand(20, 20))
        self.weight2 = torch.rand(20, 20)
    def forward(self, x):
        pass

The gradient of a freshly initialized weight is None; it is only populated once backpropagation runs during training.
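
Both points can be checked with a short sketch. The `forward` method here (a matrix product with `weight1`) is an assumption added so that a backward pass can be run; the class above leaves `forward` empty.

```python
import torch
from torch import nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight1 = nn.Parameter(torch.rand(20, 20))  # registered
        self.weight2 = torch.rand(20, 20)                # plain tensor, not registered

    def forward(self, x):
        # Assumed forward pass, for illustration only
        return torch.mm(x, self.weight1)

model = MyModel()
names = [name for name, _ in model.named_parameters()]
print(names)                     # ['weight1']

grad_before = model.weight1.grad
print(grad_before)               # None: no backward pass has run yet

model(torch.ones(1, 20)).sum().backward()
print(model.weight1.grad.shape)  # torch.Size([20, 20])
```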

③ Accessing parameter values and gradients

Use param.data and param.grad to access and modify a parameter's value and gradient.

for name, param in net.named_parameters():
    if 'weight' in name:
        init.normal_(param, mean=0, std=0.01)
        print(name, param.data)
        print(name, param.grad)

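2. Initializing model parameters

① Initialization with the methods in init

The two lines below perform normal-distribution initialization and constant initialization, respectively.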

init.normal_(param, mean=0, std=0.01)
init.constant_(param, val=0)

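② Custom initialization

Custom initialization is performed inside with torch.no_grad(), which temporarily disables gradient tracking. This is useful because initialization should not be recorded in the computation graph, and in-place modification of a parameter that requires gradients is only allowed while tracking is disabled.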

def init_weight_(tensor):
    with torch.no_grad():
        tensor.uniform_(-10, 10)
        tensor *= (tensor.abs() >= 5).float()
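
As a usage sketch, the function can be applied to the weights of a small network. The network and its layer sizes are assumptions for this example. After the call, every weight is either exactly 0 or has a magnitude of at least 5, because entries with a smaller magnitude are multiplied by 0.

```python
import torch
from torch import nn

def init_weight_(tensor):
    with torch.no_grad():
        tensor.uniform_(-10, 10)                        # sample uniformly from [-10, 10)
        tensor *= (tensor.abs() >= 5).float()           # zero out entries with |value| < 5

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))  # hypothetical network
for name, param in net.named_parameters():
    if 'weight' in name:
        init_weight_(param)

w = net[0].weight.data
ok = bool(((w == 0) | (w.abs() >= 5)).all())
print(w)
print(ok)
```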

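3. Sharing model parameters

When different layers refer to the same instance, they share the same weights. Initializing or updating the parameters of one of these layers therefore affects all of them.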

linear = nn.Linear(1, 1, bias=False)
net = nn.Sequential(linear, linear) 
print(net)
for name, param in net.named_parameters():
    init.constant_(param, val=3)
    print(name, param.data)
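
Because the two entries are the same object, their gradient is also shared and accumulates across both uses during backpropagation. A minimal sketch, assuming an all-ones input `x`: with weight `w = 3`, the output is `y = w * (w * x) = 9`, and `dy/dw = 2 * w * x = 6`.

```python
import torch
from torch import nn

linear = nn.Linear(1, 1, bias=False)
net = nn.Sequential(linear, linear)
nn.init.constant_(linear.weight, val=3)

print(net[0] is net[1])             # True: both entries are the same module
print(len(list(net.parameters())))  # 1: the shared weight is listed once

x = torch.ones(1, 1)                # hypothetical input
y = net(x).sum()
y.backward()
print(y.item())                     # 9.0
print(net[0].weight.grad)           # tensor([[6.]])
```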

III. Custom Layers

1. Custom layers without model parameters

class CenteredLayer(nn.Module):
    def __init__(self, **kwargs):
        super(CenteredLayer, self).__init__(**kwargs)
    def forward(self, x):
        return x - x.mean()
layer = CenteredLayer()
layer(torch.tensor([1, 2, 3, 4, 5], dtype=torch.float))
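
A custom layer like this can be combined with built-in layers in the usual way. The sketch below, with assumed layer sizes and a random input, places it after a linear layer; the mean of the output should then be 0 up to floating-point error.

```python
import torch
from torch import nn

class CenteredLayer(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return x - x.mean()  # subtract the mean over all elements

net = nn.Sequential(nn.Linear(8, 128), CenteredLayer())
y = net(torch.rand(4, 8))    # hypothetical input batch
print(y.mean().item())       # very close to 0
```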

2. Custom layers with model parameters

class MyListDense(nn.Module):
    def __init__(self):
        super(MyListDense, self).__init__()
        self.params = nn.ParameterList([nn.Parameter(torch.randn(4, 4)) for i in range(3)])
        self.params.append(nn.Parameter(torch.randn(4, 1)))

    def forward(self, x):
        for i in range(len(self.params)):
            x = torch.mm(x, self.params[i])
        return x
net = MyListDense()
print(net)
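
To see the layer in action, the sketch below feeds it a sample input. The input shape `(1, 4)` is an assumption chosen to match the first `4 x 4` parameter; three `4 x 4` products followed by one `4 x 1` product give an output of shape `(1, 1)`.

```python
import torch
from torch import nn

class MyListDense(nn.Module):
    def __init__(self):
        super().__init__()
        self.params = nn.ParameterList(
            [nn.Parameter(torch.randn(4, 4)) for _ in range(3)])
        self.params.append(nn.Parameter(torch.randn(4, 1)))

    def forward(self, x):
        # Multiply by each parameter matrix in turn
        for i in range(len(self.params)):
            x = torch.mm(x, self.params[i])
        return x

net = MyListDense()
x = torch.ones(1, 4)                # hypothetical input
out = net(x)
print(out.shape)                    # torch.Size([1, 1])
print(len(list(net.parameters())))  # 4
```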

IV. Reading and Saving

1. Reading and writing `Tensor`s

torch.save(): saves tensors to the specified file.

torch.load(): loads tensors from the specified file.

x = torch.ones(3)
y = torch.zeros(4)
torch.save([x, y], 'xy.pt')
xy_list = torch.load('xy.pt')
xy_list
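
The same pair of functions also handles a dictionary that maps strings to tensors. The sketch below writes to a temporary directory so that it leaves no files behind; the file name `xy_dict.pt` is a hypothetical choice for this example.

```python
import os
import tempfile

import torch

x = torch.ones(3)
y = torch.zeros(4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'xy_dict.pt')  # hypothetical file name
    torch.save({'x': x, 'y': y}, path)      # save a dict of tensors
    xy = torch.load(path)                   # load it back

print(xy)
```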

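2. Reading and writing models

The state_dict() method:

Saving model parameters: state_dict() extracts the model's parameters as a dictionary that can be saved and later loaded or shared.
Loading model parameters: load_state_dict() loads a saved parameter dictionary into a model.
Inspecting the current parameter state: state_dict() makes it easy to check the model's weights and biases when debugging.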
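
As a small illustration of the inspection use case, the sketch below prints the keys and shapes stored in a state dict. The network and its layer sizes are assumptions for this example; layers without parameters, such as `ReLU`, do not appear.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(3, 2), nn.ReLU(), nn.Linear(2, 1))  # hypothetical network
state = net.state_dict()

for key, value in state.items():
    print(key, tuple(value.shape))
# 0.weight (2, 3)
# 0.bias (2,)
# 2.weight (1, 2)
# 2.bias (1,)
```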

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(3, 2)
        self.act = nn.ReLU()
        self.output = nn.Linear(2, 1)

    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)

net = MLP()
net.state_dict()

PATH = "./net.pt"
torch.save(net.state_dict(), PATH)

net2 = MLP()
net2.load_state_dict(torch.load(PATH))
X = torch.randn(2, 3)  # sample input batch
Y = net(X)
Y2 = net2(X)
Y2 == Y