[Dive into Deep Learning] Deep Learning Computation

chaoql

Last modified: 2023-04-20 10:48:16


First published: 2022-01-25 11:11:36

Article link: https://blog.csdn.net/qq_43510916/article/details/122681557


Deep Learning Computation

These are study notes on the book "Dive into Deep Learning" (《动手学深度学习》) by Mu Li. The original book is available at: Dive into Deep Learning.


1. Model Construction

1.1 Constructing a model by subclassing the `Module` class

import torch
from torch import nn

class MLP(nn.Module):
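    # Declare the layers that hold model parameters; here, two fully connected layers
    def __init__(self, **kwargs):
        # Call the constructor of the parent class Module to perform the necessary
        # initialization; any extra keyword arguments are forwarded to it
        super(MLP, self).__init__(**kwargs)
        self.hidden = nn.Linear(784, 256) # hidden layer
        self.act = nn.ReLU() # activation layer
        self.output = nn.Linear(256, 10)  # output layer

    # Define the forward computation: how to compute the model output from the input x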
    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)
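
As a quick check, the class above can be instantiated and called on a random batch. This is only a minimal sketch: the batch size of 2 and the names `X`, `net`, and `out` are assumptions for illustration, and the class is repeated in compact form so that the block runs by itself.

```python
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.hidden = nn.Linear(784, 256)
        self.act = nn.ReLU()
        self.output = nn.Linear(256, 10)

    def forward(self, x):
        return self.output(self.act(self.hidden(x)))

X = torch.rand(2, 784)   # assumed batch of 2 inputs with 784 features each
net = MLP()
out = net(X)             # calling the instance invokes forward()
print(out.shape)
```

Calling `net(X)` rather than `net.forward(X)` is the usual convention, since `Module.__call__` also runs any registered hooks.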

1.2 Constructing a model with the `Sequential` class

import torch
from torch import nn
from torch.nn import init

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))  # PyTorch has already applied its default initialization
print(net)
#Sequential(
#  (0): Linear(in_features=4, out_features=3, bias=True)
#  (1): ReLU()
#  (2): Linear(in_features=3, out_features=1, bias=True)
#)
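
A `Sequential` container applies its layers in the order in which they were added. The following sketch illustrates this by comparing the container's output with a manual chain of the same layers; the input shape and the variable names are assumptions chosen for illustration.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
X = torch.rand(2, 4)   # assumed batch of 2 inputs with 4 features

out = net(X)                          # run all layers in order
manual = net[2](net[1](net[0](X)))    # same computation, one layer at a time
print(out.shape, torch.allclose(out, manual))
```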

2. Accessing, Initializing, and Sharing Model Parameters

import torch
from torch import nn
from torch.nn import init

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))  # PyTorch has already applied its default initialization
print(net)
X = torch.rand(2, 4)
Y = net(X).sum()

2.1 Accessing a model's layer data

2.1.1 Accessing all parameters of the multilayer perceptron `net`

print(type(net.named_parameters()))
for name, param in net.named_parameters():
    print(name, param.size())

<class 'generator'>
0.weight torch.Size([3, 4])
0.bias torch.Size([3])
2.weight torch.Size([1, 3])
2.bias torch.Size([1])
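
The sizes printed above can also be used to count the parameters of the model. A minimal sketch, assuming the same `net` as before (the name `total` is introduced here for illustration):

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

# Sum the number of elements of every parameter tensor:
# 3*4 (0.weight) + 3 (0.bias) + 1*3 (2.weight) + 1 (2.bias)
total = sum(param.numel() for param in net.parameters())
print(total)  # 19
```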

2.1.2 Accessing any layer by index

Index 0 refers to the hidden layer, which is the first layer added to the Sequential instance.

for name, param in net[0].named_parameters():
    print(name, param.size(), type(param))

weight torch.Size([3, 4]) <class 'torch.nn.parameter.Parameter'>
bias torch.Size([3]) <class 'torch.nn.parameter.Parameter'>
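
Because each parameter is a `Parameter` (a `Tensor` subclass), its values are available through `.data` and its gradient through `.grad`. The sketch below, which reuses the `X` and `Y` defined at the start of this section, shows that the gradient is empty until backpropagation has run; the names `weight_0`, `grad_before`, and `grad_after` are assumptions for illustration.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
X = torch.rand(2, 4)
Y = net(X).sum()

weight_0 = list(net[0].parameters())[0]   # weight of the hidden layer
print(weight_0.data.shape)
grad_before = weight_0.grad               # None: no backward pass yet
Y.backward()
grad_after = weight_0.grad                # now a tensor shaped like the weight
print(grad_before, grad_after.shape)
```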

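2.2 Initializing model parameters

Initialize the weight parameters with normally distributed random numbers with mean 0 and standard deviation 0.01, and reset the bias parameters to zero.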

for name, param in net.named_parameters():
    if 'weight' in name:
        init.normal_(param, mean=0, std=0.01)
        print(name, param.data)

0.weight tensor([[ 0.0030,  0.0094,  0.0070, -0.0010],
        [ 0.0001,  0.0039,  0.0105, -0.0126],
        [ 0.0105, -0.0135, -0.0047, -0.0006]])
2.weight tensor([[-0.0074,  0.0051,  0.0066]])

for name, param in net.named_parameters():
    if 'bias' in name:
        init.constant_(param, val=0)
        print(name, param.data)

0.bias tensor([0., 0., 0.])
2.bias tensor([0.])
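
The two loops above can also be combined into one function and applied to every submodule with `Module.apply`. This is an alternative sketch rather than the method used in these notes; the helper name `init_weights` is hypothetical.

```python
import torch
from torch import nn
from torch.nn import init

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

def init_weights(module):
    # Hypothetical helper: only Linear layers have weight and bias to initialize
    if isinstance(module, nn.Linear):
        init.normal_(module.weight, mean=0, std=0.01)
        init.constant_(module.bias, val=0)

net.apply(init_weights)   # apply() visits every submodule recursively
print(net[0].bias.data, net[2].bias.data)
```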

2.3 Reading model parameters

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.hidden = nn.Linear(3, 2)
        self.act = nn.ReLU()
        self.output = nn.Linear(2, 1)

    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)

net = MLP()
net.state_dict()

OrderedDict([('hidden.weight', tensor([[ 0.2448,  0.1856, -0.5678],
                      [ 0.2030, -0.2073, -0.0104]])),
             ('hidden.bias', tensor([-0.3117, -0.4232])),
             ('output.weight', tensor([[-0.4556,  0.4084]])),
             ('output.bias', tensor([-0.3573]))])
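
Only layers with learnable parameters have entries in the `state_dict`; the `ReLU` layer does not appear. A small sketch that lists the keys and shapes, repeating the class above in compact form so that the block runs by itself (the names `keys` and `shapes` are assumptions for illustration):

```python
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(3, 2)
        self.act = nn.ReLU()
        self.output = nn.Linear(2, 1)

    def forward(self, x):
        return self.output(self.act(self.hidden(x)))

net = MLP()
state = net.state_dict()   # ordered mapping: parameter name -> tensor
keys = list(state.keys())
shapes = {name: tuple(tensor.shape) for name, tensor in state.items()}
print(keys)
print(shapes)
```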

2.4 Reading optimizer parameters

optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
optimizer.state_dict()

{'param_groups': [{'dampening': 0,
   'lr': 0.001,
   'momentum': 0.9,
   'nesterov': False,
   'params': [4736167728, 4736166648, 4736167368, 4736165352],
   'weight_decay': 0}],
 'state': {}}
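
The `state` entry above is empty because no optimization step has been taken yet. The following sketch, assuming a small stand-in network and a dummy loss, shows that after one step the optimizer holds one momentum buffer per parameter. The exact form of the `params` entry (object ids or integer indices) depends on the PyTorch version.

```python
import torch
from torch import nn

# Stand-in network with the same layer sizes as the MLP above (assumption)
net = nn.Sequential(nn.Linear(3, 2), nn.ReLU(), nn.Linear(2, 1))
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
state_before = len(optimizer.state_dict()['state'])   # no step taken yet

loss = net(torch.rand(4, 3)).sum()   # dummy loss for illustration
optimizer.zero_grad()
loss.backward()
optimizer.step()

state_after = optimizer.state_dict()['state']
lr = optimizer.state_dict()['param_groups'][0]['lr']
print(state_before, len(state_after), lr)
```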

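3. Saving and Loading Models

There are two common ways to save and load a trained model in PyTorch:

Save and load only the model parameters (the state_dict);
Save and load the entire model.

3.1 Saving and loading the `state_dict` (recommended)

Save: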

torch.save(model.state_dict(), PATH) # the recommended file extension is .pt or .pth

Load:

model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
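
The two snippets above can be combined into a round trip. This is a minimal sketch under stated assumptions: the helper `make_model`, the file name `net.pt`, and the temporary directory are all hypothetical choices made for illustration.

```python
import os
import tempfile

import torch
from torch import nn

def make_model():
    # Hypothetical factory: the architecture must match when loading
    return nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))

net = make_model()
X = torch.rand(2, 4)
expected = net(X)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "net.pt")
    torch.save(net.state_dict(), path)           # save parameters only
    restored = make_model()                      # fresh, randomly initialized model
    restored.load_state_dict(torch.load(path))   # overwrite with the saved parameters

restored.eval()
same = torch.allclose(restored(X), expected)
print(same)
```

Since only tensors are stored, the model class has to be constructed again before `load_state_dict` is called.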

3.2 Saving and loading the entire model

Save:

torch.save(model, PATH)

Load:

model = torch.load(PATH)
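
A round trip for the entire model might look like the sketch below. It assumes a PyTorch version in which `torch.load` accepts the `weights_only` argument; recent releases default to loading tensors only, so unpickling a full model object has to be requested explicitly. The file name `model.pt` is a hypothetical choice, and the file should come from a trusted source because loading it executes pickled code.

```python
import os
import tempfile

import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
X = torch.rand(2, 4)
expected = net(X)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pt")
    torch.save(net, path)                            # pickles the whole module object
    restored = torch.load(path, weights_only=False)  # assumption: argument is supported

same = torch.allclose(restored(X), expected)
print(type(restored).__name__, same)
```

This approach ties the saved file to the class definitions and module paths that existed at save time, which is one reason the `state_dict` approach is the recommended one.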