PyTorch Study Notes

Jeremy_lf

Article link: https://blog.csdn.net/Jeremy_lf/article/details/103718401

with torch.no_grad()

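When using PyTorch, not every operation needs a computation graph (the record of operations that autograd builds so that gradients can be backpropagated). By default, however, operations on tensors that require gradients do build the graph. Where the graph is not needed, for example during evaluation, wrap the code in with torch.no_grad(): so that nothing inside the block is recorded.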

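model.eval()                                # switch to evaluation mode
with torch.no_grad():
    pass

# The same effect as a decorator on a whole function.
# (Named evaluate rather than eval to avoid shadowing the Python builtin.)
@torch.no_grad()
def evaluate():
    ...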
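To make the effect concrete, here is a minimal sketch that compares a result computed inside and outside the context. The tensor `w` and the helper `double` are made up for illustration and are not part of the original notes.

```python
import torch

w = torch.randn(3, requires_grad=True)

# Outside no_grad: the result is tracked by autograd.
y_tracked = (w * 2).sum()

# Inside no_grad: no graph is recorded for the result.
with torch.no_grad():
    y_untracked = (w * 2).sum()


@torch.no_grad()
def double(t):
    # Hypothetical helper, only here to show the decorator form.
    return t * 2


print(y_tracked.requires_grad)    # True
print(y_untracked.requires_grad)  # False
print(double(w).requires_grad)    # False
```

Note that model.eval() and torch.no_grad() do different jobs: the first switches layers such as dropout and batch norm to inference behavior, the second turns off graph recording.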

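Problem 7: the loss becomes NaN when fine-tuning from a pretrained model

a. Exploding gradients make the loss blow up

The reason is simple: with a high learning rate, every update changes the parameters by a large amount, so each step is large. A learning rate that is too large keeps the optimizer from settling into a minimum; one bad step can leave the region where training is stable, and the loss then grows by orders of magnitude.

The fix is equally simple: lower the initial learning rate and add learning rate decay.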
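One way to apply this advice is a step schedule. The sketch below assumes a starting rate of 1e-3 that is halved every 10 epochs; these numbers, the toy model, and the random data are placeholders, not values from the original notes.

```python
import torch

model = torch.nn.Linear(4, 2)

# Assumed values: a small initial learning rate, halved every 10 epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

x, y = torch.randn(8, 4), torch.randn(8, 2)
lrs = []
for epoch in range(30):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # decay once per epoch, after optimizer.step()
    lrs.append(optimizer.param_groups[0]["lr"])

# Learning rate after epochs 1, 10, 20 and 30.
print(lrs[0], lrs[9], lrs[19], lrs[29])
```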

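b. Validity of the input data (are there dirty samples?)

With a large training set, 99% of the data may be fine while 1% is abnormal or corrupted. Such samples often produce nan or inf during training, so the data needs to be checked and cleaned carefully.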
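A quick way to spot such samples is to test every value with torch.isfinite before the data reaches the model. The batch below is invented for illustration; in practice the same check would run over your own dataset.

```python
import torch

# Hypothetical batch: row 1 contains a NaN and row 3 an inf.
batch = torch.tensor([
    [0.1, 0.2, 0.3],
    [float("nan"), 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, float("inf"), 1.2],
])

# True for rows in which every value is finite (neither NaN nor +/-inf).
valid_rows = torch.isfinite(batch).all(dim=1)
clean_batch = batch[valid_rows]

print(valid_rows.tolist())  # [True, False, True, False]
print(clean_batch.shape)    # torch.Size([2, 3])
```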

Problem 1: specify the GPU or CPU when loading a model

state = torch.load(path, map_location='cuda:0')
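The same argument also accepts 'cpu', which is useful when a checkpoint saved on a GPU machine has to be opened where no GPU is available. The following is a small round trip under that assumption; the file name and the toy model are hypothetical.

```python
import os
import tempfile

import torch

model = torch.nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "checkpoint.pt")  # hypothetical file name
    torch.save(model.state_dict(), path)

    # map_location="cpu" remaps every stored tensor onto the CPU.
    state = torch.load(path, map_location="cpu")

print({name: t.device.type for name, t in state.items()})
```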

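Problem 2: the difference between torch.randn and torch.rand

torch.rand(*size, out=None) → Tensor: uniform distribution on [0, 1)
torch.randn(*size, out=None) → Tensor: standard normal distribution (mean 0, variance 1)
torch.normal(mean, std, out=None) → Tensor: normal distributions with the given per-element means and standard deviations
torch.linspace(start, end, steps, out=None) → Tensor: steps evenly spaced values from start to end (older versions defaulted to steps=100; recent versions require it)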
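The four functions can be compared side by side. This is a sketch with arbitrary sizes and a fixed seed, chosen only so that the run is repeatable.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

u = torch.rand(10000)   # uniform on [0, 1)
n = torch.randn(10000)  # standard normal

# One sample per element, each with its own mean; std is shared here.
m = torch.normal(mean=torch.tensor([0.0, 10.0, 100.0]), std=1.0)

s = torch.linspace(0, 1, steps=5)  # 5 evenly spaced points, both ends included

print(u.min().item() >= 0.0, u.max().item() < 1.0)
print(round(n.mean().item(), 2), round(n.std().item(), 2))
print(m.shape)
print(s.tolist())
```

The sample mean and standard deviation of `n` should land close to 0 and 1, while `u` never leaves the half-open interval [0, 1).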

Problem 3: choosing between CUDA and CPU
The device string selects where the code runs: 'cuda' for the GPU, 'cpu' for the CPU.

device = torch.device('cuda' if use_cuda else 'cpu')
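A common pattern is to derive the flag from torch.cuda.is_available() and then place both the model and its inputs on the chosen device. In this sketch `use_cuda` is computed that way, which is an assumption; the original leaves its definition open.

```python
import torch

# Assumption: use the GPU whenever one is available.
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

model = torch.nn.Linear(4, 2).to(device)  # moves the parameters to the device
x = torch.randn(3, 4, device=device)      # create the input on the same device
out = model(x)

print(device, out.device)
```

Model and input must live on the same device; mixing a CPU tensor with a GPU model raises a runtime error.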

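Choosing a GPU

1. Set it through an environment variable

import os
# The value must be a string, and it must be set before CUDA is initialized.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

2. Set it on the command line

CUDA_VISIBLE_DEVICES=0 python train.py

3. Set it in code

torch.cuda.set_device(6)  # make the GPU with index 6 the current device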
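The first and third methods interact: once CUDA_VISIBLE_DEVICES restricts the visible devices, indices inside the process are renumbered from 0. A small sketch of that behavior follows; it assumes a fresh process in which CUDA has not been initialized yet.

```python
import os

# Must happen before CUDA is initialized, so it is set before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

# Number of GPUs this process can see; 0 on a machine without a GPU.
visible = torch.cuda.device_count()
print(visible)

if visible > 0:
    # Index 0 now refers to the first *visible* device.
    torch.cuda.set_device(0)
    print(torch.cuda.current_device())
```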

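Problem 4: saving and loading models
Save and load the whole network (structure and parameters)

torch.save(resnet, 'model.pkl')
# Recent PyTorch versions may need torch.load('model.pkl', weights_only=False) here.
model = torch.load('model.pkl')

Save and load only the network's parameters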

torch.save(resnet.state_dict(), 'params.pkl')
resnet.load_state_dict(torch.load('params.pkl'))
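The parameters-only route can be checked with a round trip: save the state_dict of one model, load it into a second model of the same architecture, and compare their outputs. `make_model` below is a hypothetical stand-in for resnet, and the file lives in a temporary directory.

```python
import os
import tempfile

import torch


def make_model():
    # Hypothetical stand-in for resnet; the architecture must match on both sides.
    return torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 2),
    )


source = make_model()
target = make_model()  # same structure, different random weights

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "params.pkl")
    torch.save(source.state_dict(), path)
    target.load_state_dict(torch.load(path))

x = torch.randn(5, 4)
same = torch.equal(source(x), target(x))
print(same)
```

Because only tensors are stored, this form does not depend on the class definition being importable under the same path at load time, which tends to make it the more portable of the two.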

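Problem 5: loading a pretrained model

import torch
import torch.nn as nn
import torchvision

# Download and load a pretrained ResNet-18.
# Newer torchvision versions replace pretrained=True with the weights= argument.
resnet = torchvision.models.resnet18(pretrained=True)

# To fine-tune only the top layer of the model,
# freeze all existing parameters so they are not updated.
for param in resnet.parameters():
    param.requires_grad = False

# Replace ResNet's fully connected layer fc with our own nn.Linear:
# the input size is resnet.fc.in_features and the output size is 100.
resnet.fc = nn.Linear(resnet.fc.in_features, 100)

# Quick test. Variable is no longer needed; a plain tensor works.
images = torch.randn(10, 3, 256, 256)
outputs = resnet(images)
print(outputs.size())   # torch.Size([10, 100])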
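After freezing, it helps to confirm that only the new layer is trainable and to hand just those parameters to the optimizer. The sketch below uses a toy two-layer network in place of ResNet so that nothing has to be downloaded; the layer names and sizes are made up.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained network (hypothetical; avoids a download).
net = nn.Sequential()
net.add_module("backbone", nn.Linear(16, 8))
net.add_module("fc", nn.Linear(8, 4))

for param in net.parameters():
    param.requires_grad = False  # freeze everything

# A freshly created layer is trainable by default.
net.fc = nn.Linear(net.fc.in_features, 100)

trainable = [name for name, p in net.named_parameters() if p.requires_grad]
optimizer = torch.optim.SGD(
    (p for p in net.parameters() if p.requires_grad), lr=1e-3
)

before = net.backbone.weight.clone()
loss = net(torch.randn(2, 16)).sum()
loss.backward()
optimizer.step()

print(trainable)
print(torch.equal(before, net.backbone.weight))  # frozen weights are unchanged
```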

Problem 6: choosing an optimizer in PyTorch
torch.optim is a package that implements a variety of optimization algorithms.

# Each net_* is assumed to be a separate model instance and Learning_rate a float defined earlier.
opt_SGD = torch.optim.SGD(net_SGD.parameters(), lr=Learning_rate)
opt_Momentum = torch.optim.SGD(net_Momentum.parameters(), lr=Learning_rate, momentum=0.8, nesterov=True)
opt_RMSprop = torch.optim.RMSprop(net_RMSprop.parameters(), lr=Learning_rate, alpha=0.9)
opt_Adam = torch.optim.Adam(net_Adam.parameters(), lr=Learning_rate, betas=(0.9, 0.99))
opt_Adagrad = torch.optim.Adagrad(net_Adagrad.parameters(), lr=Learning_rate)
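To try the constructors above end to end, the following sketch gives each optimizer its own copy of a tiny linear model and trains all of them on the same random data. The model, the data, the learning rate of 0.01, and the 100 steps are all assumptions made for illustration.

```python
import torch

learning_rate = 0.01  # assumed value


def make_net():
    # Hypothetical tiny model; each optimizer gets its own copy.
    return torch.nn.Linear(2, 1)


names = ["SGD", "Momentum", "RMSprop", "Adam", "Adagrad"]
nets = {name: make_net() for name in names}
optimizers = {
    "SGD": torch.optim.SGD(nets["SGD"].parameters(), lr=learning_rate),
    "Momentum": torch.optim.SGD(
        nets["Momentum"].parameters(), lr=learning_rate, momentum=0.8, nesterov=True
    ),
    "RMSprop": torch.optim.RMSprop(
        nets["RMSprop"].parameters(), lr=learning_rate, alpha=0.9
    ),
    "Adam": torch.optim.Adam(
        nets["Adam"].parameters(), lr=learning_rate, betas=(0.9, 0.99)
    ),
    "Adagrad": torch.optim.Adagrad(nets["Adagrad"].parameters(), lr=learning_rate),
}

x, y = torch.randn(32, 2), torch.randn(32, 1)
losses = {}
for name, opt in optimizers.items():
    for _ in range(100):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(nets[name](x), y)
        loss.backward()
        opt.step()
    losses[name] = loss.item()

print(losses)
```

The final losses differ from run to run because both the data and the initial weights are random, so the printout is only a rough comparison.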

A single optimization step is performed with optimizer.step(). Every optimizer implements step(), which updates all the parameters it was given.

for input, target in dataset:
    optimizer.zero_grad()
    output = model(input)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()

A simple example

import torch

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(x)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    print(t, loss.item())

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    optimizer.step()

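Note: it is best to avoid plain SGD where possible. During training the loss may decrease and then start increasing again; the root cause is usually a learning rate that is too large. There are ready-made remedies. One is to multiply the learning rate by a factor (0.99 or 0.999) as training proceeds, so that it decays step by step. Another is to use an off-the-shelf adaptive optimizer such as Adam or RMSprop.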
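The first remedy corresponds to an exponential schedule. Here is a minimal sketch using torch.optim.lr_scheduler.ExponentialLR with the factor 0.99 from the note; the starting rate of 0.1, the toy model, and the 100 steps are assumptions.

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # assumed starting rate

# Multiply the learning rate by 0.99 after every step.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)

x, y = torch.randn(8, 4), torch.randn(8, 2)
for step in range(100):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()

# After 100 steps the rate is 0.1 * 0.99 ** 100.
final_lr = optimizer.param_groups[0]["lr"]
print(final_lr)
```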

Problem 8: torchvision models

The torchvision.models module provides the following model architectures.

AlexNet
VGG
ResNet
SqueezeNet
DenseNet

You can create these models with randomly initialized weights.

import torchvision.models as models
resnet18 = models.resnet18()
alexnet = models.alexnet()
squeezenet = models.squeezenet1_0()
densenet = models.densenet161()

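Pretrained models are also provided.

import torchvision.models as models
# pretrained=True loads the pretrained weights
# (newer torchvision versions use the weights= argument instead)
resnet18 = models.resnet18(pretrained=True)
alexnet = models.alexnet(pretrained=True)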
