Task02: Advanced PyTorch Training Techniques

JxWang05

Last modified: 2022-03-27 12:55:28

Column: 深入浅出PyTorch（进阶） · Tag: pytorch

First published: 2022-03-19 19:45:26

Article link: https://blog.csdn.net/weixin_52202311/article/details/123596554


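0. Tutorial link
1. Custom loss functions
- 1.1 Function form
- 1.2 Class form
2. Dynamically adjusting the learning rate
- 2.1 Official API
- 2.2 Custom scheduler
3. Model fine-tuning
4. Half-precision training
- 4.1 Concept
- 4.2 Practice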

0. Tutorial link

https://github.com/datawhalechina/thorough-pytorch

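1. Custom loss functions

1.1 Function form

A loss can be written as an ordinary function that takes the model output and the target and returns the loss value: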

>>> import torch
>>> def my_loss(output, target):
...     loss = torch.mean((output - target)**2)
...     return loss
...
>>>
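
As a quick check, the sketch below applies the function-style loss from above to a pair of small hand-made tensors (the values are assumptions chosen for illustration, not taken from the tutorial). It compares the result with PyTorch's built-in mean-squared-error loss and confirms that gradients flow through the custom function.

```python
import torch
import torch.nn.functional as F


def my_loss(output, target):
    # Mean squared error, written by hand
    return torch.mean((output - target) ** 2)


# Illustrative values (assumed for this example)
output = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
target = torch.tensor([1.0, 2.0, 5.0])

loss = my_loss(output, target)
loss.backward()  # autograd differentiates the custom loss like any other op

print(loss.item())                         # (0 + 0 + 4) / 3
print(F.mse_loss(output, target).item())   # built-in MSE for comparison
print(output.grad)                         # 2 * (output - target) / 3
```

Because the function is built only from differentiable tensor operations, no extra work is needed to make it usable in a training loop.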

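1.2 Class form

In PyTorch, some built-in loss functions inherit from _Loss and others from _WeightedLoss.

_WeightedLoss itself inherits from _Loss, and _Loss inherits from nn.Module.

A custom loss class should therefore inherit from nn.Module.

Take Dice Loss, which is common in segmentation, as an example. It is based on the Dice coefficient

$DSC = \frac{2|X \cap Y|}{|X| + |Y|}$

and the loss is $1 - DSC$: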

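>>> import torch.nn as nn
>>> class DiceLoss(nn.Module):
...     def __init__(self, weight=None, size_average=True):
...         # weight and size_average are accepted but not used
...         super(DiceLoss, self).__init__()
...     def forward(self, inputs, targets, smooth=1):
...         # inputs are raw logits; convert them to probabilities
...         # (torch.sigmoid replaces the deprecated F.sigmoid)
...         inputs = torch.sigmoid(inputs)
...         # flatten predictions and labels
...         inputs = inputs.view(-1)
...         targets = targets.view(-1)
...         intersection = (inputs * targets).sum()
...         # smooth avoids division by zero when both masks are empty
...         dice = (2.*intersection + smooth)/(inputs.sum() + targets.sum() + smooth)
...         return 1 - dice
...
>>>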
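
To see how this loss behaves, the following sketch restates the Dice loss as a plain function (dice_loss is a hypothetical helper performing the same computation as the forward method above) and evaluates it on two extreme cases with made-up logits: a near-perfect prediction and a confidently wrong one.

```python
import torch


def dice_loss(logits, targets, smooth=1.0):
    # Same computation as DiceLoss.forward, as a standalone function
    probs = torch.sigmoid(logits).reshape(-1)
    targets = targets.reshape(-1)
    intersection = (probs * targets).sum()
    dice = (2.0 * intersection + smooth) / (probs.sum() + targets.sum() + smooth)
    return 1 - dice


# Illustrative 2x2 binary mask (assumed for this example)
targets = torch.tensor([[1.0, 0.0], [1.0, 0.0]])

# Large positive logits where the mask is 1, large negative where it is 0
good_logits = torch.tensor([[10.0, -10.0], [10.0, -10.0]])
# The opposite: confident but wrong everywhere
bad_logits = -good_logits

good = dice_loss(good_logits, targets)
bad = dice_loss(bad_logits, targets)
print(good.item(), bad.item())  # the good prediction gives a much smaller loss
```

With smooth=1 the wrong prediction here scores about 0.8 rather than 1, because the smoothing term keeps the ratio away from zero.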

Other commonly used losses include BCE-Dice Loss:

>>> class DiceBCELoss(nn.Module):
...     def __init__(self, weight=None, size_average=True):
...         super(DiceBCELoss, self).__init__()
...     def forward(self, inputs, targets, smooth=1):
...         # inputs are raw logits; convert them to probabilities
...         inputs = torch.sigmoid(inputs)
...         inputs = inputs.view(-1)
...         targets = targets.view(-1)
...         intersection = (inputs * targets).sum()
...         dice_loss = 1 - (2.*intersection + smooth)/(inputs.sum() + targets.sum() + smooth)
...         # binary cross-entropy on the probabilities, averaged over all elements
...         BCE = nn.functional.binary_cross_entropy(inputs, targets, reduction='mean')
...         Dice_BCE = BCE + dice_loss
...         return Dice_BCE
...
>>>

Jaccard/Intersection over Union (IoU) Loss:

>>> class IoULoss(nn.Module):
...     def __init__(self, weight=None, size_average=True):
...         super(IoULoss, self).__init__()
...     def forward(self, inputs, targets, smooth=1):
...         # inputs are raw logits; convert them to probabilities
...         inputs = torch.sigmoid(inputs)
...         inputs = inputs.view(-1)
...         targets = targets.view(-1)
...         intersection = (inputs * targets).sum()
...         total = (inputs + targets).sum()
...         # |X ∪ Y| = |X| + |Y| - |X ∩ Y|
...         union = total - intersection
...         IoU = (intersection + smooth)/(union + smooth)
...         return 1 - IoU
...
>>>

Focal Loss:

>>> ALPHA = 0.8
>>> GAMMA = 2
>>> class FocalLoss(nn.Module):
...     def __init__(self, weight=None, size_average=True):
...         super(FocalLoss, self).__init__()
...     def forward(self, inputs, targets, alpha=ALPHA, gamma=GAMMA, smooth=1):
...         inputs = torch.sigmoid(inputs)
...         inputs = inputs.view(-1)
...         targets = targets.view(-1)
...         # per-element BCE, so the focal weight is applied to each element
...         BCE = nn.functional.binary_cross_entropy(inputs, targets, reduction='none')
...         # exp(-BCE) is the probability assigned to the true class
...         BCE_EXP = torch.exp(-BCE)
...         focal_loss = alpha * (1-BCE_EXP)**gamma * BCE
...         return focal_loss.mean()
...
>>>

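2. Dynamically adjusting the learning rate

2.1 Official API

PyTorch provides the following schedulers in torch.optim.lr_scheduler: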

lr_scheduler.LambdaLR
lr_scheduler.MultiplicativeLR
lr_scheduler.StepLR
lr_scheduler.MultiStepLR
lr_scheduler.ExponentialLR
lr_scheduler.CosineAnnealingLR
lr_scheduler.ReduceLROnPlateau
lr_scheduler.CyclicLR
lr_scheduler.OneCycleLR
lr_scheduler.CosineAnnealingWarmRestarts

When using a scheduler from torch.optim.lr_scheduler, call scheduler.step() after optimizer.step(); calling them in the opposite order skips the first value of the schedule.
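
The call order can be sketched with StepLR. The toy model, the random data, and the hyperparameters (step_size=2, gamma=0.5) below are assumptions made only for this example.

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(4, 1)  # toy model, assumed for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Multiply the learning rate by 0.5 every 2 epochs
scheduler = StepLR(optimizer, step_size=2, gamma=0.5)

x, y = torch.randn(8, 4), torch.randn(8, 1)  # random stand-in data
lrs = []
for epoch in range(6):
    # Record the learning rate used during this epoch
    lrs.append(optimizer.param_groups[0]["lr"])
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()   # update the weights first ...
    scheduler.step()   # ... then advance the schedule

print(lrs)
```

The recorded rates stay at 0.1 for the first two epochs and are then halved every two epochs.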

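2.2 Custom scheduler

A custom schedule can be implemented with a function such as adjust_learning_rate that overwrites the lr value in each of the optimizer's param_groups: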

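>>> def adjust_learning_rate(optimizer, epoch):
...     # args.lr is the initial learning rate (assumed to come from parsed command-line arguments)
...     # decay it by a factor of 10 every 30 epochs
...     lr = args.lr * (0.1 ** (epoch // 30))
...     for param_group in optimizer.param_groups:
...         param_group['lr'] = lr
...
>>>

It is called as follows: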

def adjust_learning_rate(optimizer, ...):
    ...
optimizer = torch.optim.SGD(model.parameters(), lr=args.lr, momentum=0.9)
for epoch in range(10):
    train(...)
    validate(...)
    adjust_learning_rate(optimizer, epoch)
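
The decay rule inside adjust_learning_rate is plain arithmetic, so its effect can be checked without a model. The sketch below uses a hypothetical helper step_decay_lr and assumes an initial learning rate of 0.1 (standing in for args.lr, whose value the tutorial does not give).

```python
def step_decay_lr(base_lr, epoch, drop=0.1, every=30):
    # Same rule as adjust_learning_rate: multiply by `drop` once per `every` epochs
    return base_lr * (drop ** (epoch // every))


base_lr = 0.1  # assumed stand-in for args.lr
schedule = {epoch: step_decay_lr(base_lr, epoch) for epoch in (0, 29, 30, 59, 60, 90)}
for epoch, lr in schedule.items():
    print(f"epoch {epoch:3d}: lr = {lr:.6f}")
```

Since epoch // 30 is 0 for every epoch below 30, a run of only 10 epochs never triggers a decay; the rate starts to drop only in longer runs.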

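3. Model fine-tuning

3.1 Concept

Fine-tuning adapts the parameters of an already-trained model so that it can be trained on a new dataset.

3.2 Procedure

Obtain a source model, either by downloading one or by training it yourself on the source dataset.
Copy all of the source model's structure and parameters, except the output layer, into the target model.
Rebuild the target model's output layer and randomly initialize its parameters.
Train the target model on the target dataset.

3.3 Diagram

[Figure: diagram of the fine-tuning procedure; image not available]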

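3.4 Practice

>>> # Freeze layers by setting requires_grad=False, so gradients are computed only for newly initialized layers
>>> def set_parameter_requires_grad(model, feature_extracting):
...     if feature_extracting:
...         for param in model.parameters():
...             param.requires_grad = False
...
>>> import torchvision.models as models
>>> # Freeze the gradients of the existing parameters
>>> feature_extract = True
>>> # The pretrained argument controls whether pretrained weights are loaded
>>> # (torchvision 0.13 and later deprecate it in favor of the weights argument)
>>> model = models.resnet50(pretrained=True)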
Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to C:\Users\Hunter-G/.cache\torch\hub\checkpoints\resnet50-19c8e357.pth
100%|██████████████████████████████████████████████████████████████████████████████████████████| 97.8M/97.8M [01:23<00:00, 1.23MB/s]
>>> set_parameter_requires_grad(model, feature_extract)
>>> # Replace the output layer; resnet50's fc layer takes 2048 input features
>>> num_ftrs = model.fc.in_features
>>> model.fc = nn.Linear(in_features=num_ftrs, out_features=4, bias=True)
>>> model.fc
Linear(in_features=2048, out_features=4, bias=True)
>>>

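During training, backward() still runs, but the frozen layers receive no gradients; only the new fc layer, whose parameters have requires_grad=True by default, is updated.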
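
The freezing behaviour can be verified on a small stand-in network. The sketch below uses a toy nn.Sequential in place of a pretrained ResNet so that nothing has to be downloaded; the layer sizes and random data are assumptions made only for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a pretrained network (no weights are downloaded here)
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 10),  # plays the role of the original output layer
)

# Freeze every existing parameter
for param in model.parameters():
    param.requires_grad = False

# Replace the output layer; new modules have requires_grad=True by default
model[2] = nn.Linear(16, 4)

# Hand only the trainable parameters to the optimizer
params_to_update = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params_to_update, lr=0.01, momentum=0.9)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)

# One training step on random data: only the new head should change
frozen_before = model[0].weight.clone()
head_before = model[2].weight.clone()
loss = model(torch.randn(5, 8)).sum()
loss.backward()
optimizer.step()

print(torch.equal(model[0].weight, frozen_before))  # frozen layer is unchanged
print(torch.equal(model[2].weight, head_before))    # new head has been updated
```

Passing only the trainable parameters to the optimizer is optional, since frozen parameters have no gradient and are skipped anyway, but it makes the intent explicit.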

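4. Half-precision training

4.1 Concept

By default, PyTorch stores floating-point numbers as torch.float32.

In many cases, however, torch.float16 is precise enough that the results are not noticeably affected.

Because float16 uses half as many bits (16 instead of 32), it is called "half precision", as illustrated in the figure below:
[Figure: comparison of the float32 and float16 formats; image not available]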
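
What halving the bit width means in practice can be sketched on a few sample values (chosen here for illustration): each element takes half the memory, but both the resolution and the representable range shrink.

```python
import torch

x32 = torch.tensor([1.0, 1.0001, 65504.0, 70000.0], dtype=torch.float32)
x16 = x32.to(torch.float16)  # equivalently x32.half()

print(x32.element_size(), x16.element_size())  # bytes per element
print(torch.finfo(torch.float16).max)          # largest finite float16 value
print(torch.finfo(torch.float16).eps)          # gap between 1.0 and the next float16 value
print(x16)  # 1.0001 is rounded to 1.0, and 70000.0 overflows to inf
```

This limited range is one reason mixed-precision training keeps some operations in float32 instead of converting everything to float16.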

4.2 Practice

Import the required package
```
from torch.cuda.amp import autocast
```
Decorate the model
When defining the model, decorate its forward function with autocast

@autocast()
def forward(self, x):
    ...
    return x

Training loop

for x in train_loader:
    x = x.cuda()
    with autocast():
        output = model(x)
        ...
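
A fuller version of this loop might look like the sketch below. The original snippet shows only autocast; GradScaler is added here because PyTorch's AMP documentation pairs the two to keep small float16 gradients from underflowing. The toy model, random data, and hyperparameters are assumptions, and the code falls back to ordinary float32 when no CUDA device is available.

```python
import torch
import torch.nn as nn
from torch.cuda.amp import GradScaler, autocast

# Mixed precision needs a CUDA device; otherwise run in plain float32
use_amp = torch.cuda.is_available()
device = torch.device("cuda" if use_amp else "cpu")

model = nn.Linear(16, 2).to(device)   # toy model (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = GradScaler(enabled=use_amp)  # scales the loss to avoid float16 underflow

# Random stand-in for train_loader: 3 batches of (inputs, labels)
train_loader = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(3)]

losses = []
for x, y in train_loader:
    x, y = x.to(device), y.to(device)
    optimizer.zero_grad()
    with autocast(enabled=use_amp):
        output = model(x)             # forward pass uses float16 where it is safe
        loss = criterion(output, y)
    scaler.scale(loss).backward()     # backward pass on the scaled loss
    scaler.step(optimizer)            # unscales gradients, then calls optimizer.step()
    scaler.update()                   # adjusts the scale factor for the next step
    losses.append(loss.item())

print(losses)
```

When enabled=False, both autocast and GradScaler become pass-throughs, so the same loop runs unchanged on a CPU-only machine.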

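Half-precision training is mainly useful when individual data samples are large (for example, 3D images or video).

When samples are small (for example, images in the MNIST handwritten digit dataset are only 28×28),

half-precision training may not bring a significant improvement.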
