Speeding Up PyTorch Model Training with AMP (Automatic Mixed Precision)

This article shows how to use the torch.cuda.amp module for automatic mixed precision training, using torch.autocast and GradScaler to improve training efficiency. It covers importing the module, enabling mixed precision, and gradient accumulation, and it also applies to setups with multiple models, losses, and optimizers.

My own models take quite a long time to train, so I looked into this method to speed up training.

The torch module to use:

torch.cuda.amp

The official PyTorch documentation states that "automatic mixed precision training" ordinarily uses torch.autocast and torch.cuda.amp.GradScaler together:

Ordinarily, “automatic mixed precision training” uses torch.autocast and torch.cuda.amp.GradScaler together, as shown in the Automatic Mixed Precision examples and Automatic Mixed Precision recipe. However, torch.autocast and GradScaler are modular, and may be used separately if desired.
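To illustrate the "modular" point, autocast can also be used on its own when there is no backward pass, for example during inference or evaluation. The sketch below is not from the original post; Net and data are placeholder names reused from the training examples that follow.

# A minimal sketch of autocast without GradScaler (inference only, no backward pass).
import torch

model = Net().cuda()   # placeholder model, as in the training examples below
model.eval()

with torch.no_grad():
    for input, target in data:
        # Forward ops run in float16 where autocast considers it safe;
        # no gradient scaling is needed because gradients are never computed.
        with torch.cuda.amp.autocast():
            output = model(input)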

How to use it

  1. First, import the torch modules and create a GradScaler:

import torch
from torch.cuda.amp import autocast

scaler = torch.cuda.amp.GradScaler()

2. Use mixed precision to speed up training
The code before using AMP:

model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()
        output = model(input)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()

After adding AMP:

# Creates model and optimizer in default precision
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

# Creates a GradScaler once at the beginning of training.
scaler = torch.cuda.amp.GradScaler()

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()

        # Runs the forward pass with autocasting.
        with autocast():
            output = model(input)
            loss = loss_fn(output, target)

        # Scales loss.  Calls backward() on scaled loss to create scaled gradients.
        # Backward passes under autocast are not recommended.
        # Backward ops run in the same dtype autocast chose for corresponding forward ops.
        scaler.scale(loss).backward()

        # scaler.step() first unscales the gradients of the optimizer's assigned params.
        # If these gradients do not contain infs or NaNs, optimizer.step() is then called,
        # otherwise, optimizer.step() is skipped.
        scaler.step(optimizer)

        # Updates the scale for next iteration.
        scaler.update()
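
A convenient property of this setup (a hedged sketch, not from the original post): both autocast and GradScaler accept an enabled flag, so a single switch such as use_amp (a name chosen here for illustration) lets you run the same loop with or without mixed precision and compare speed and accuracy.

use_amp = True  # set to False to fall back to ordinary full-precision training

model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()
        with torch.cuda.amp.autocast(enabled=use_amp):
            output = model(input)
            loss = loss_fn(output, target)
        # With enabled=False the scaler calls become no-ops, equivalent to
        # plain loss.backward() and optimizer.step().
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()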

With gradient accumulation, it can be used like this:

for epoch in epochs:
    for i, (input, target) in enumerate(data):
        with autocast():
            output = model(input)
            loss = loss_fn(output, target)
            loss = loss / iters_to_accumulate

        # Accumulates scaled gradients.
        scaler.scale(loss).backward()

        if (i + 1) % iters_to_accumulate == 0:
            # may unscale_ here if desired (e.g., to allow clipping unscaled gradients)

            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
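
The comment above mentions unscale_. A common use is gradient clipping: gradients must be unscaled before clipping, otherwise the clipping threshold would be applied to scaled values. The sketch below is a hedged example; max_norm=1.0 is an illustrative value, and model/optimizer are the placeholders from the loop above.

# Replaces the body of the `if (i + 1) % iters_to_accumulate == 0:` branch above.
scaler.unscale_(optimizer)            # gradients of the optimizer's params are now unscaled, in place
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clip the real (unscaled) gradients
scaler.step(optimizer)                # step() detects that unscale_ was already called and skips unscaling
scaler.update()
optimizer.zero_grad()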

When training involves multiple models, losses, and optimizers:

for epoch in epochs:
    for input, target in data:
        optimizer0.zero_grad()
        optimizer1.zero_grad()
        with autocast():
            output0 = model0(input)
            output1 = model1(input)
            loss0 = loss_fn(2 * output0 + 3 * output1, target)
            loss1 = loss_fn(3 * output0 - 5 * output1, target)

        # (retain_graph here is unrelated to amp, it's present because in this
        # example, both backward() calls share some sections of graph.)
        scaler.scale(loss0).backward(retain_graph=True)
        scaler.scale(loss1).backward()

        # You can choose which optimizers receive explicit unscaling, if you
        # want to inspect or modify the gradients of the params they own.
        scaler.unscale_(optimizer0)

        scaler.step(optimizer0)
        scaler.step(optimizer1)

        scaler.update()
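
One more practical note (a hedged sketch, not part of the original post): GradScaler keeps internal state (the current scale and its growth tracking) and provides state_dict()/load_state_dict(), so its state can be saved and restored together with the model and optimizer when checkpointing. The file name and dictionary keys below are illustrative.

checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "scaler": scaler.state_dict(),   # save the AMP scaler state as well
}
torch.save(checkpoint, "checkpoint.pth")

# Later, to resume training:
checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])
scaler.load_state_dict(checkpoint["scaler"])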