Speeding Up PyTorch Model Training with AMP (Automatic Mixed Precision)

This article shows how to use the torch.cuda.amp module for automatic mixed precision training, using torch.autocast and GradScaler to improve training efficiency. It covers importing the module, enabling mixed precision, and gradient accumulation, and it also applies to setups with multiple models, losses, and optimizers.

My own models take quite a long time to train, so I looked into this method to speed up training.

The torch module to use:

torch.cuda.amp

The official PyTorch documentation states that "automatic mixed precision training" ordinarily uses torch.autocast and torch.cuda.amp.GradScaler together:

Ordinarily, “automatic mixed precision training” uses torch.autocast and torch.cuda.amp.GradScaler together, as shown in the Automatic Mixed Precision examples and Automatic Mixed Precision recipe. However, torch.autocast and GradScaler are modular, and may be used separately if desired.
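To illustrate the "modular" point, autocast can also be used on its own when there is no backward pass, for example during inference or evaluation. The sketch below is not from the original post; Net and data are placeholder names reused from the training examples that follow.

# A minimal sketch of autocast without GradScaler (inference only, no backward pass).
import torch

model = Net().cuda()   # placeholder model, as in the training examples below
model.eval()

with torch.no_grad():
    for input, target in data:
        # Forward ops run in float16 where autocast considers it safe;
        # no gradient scaling is needed because gradients are never computed.
        with torch.cuda.amp.autocast():
            output = model(input)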

How to use it

  1. First, import the torch modules and create a GradScaler:

import torch
from torch.cuda.amp import autocast

scaler = torch.cuda.amp.GradScaler()

2. Use mixed precision to speed up training
The code before using AMP:

model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()
        output = model(input)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()

After adding AMP:

# Creates model and optimizer in default precision
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

# Creates a GradScaler once at the beginning of training.
scaler = torch.cuda.amp.GradScaler()

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()

        # Runs the forward pass with autocasting.
        with autocast():
            output = model(input)
            loss = loss_fn(output, target)

        # Scales loss.  Calls backward() on scaled loss to create scaled gradients.
        # Backward passes under autocast are not recommended.
        # Backward ops run in the same dtype autocast chose for corresponding forward ops.
        scaler.scale(loss).backward()

        # scaler.step() first unscales the gradients of the optimizer's assigned params.
        # If these gradients do not contain infs or NaNs, optimizer.step() is then called,
        # otherwise, optimizer.step() is skipped.
        scaler.step(optimizer)

        # Updates the scale for next iteration.
        scaler.update()
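
A convenient property of this setup (a hedged sketch, not from the original post): both autocast and GradScaler accept an enabled flag, so a single switch such as use_amp (a name chosen here for illustration) lets you run the same loop with or without mixed precision and compare speed and accuracy.

use_amp = True  # set to False to fall back to ordinary full-precision training

model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()
        with torch.cuda.amp.autocast(enabled=use_amp):
            output = model(input)
            loss = loss_fn(output, target)
        # With enabled=False the scaler calls become no-ops, equivalent to
        # plain loss.backward() and optimizer.step().
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()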

With gradient accumulation, it can be used like this:

for epoch in epochs:
    for i, (input, target) in enumerate(data):
        with autocast():
            output = model(input)
            loss = loss_fn(output, target)
            loss = loss / iters_to_accumulate

        # Accumulates scaled gradients.
        scaler.scale(loss).backward()

        if (i + 1) % iters_to_accumulate == 0:
            # may unscale_ here if desired (e.g., to allow clipping unscaled gradients)

            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
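
The comment above mentions unscale_. A common use is gradient clipping: gradients must be unscaled before clipping, otherwise the clipping threshold would be applied to scaled values. The sketch below is a hedged example; max_norm=1.0 is an illustrative value, and model/optimizer are the placeholders from the loop above.

# Replaces the body of the `if (i + 1) % iters_to_accumulate == 0:` branch above.
scaler.unscale_(optimizer)            # gradients of the optimizer's params are now unscaled, in place
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # clip the real (unscaled) gradients
scaler.step(optimizer)                # step() detects that unscale_ was already called and skips unscaling
scaler.update()
optimizer.zero_grad()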

When training involves multiple models, losses, and optimizers:

for epoch in epochs:
    for input, target in data:
        optimizer0.zero_grad()
        optimizer1.zero_grad()
        with autocast():
            output0 = model0(input)
            output1 = model1(input)
            loss0 = loss_fn(2 * output0 + 3 * output1, target)
            loss1 = loss_fn(3 * output0 - 5 * output1, target)

        # (retain_graph here is unrelated to amp, it's present because in this
        # example, both backward() calls share some sections of graph.)
        scaler.scale(loss0).backward(retain_graph=True)
        scaler.scale(loss1).backward()

        # You can choose which optimizers receive explicit unscaling, if you
        # want to inspect or modify the gradients of the params they own.
        scaler.unscale_(optimizer0)

        scaler.step(optimizer0)
        scaler.step(optimizer1)

        scaler.update()
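
One more practical note (a hedged sketch, not part of the original post): GradScaler keeps internal state (the current scale and its growth tracking) and provides state_dict()/load_state_dict(), so its state can be saved and restored together with the model and optimizer when checkpointing. The file name and dictionary keys below are illustrative.

checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "scaler": scaler.state_dict(),   # save the AMP scaler state as well
}
torch.save(checkpoint, "checkpoint.pth")

# Later, to resume training:
checkpoint = torch.load("checkpoint.pth")
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])
scaler.load_state_dict(checkpoint["scaler"])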