Accelerating Model Training in PyTorch with a GPU

Latest recommended article published on 2025-03-05 11:58:22

柏常青

Category: Pytorch. Tags: pytorch, deep learning, python

Article link: https://blog.csdn.net/beauthy/article/details/121116760

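Training deep neural networks is often slow, and the time goes mainly into two parts: data preparation and parameter iteration.
When data preparation is the bottleneck, prepare the data with multiple processes. When the iteration itself dominates the training time, use a GPU.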
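For the data-preparation case, one common option is `torch.utils.data.DataLoader` with `num_workers > 0`, which builds batches in worker processes. The sketch below is only an illustration: the helper name `build_loader`, the tensor sizes, and the worker count are assumptions, not part of the original post.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def build_loader(features, labels, batch_size=256, num_workers=0):
    """Wrap tensors in a DataLoader; num_workers > 0 loads batches in subprocesses."""
    dataset = TensorDataset(features, labels)
    return DataLoader(
        dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        # Pinned host memory can speed up later host-to-GPU copies.
        pin_memory=torch.cuda.is_available(),
    )


if __name__ == "__main__":
    # Illustrative random data; a real project would load its own dataset here.
    X = torch.rand(1024, 2)
    Y = torch.rand(1024, 1)
    loader = build_loader(X, Y, num_workers=2)  # two worker processes (assumed value)
    for batch_x, batch_y in loader:
        print(batch_x.shape, batch_y.shape)
        break
```

The `if __name__ == "__main__":` guard matters on platforms that start worker processes by re-importing the script. A suitable worker count depends on the machine and on how expensive each sample is to prepare.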

First, find out what GPUs are available:

Note: this assumes the environment is already set up.

1. Check the device's GPU information

import torch
from torch import nn
# Check GPU information
cudaMsg = torch.cuda.is_available()
gpuCount = torch.cuda.device_count()
print("1. GPU available: {}".format(cudaMsg), "number of GPUs: {}".format(gpuCount))

Output: [screenshot]
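Beyond the device count, the name and memory size of each GPU are often useful. A minimal sketch, assuming only the standard `torch.cuda` query functions; the helper name `describe_gpus` is hypothetical:

```python
import torch


def describe_gpus():
    """Return one (index, name, total memory in GiB) tuple per visible CUDA device."""
    infos = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        infos.append((i, props.name, props.total_memory / 1024 ** 3))
    return infos


gpu_infos = describe_gpus()
if not gpu_infos:
    print("No CUDA device found; computation will run on the CPU.")
for index, name, mem_gib in gpu_infos:
    print("cuda:{} {} {:.1f} GiB".format(index, name, mem_gib))
```

On a machine without a GPU the loop body never runs, so the script is safe to execute anywhere.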

2. Move tensors between the GPU and the CPU

test_tensor = torch.rand((100, 100))
tensor_gpu = test_tensor.to("cuda:0")  # equivalent: tensor_gpu = test_tensor.cuda()
tensor_cpu = tensor_gpu.to("cpu")  # equivalent: tensor_cpu = tensor_gpu.cpu()

print("test_tensor device: {}".format(test_tensor.device),
      "\ntensor_gpu device: {}".format(tensor_gpu.device),
      "\ntensor_cpu device: {}".format(tensor_cpu.device))

Output: [screenshot]
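The snippet above needs a GPU to run. A device-agnostic variant, sketched under the assumption that falling back to the CPU is acceptable, also shows that both operands of an operation must be on the same device:

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.rand(100, 100)
b = torch.rand(100, 100)

# Both operands must be on the same device before they are combined.
a_dev = a.to(device)
b_dev = b.to(device)
product_dev = a_dev @ b_dev

# Move the result back to the CPU, for example before converting it to NumPy.
product_cpu = product_dev.cpu()
print(product_dev.device, product_cpu.device)
```

Mixing a CPU tensor and a GPU tensor in one operation raises a runtime error, so choosing the device once and moving everything to it keeps the rest of the code unchanged.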

3. Accelerate a model with the GPU in PyTorch

Both the model and the data must be moved to the GPU.

# Define the model
model = nn.Linear(10, 1)
print("Model on GPU before moving?:", next(model.parameters()).is_cuda)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # use the GPU if one is available
model.to(device)  # move the model to the selected device
print("Model on GPU after moving?:", next(model.parameters()).is_cuda)
print("Model device:", next(model.parameters()).device)

Output: [screenshot]
Note: what is next(model.parameters())?

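model.parameters() returns an iterator over the model's parameters, and the built-in next() retrieves the next item from it. As the printed result shows, here it returns the first parameter of the fully connected layer, which is its weight:
weight: tensor([[ 0.2762, -0.1113, 0.2837, -0.2646, 0.1973, 0.1510, -0.2463, -0.1493, -0.0235, 0.1590]], device='cuda:0')
nn.Linear computes y = xA^T + b. The weight is A itself, a row vector of shape (1, 10); its transpose A^T is the column vector that multiplies the input.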
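To make the shapes concrete, the following sketch lists every parameter of an `nn.Linear(10, 1)` layer and reproduces its forward pass by hand. The random input and the seed are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(10, 1)

# List every parameter with its name and shape.
for name, param in model.named_parameters():
    print(name, tuple(param.shape))

# Reproduce the forward pass by hand: y = x @ weight.T + bias.
x = torch.rand(4, 10)
manual = x @ model.weight.t() + model.bias
assert torch.allclose(manual, model(x), atol=1e-6)
```

The loop prints `weight (1, 10)` and `bias (1,)`, which confirms that the stored weight has shape (out_features, in_features) and is transposed inside the forward pass.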

4. Create a model that supports data parallelism across multiple GPUs

model = nn.Linear(10, 1)
print("Model on GPU before wrapping?:", next(model.parameters()).is_cuda)
model = nn.DataParallel(model)
print("Device of the wrapped module:", next(model.module.parameters()).device)

To train on multiple GPUs, wrap the model as a data-parallel model.

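# Define the model (U2NET is a model class that is not defined in this post)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = U2NET(3, 1)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # wrap as a data-parallel model
model.to(device)  # move the model to the selected device
# Train the model
...
features = features.to(device)  # move the data to the selected device
labels = labels.to(device)
# or: labels = labels.cuda() if torch.cuda.is_available() else labels
...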
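Since the snippet above is only an outline, here is a self-contained sketch of the same pattern. `TinyNet` is a hypothetical stand-in for a real model such as U2NET, and the batch size is an assumption; the code also runs on a machine with one GPU or none, where the wrapper is simply skipped.

```python
import torch
from torch import nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


# TinyNet is a hypothetical stand-in for a real model such as U2NET.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)


model = TinyNet()
if torch.cuda.device_count() > 1:
    # Each forward pass splits the batch along dim 0 across the visible GPUs.
    model = nn.DataParallel(model)
model.to(device)

features = torch.rand(8, 10).to(device)
outputs = model(features)
print(outputs.shape)

# The wrapped model lives in .module; unwrap it before saving parameters.
core = model.module if isinstance(model, nn.DataParallel) else model
state = core.state_dict()
```

Saving the unwrapped `state_dict` keeps the parameter names free of the `module.` prefix, so the checkpoint can later be loaded into a plain, unwrapped model.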

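PyTorch project example: GPU acceleration

Project steps
1. Data preparation: build the sample features X from a uniform distribution and define a linear regression function to generate the labels Y.
2. Model definition: use a simple fully connected (linear) model.
3. Model construction, optimizer, loss function, and training function: move the model and the data to the GPU, set the optimizer and the loss function, and fit the relationship between the features (data) and the labels (label). This is the training process.
4. Testing and evaluation.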

import torch
from torch import nn

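# Prepare the data (synthetic data that follows a linear regression model)
sample_n = 1000000  # number of training samples: one million
sample_t = 1000  # number of test samples
# Linear regression: y = x @ A^T + b
X = 10 * torch.rand([sample_n, 2]) - 5.0  # one million samples with two features each, uniform on [-5, 5)
X_test = 10 * torch.rand([sample_t, 2]) - 5.0
w0 = torch.tensor([[2.0, -3.0]])
b0 = torch.tensor([[10.0]])
Y = X @ w0.t() + b0 + torch.normal(0.0, 2.0, size=[sample_n, 1])  # generate the labels from the chosen w and b
Y_t = X_test @ w0.t() + b0 + torch.normal(0.0, 2.0, size=[sample_t, 1])
# X @ w0.t() is a matrix multiplication
# torch.normal(0.0, 2.0, size=[sample_n, 1]) is the noise term
print("torch.cuda.is_available() = ", torch.cuda.is_available())
# Move the features and labels to the GPU
data = X.cuda()
label = Y.cuda()
# --- Check that the data has been moved to the GPU ---
# .cuda() returns a copy, so inspect data and label; X and Y themselves stay on the CPU
print("data.device:", data.device)
print("label.device:", label.device)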


# Define the model
class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.rand_like(w0))
        self.b = nn.Parameter(torch.rand_like(b0))

    # Forward pass
    def forward(self, x):
        return x @ self.w.t() + self.b


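# Training
def train(epochs):
    import time
    tic = time.time()
    # Build the model
    linear = LinearRegression()
    # Move the model to the GPU (falls back to the CPU if no GPU is available)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    linear.to(device)

    # Set up the optimizer and the loss function
    optimizer = torch.optim.Adam(linear.parameters(), lr=0.01)
    loss_func = nn.MSELoss()
    for epoch in range(epochs):
        optimizer.zero_grad()
        Y_pre = linear(data)
        loss = loss_func(Y_pre, label)
        loss.backward()
        optimizer.step()
        if epoch % 10 == 0:
            print({"epoch": epoch, "loss": loss.item()})
    # Save only the parameters (state_dict) rather than the pickled model object
    torch.save(linear.state_dict(), "./linear_parameter.pth")
    toc = time.time()
    print("time used:", toc - tic)


# Testing
data_t = X_test.to(data.device)  # same device as the training data
label_t = Y_t.to(data.device)


def test():
    import time
    tic = time.time()
    loss_func = nn.MSELoss()
    # Rebuild the model and load the saved parameters into it
    linear_t = LinearRegression().to(data_t.device)
    linear_t.load_state_dict(torch.load("./linear_parameter.pth", map_location=data_t.device))
    linear_t.eval()
    with torch.no_grad():  # gradients are not needed for evaluation
        Y_pre = linear_t(data_t)
        loss_t = loss_func(Y_pre, label_t)
    toc = time.time()
    print("time used:{};and the loss_test={}".format((toc - tic), loss_t))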


train(500)
test()

Training output:

......
{'epoch': 480, 'loss': 35.423038482666016}
{'epoch': 490, 'loss': 34.51709747314453}
time used: 1.6621572971343994
time used:0.0019948482513427734;and the loss_test=34.555931091308594
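When comparing CPU and GPU timings such as the ones above, keep in mind that CUDA operations are launched asynchronously, so a timer can stop before the GPU has finished its work. A minimal timing sketch that synchronizes first is shown below; the helper name `timed_matmul`, the matrix size, and the repeat count are illustrative assumptions, and no timings are claimed here.

```python
import time

import torch


def timed_matmul(device, size=1024, repeats=10):
    """Return the average seconds per matrix multiplication on the given device."""
    a = torch.rand(size, size, device=device)
    b = torch.rand(size, size, device=device)
    out = a @ b  # warm-up run, excluded from the measurement
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before starting the clock
    tic = time.perf_counter()
    for _ in range(repeats):
        out = a @ b
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    return (time.perf_counter() - tic) / repeats


print("cpu :", timed_matmul(torch.device("cpu")))
if torch.cuda.is_available():
    print("cuda:", timed_matmul(torch.device("cuda:0")))
```

In the training function above, `loss.item()` and `torch.save` copy data back to the CPU and therefore force a synchronization, which is why its wall-clock measurement still covers the GPU work.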