![12ad3fb45f37d5acce12fb7b95a64872.png](https://i-blog.csdnimg.cn/blog_migrate/f8b3097b67873c31b28e2ae511d88d7e.jpeg)
Server configuration
Server IP: 10.1.12.179
OS: CentOS 7
GPU driver version: 440.33.01
GPU models: Tesla P4 (8 GB) and Tesla T4 (16 GB), as shown below:
![ba37ca11516464a24797e67adc4fb5f4.png](https://i-blog.csdnimg.cn/blog_migrate/98b21c64854ffa1a85abe565e2923725.jpeg)
Test environment
nvidia-docker container: Ubuntu 18.04 + CUDA 10.2 + cuDNN 7, PyTorch 1.2.0
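A container like the one above can be launched with the NVIDIA container runtime. This is only a sketch: the image tag below is an assumption (any Ubuntu 18.04 image with CUDA 10.2 and cuDNN 7 would do), and PyTorch 1.2.0 would still need to be installed inside it.

```shell
# Launch an interactive CUDA 10.2 + cuDNN 7 container on Ubuntu 18.04.
# --runtime=nvidia exposes the host GPUs (driver 440.33.01 supports CUDA 10.2);
# the image tag is an assumption, not taken from the original post.
docker run --runtime=nvidia -it --rm \
    --shm-size=8g \
    nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04 bash
```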
GPU functional test
Model training was run on a single GPU and on multiple GPUs, and the test code passed in both cases. The output is shown below:
![de5ff1ee24a4b9517718bb8acc5fdfb1.png](https://i-blog.csdnimg.cn/blog_migrate/f94e42d5a8459164020bea3077d270c2.png)
Single-GPU test: passed
![45126977075896d31b7a9e431d4f7a64.png](https://i-blog.csdnimg.cn/blog_migrate/e1a915283325e893a143612470abf738.jpeg)
Multi-GPU test: passed
![00beea2b64175f6d046a9e4b7f2c3bcd.png](https://i-blog.csdnimg.cn/blog_migrate/c5e1f3117f2da3e7a0ffb51b866bb4f0.jpeg)
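One common way to run the same script in single- and multi-GPU mode is to restrict which devices CUDA can see; this is an assumption, since the post does not say how the two runs were selected, and `test_gpu.py` is a hypothetical filename for the test code listed below.

```shell
# Single-GPU run: only device 0 (e.g. the Tesla P4) is visible to CUDA
CUDA_VISIBLE_DEVICES=0 python test_gpu.py

# Multi-GPU run: both cards are visible, so DataParallel will use them
CUDA_VISIBLE_DEVICES=0,1 python test_gpu.py
```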
GPU performance test
![608fbf838f546ae8035f8ecc9bfac426.png](https://i-blog.csdnimg.cn/blog_migrate/f32190fb7cbe50d89aac4f925368db44.png)
Test summary
1. Multi-GPU training works even across different GPU models.
2. Multiple GPUs significantly increase the maximum feasible batch_size, but show no clear improvement in per-card memory usage or computation time, possibly because the two cards are different models.
3. Identical GPU models are preferable: they make it easier to diagnose training failures and code errors, and avoid problems caused purely by model differences.
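The mixed-model caveat above can be checked programmatically: `DataParallel` splits each batch evenly across all visible devices, so with a P4 and a T4 the smaller, slower card becomes the bottleneck. A minimal sketch that enumerates the visible devices (the helper name is mine; it returns an empty list on a machine without CUDA):

```python
import torch


def list_cuda_devices():
    """Return (name, total memory in GiB) for every visible CUDA device.

    On the server described above this would report a Tesla P4 (~8 GB)
    and a Tesla T4 (~16 GB); without CUDA it returns an empty list.
    """
    infos = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        infos.append((props.name, props.total_memory / 1024 ** 3))
    return infos


if __name__ == "__main__":
    devices = list_cuda_devices()
    if len({name for name, _ in devices}) > 1:
        # DataParallel divides each batch evenly, so mixed models waste
        # capacity on the larger card and wait on the slower one.
        print("Warning: mixed GPU models detected.")
    print(devices)
```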
Test code
```python
# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
from tqdm import tqdm


class Model(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1):
        super(Model, self).__init__()
        self.conv = torch.nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride))

    def forward(self, x):
        b = self.conv(x)
        loss = torch.sum(b)
        loss = loss * loss
        return loss, x


if __name__ == '__main__':
    # A large input to keep the GPUs busy: batch of 8, 3 channels, 4000x2000
    a = torch.randn([8, 3, 4000, 2000])
    model = Model(3, 100, 1, 1)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    if torch.cuda.is_available():
        print("using cuda")
        model = model.cuda(device)
        a = a.cuda()

    print(torch.cuda.device_count() > 1)
    if torch.cuda.device_count() > 1:
        # DataParallel splits each batch evenly across all visible GPUs
        model = torch.nn.DataParallel(model)

    optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
    for i in tqdm(range(100000000)):
        loss, _ = model(a)
        optimizer.zero_grad()
        # loss is a scalar on one GPU, or one value per GPU under DataParallel
        loss.backward(torch.ones_like(loss))
        optimizer.step()
        if i % 100 == 0:
            # The original snippet is truncated here; printing the loss
            # is a plausible completion.
            print(loss.detach().sum().item())
```