1. 检查PyTorch版本
torch.__version__ # PyTorch version
torch.version.cuda # Corresponding CUDA version
torch.backends.cudnn.version() # Corresponding cuDNN version
torch.cuda.get_device_name(0) # GPU type
2. 更新PyTorch
PyTorch将被安装在anaconda3/lib/python3.7/site-packages/torch/目录下。
conda update pytorch torchvision -c pytorch
3. 固定随机种子
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
4. 在命令行指定环境变量
CUDA_VISIBLE_DEVICES=0,1 python train.py
或在代码中指定
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
5. 判断是否有CUDA支持
torch.cuda.is_available()
6. 设置为cuDNN benchmark模式
Benchmark模式会提升计算速度,但是由于计算中有随机性,每次网络前馈结果略有差异。
torch.backends.cudnn.benchmark = True
如果想要避免这种结果波动,设置
torch.backends.cudnn.deterministic = True
7. 张量(tensor)基本信息
tensor.type() # Data type
tensor.size() # Shape of the tensor. It is a subclass of Python tuple
tensor.dim() # Number of dimensions.
8. 数据类型转换
# Set default tensor type. Float in PyTorch is much faster than double.
torch.set_default_tensor_type(torch.FloatTensor)
# Type convertions.
tensor = tensor.cuda()
tensor = tensor.cpu()
tensor = tensor.float()
tensor = tensor.long()
9. torch.Tensor与np.ndarray转换
# torch.Tensor -> np.ndarray.
ndarray = tensor.cpu().numpy()
# np.ndarray -> torch.Tensor.
tensor = torch.from_numpy(ndarray).float()
tensor = torch.from_numpy(ndarray.copy()).float() # If ndarray has neg
10. torch.Tensor与PIL.Image转换
PyTorch中的张量默认采用N×D×H×W的顺序,并且数据范围在[0, 1],需要进行转置和规范化。
# torch.Tensor -> PIL.Image.
image = PIL.Image.fromarray(torch.clamp(tensor * 255, min=0, max=255
).byte().permute(1, 2, 0).cpu().numpy())
image = torchvision.transforms.functional.to_pil_image(tensor) # Equivalently way
# PIL.Image -> torch.Tensor.
tensor = torch.from_numpy(np.asarray(PIL.Image.open(path))
).permute(2, 0, 1).float() / 255
tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path)) # Equivalently way
11. np.ndarray与PIL.Image转换
# np.ndarray -> PIL.Image.
image = PIL.Image.fromarray(ndarray.astypde(np.uint8))
# PIL.Image -> np.ndarray.
ndarray = np.asarray(PIL.Image.open(path))
12. 从只包含一个元素的张量中提取值
这在训练时统计loss的变化过程中特别有用。否则这将累积计算图,使GPU存储占用量越来越大。
value = tensor.item()
13. 拼接张量
注意torch.cat和torch.stack的区别在于torch.cat沿着给定的维度拼接,而torch.stack会新增一维。例如当参数是3个10×5的张量,torch.cat的结果是30×5的张量,而torch.stack的结果是3×10×5的张量。
tensor = torch.cat(list_of_tensors, dim=0)
tensor = torch.stack(list_of_tensors, dim=0)
14. 标签平滑(label smoothing)***
for images, labels in train_loader:
images, labels = images.cuda(), labels.cuda()
N = labels.size(0)
# C is the number of classes.
smoothed_labels = torch.full(size=(N, C), fill_value=0.1 / (C - 1)).cuda()
smoothed_labels.scatter_(dim=1, index=torch.unsqueeze(labels, dim=1), value=0.9)
score = model(images)
log_prob = torch.nn.functional.log_softmax(score, dim=1)
loss = -torch.sum(log_prob * smoothed_labels) / N
optimizer.zero_grad()
loss.backward()
optimizer.step()
参考链接https://zhuanlan.zhihu.com/p/59205847