1. TensorBoard only displays a subset of the logged steps
[Solution] To stay fast, TensorBoard downsamples the logged data by default and only displays a sample of each series. Just add the --samples_per_plugin argument when launching TensorBoard.
Example:
tensorboard --logdir ./train2-weight[61336] --bind_all --samples_per_plugin=images=1000000000000000
To explain: --samples_per_plugin=images=N tells the images plugin to keep up to N samples per tag instead of downsampling them, so just pick N at least as large as the number of images you logged (any sufficiently large value works).
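For context, here is a minimal logging sketch with torch.utils.tensorboard that produces the kind of image series this flag controls (the log_dir, tag, tensor shape, and step count are all illustrative):

import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='./train2-weight')
for step in range(5000):
    img = torch.rand(3, 64, 64)  # placeholder CHW image standing in for a real prediction
    writer.add_image('val/prediction', img, global_step=step)  # one image per step, one tag
writer.close()

Without --samples_per_plugin, TensorBoard keeps only a small reservoir sample of those 5000 images per tag. Per tensorboard --help, a value of 0 (i.e. --samples_per_plugin=images=0) should keep all samples, which avoids hard-coding a huge number.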
2. A drawback of transforms.Resize() in torchvision
Straight to the source code (the torchvision docstring, excerpted):
class Resize(torch.nn.Module):
    """Resize the input image to the given size.
    If the image is torch Tensor, it is expected
    to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions

    Args:
        size (sequence or int): Desired output size. If size is a sequence like
            (h, w), output size will be matched to this. If size is an int,
            smaller edge of the image will be matched to this number.
            i.e, if height > width, then image will be rescaled to
            (size * height / width, size).
            In torchscript mode size as single int is not supported, use a sequence of length 1: ``[size, ]``.
        interpolation (InterpolationMode): Desired interpolation enum defined by
            :class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.BILINEAR``.
            If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` and
            ``InterpolationMode.BICUBIC`` are supported.
            For backward compatibility integer values (e.g. ``PIL.Image.NEAREST``) are still acceptable.
    """
    ...
As the docstring shows, Resize always interpolates when it rescales, bilinearly by default. For most pic2pic low-level tasks this has little effect on the result, but for segmentation, where the labels are strictly 0 or 1, it does matter: bilinear interpolation produces intermediate values along object boundaries, so masks should be resized with InterpolationMode.NEAREST instead.
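A minimal sketch of the effect on a toy binary mask (the sizes and values are illustrative):

import torch
from torchvision import transforms
from torchvision.transforms import InterpolationMode

mask = torch.zeros(1, 4, 4)  # toy binary segmentation mask, [C, H, W]
mask[:, 1:3, 1:3] = 1.0

up_bilinear = transforms.Resize((8, 8), interpolation=InterpolationMode.BILINEAR)
up_nearest = transforms.Resize((8, 8), interpolation=InterpolationMode.NEAREST)

print(up_bilinear(mask).unique())  # intermediate values between 0 and 1 appear at the boundary
print(up_nearest(mask).unique())   # tensor([0., 1.]) -- the mask stays binary

So if a pipeline resizes images and masks with the same default Resize, the masks silently stop being binary.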
3. A model trained with DataParallel on multiple GPUs gets worse metrics when evaluated on a single GPU
Problem description: training used the following multi-GPU code:
## Define the network and the multi-GPU setup
import os
import torch
import torch.nn as nn

os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3,4'  # set before CUDA is initialized
device = torch.device('cuda')
device_ids = [0, 1, 2, 3, 4]
net = Model()  # Model, CE (the loss), dataloader and optimizer are assumed defined elsewhere
net = nn.DataParallel(net, device_ids=device_ids)
net.to(device)
## Training
for data in dataloader:
    out = net(data['input'].to(device))
    loss = CE(out, data['label'].to(device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
## Save the weights; note that DataParallel prefixes every key with 'module.'
torch.save(net.state_dict(), 'ckpt.pth')
The checkpoint was then loaded for testing with single-GPU logic:
## Load the model
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
device = torch.device('cuda')
net = Model()
net.load_state_dict(torch.load('ckpt.pth', map_location=device))
net.to(device)
## Inference
net.eval()
for data in test_loader:
    out = net(data['input'].to(device))
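Printing the checkpoint keys next to the bare model's keys makes the mismatch visible (a minimal sketch; the layer names in the comments are hypothetical):

state = torch.load('ckpt.pth', map_location='cpu')
print(list(state.keys())[:3])                 # e.g. ['module.conv1.weight', ...]
print(list(Model().state_dict().keys())[:3])  # e.g. ['conv1.weight', ...] -- no 'module.' prefix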
Loading a DataParallel checkpoint into a bare model this way is wrong: DataParallel wraps the model in a .module attribute, so every key in the saved state_dict carries a 'module.' prefix that the bare model's keys lack. Depending on how strict the load is, this either raises a key-mismatch error or silently loads nothing and leaves the network at its random initialization, which is exactly why the metrics drop. The fix is that a model trained with DataParallel must also be wrapped in DataParallel for inference, whether it runs on one GPU or several:
## Define the network and wrap it in DataParallel, even for single-GPU inference
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
device = torch.device('cuda')
device_ids = [0]
net = Model()
net = nn.DataParallel(net, device_ids=device_ids)  # keys now expect the 'module.' prefix
net.load_state_dict(torch.load('ckpt.pth', map_location=device))
net.to(device)
## Inference
net.eval()
for data in test_loader:
    out = net(data['input'].to(device))
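An equivalent alternative, for reference, is to strip the 'module.' prefix when loading (or to save net.module.state_dict() in the first place), so the checkpoint fits a bare model; a minimal sketch:

## Alternative: strip the 'module.' prefix so a bare model can load the weights
state = torch.load('ckpt.pth', map_location=device)
state = {(k[len('module.'):] if k.startswith('module.') else k): v for k, v in state.items()}
net = Model()
net.load_state_dict(state)
net.to(device).eval()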