Debugging issues
1. Multithreading (DataLoader worker) errors
1.1 Error 1
AttributeError: Can't pickle local object 'get_data.<locals>.dl_train.<locals>.<lambda>'
1.2 Error 2
RuntimeError: DataLoader worker (pid(s) 11343, 11344) exited unexpectedly
1.3 Error 3
BrokenPipeError: [Errno 32] Broken pipe
So far, the multithreading errors I have run into fall into these three categories.
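Error 1 can be reproduced without PyTorch. When worker processes are started with the spawn method (the default on Windows and macOS), the dataset and its transforms are sent to them through pickle, and pickle cannot serialize a lambda or a function defined inside another function. The following is a minimal sketch using only the standard library; `scale`, `make_local_transform`, `make_picklable_transform` and `can_pickle` are hypothetical names, not taken from the original code.

```python
import pickle
from functools import partial


def scale(x, factor):
    """Module-level function: picklable because it can be looked up by name."""
    return x * factor


def make_local_transform():
    # Mirrors the failing pattern: a lambda defined inside a function.
    return lambda x: x * 2


def make_picklable_transform():
    # functools.partial over a module-level function avoids the local lambda.
    return partial(scale, factor=2)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print("local lambda picklable:", can_pickle(make_local_transform()))
    print("partial picklable:", can_pickle(make_picklable_transform()))
```

Replacing the lambda with a module-level function (or a `partial` of one) is usually enough for error 1. Errors 2 and 3 often only report that a worker died; on Windows, a missing `if __name__ == "__main__":` guard around the training code is one common cause, though that is an assumption about this setup rather than something the messages confirm.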
2. GPU out-of-memory errors
Example error
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 11.00 GiB total capacity; 10.01 GiB already allocated; 0 bytes free; 10.15 GiB reserved in total by PyTorch)
2. File path errors
2.1 Unknown file extension
ValueError: unknown file extension:
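This message matches the one Pillow's `Image.save` raises when it cannot infer an image format from the file name, although the notes do not say which library produced it. Under that assumption, a small standard-library check can make sure an output path carries an extension before it is passed to a save call. `ensure_extension` and the `.png` default are hypothetical choices for illustration.

```python
from pathlib import Path


def ensure_extension(path, default=".png"):
    """Return the path unchanged if it has a suffix, else append the default one."""
    p = Path(path)
    if p.suffix:
        return p
    return p.with_name(p.name + default)


if __name__ == "__main__":
    print(ensure_extension("out/result"))
    print(ensure_extension("out/result.jpg"))
```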
3. Tensor-NumPy errors
3.1 Tensor error
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
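The message itself names the fix: copy the tensor to host memory before converting it. A minimal sketch, assuming a tensor `t` that may live on the GPU (the name and values are illustrative); it also runs on a CPU-only machine, where `.cpu()` simply returns the tensor as it is.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t = torch.arange(6, dtype=torch.float32, device=device).reshape(2, 3)

# detach() drops the autograd graph, cpu() copies the data to host memory
# (no copy is made if the tensor is already on the CPU), and numpy() then works.
arr = t.detach().cpu().numpy()
print(type(arr), arr.shape)
```

Calling `.detach()` first also avoids the related error raised when `.numpy()` is called on a tensor that requires gradients.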
4. Channel count errors
RuntimeError: Given groups=1, weight of size [64, 3, 3, 3], expected input[8, 6, 512, 512] to have 3 channels, but got 6 channels instead
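Problem analysis: the weight of size [64, 3, 3, 3] belongs to a convolution layer that expects 3 input channels, but the input batch of shape [8, 6, 512, 512] has 6 channels.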
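The mismatch can be reproduced and resolved in a few lines. This is a sketch under assumptions: the layer sizes mirror the error message, the spatial size is reduced to keep it fast, and widening the first convolution is only one possible fix (the other is to feed 3-channel data).

```python
import torch
import torch.nn as nn

# 6-channel input; spatial size reduced from 512 to 32 to keep the example light.
x = torch.randn(2, 6, 32, 32)

# A layer built for RGB input has a weight of shape [64, 3, 3, 3].
conv_rgb = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3)
try:
    conv_rgb(x)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# One fix: let the first layer accept as many channels as the data provides.
conv_6ch = nn.Conv2d(in_channels=x.shape[1], out_channels=64, kernel_size=3)
out = conv_6ch(x)
print(mismatch_raised, tuple(out.shape))
```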
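5. Channel count error (BatchNorm)
5.1 Problem description
RuntimeError: running_mean should contain *** elements not ***
5.2 Solution
We can look at the previous line of the traceback: File "python/anaconda/anaconda3/envs/conda-general/lib/python3.7/site-packages/torch/nn/functional.py", line 1670, in batch_norm
The detail worth noticing is batch_norm. Our model does use BN layers, so the BN configuration is the likely culprit. Typically the num_features of a BN layer does not match the number of channels output by the layer before it.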
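A small reproduction shows how the BN setting causes the error. This is a sketch with made-up sizes: `features` stands in for the output of a hypothetical 64-channel convolution, and the BN layer is deliberately created with the wrong `num_features`.

```python
import torch
import torch.nn as nn

# Stand-in for the output of a hypothetical 64-channel convolution layer.
features = torch.randn(4, 64, 8, 8)

# BN configured for 32 channels cannot normalize a 64-channel input.
bn_wrong = nn.BatchNorm2d(num_features=32)
try:
    bn_wrong(features)
    bn_error = None
except RuntimeError as exc:
    bn_error = str(exc)

# Fix: num_features must equal the channel dimension of the incoming tensor.
bn_ok = nn.BatchNorm2d(num_features=features.shape[1])
out = bn_ok(features)
print(bn_error)
print(tuple(out.shape))
```

When the model is assembled from a config, it helps to derive `num_features` from the `out_channels` of the preceding convolution instead of typing the number twice.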
6. Multithreading issues
6.1 Problem description
TypeError: 'NoneType' object is not subscriptable
6.2 Solution
(1) Set num_workers to 0.
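With `num_workers=0` the batches are loaded in the main process, so an exception raised inside the dataset shows up with a direct traceback pointing at the failing line. The sketch below only shows where the setting goes; `SquaresDataset` is a hypothetical toy dataset, not the one from these notes.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset returning (x, x**2) pairs."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        x = torch.tensor(float(idx))
        return x, x ** 2


# num_workers=0 disables worker processes: data is loaded in the main process.
loader = DataLoader(SquaresDataset(), batch_size=4, shuffle=False, num_workers=0)
batches = [(x, y) for x, y in loader]
print(len(batches))
```

If the error persists with `num_workers=0`, the traceback should now show which value in `__getitem__` is `None`; a file that failed to load is a frequent cause, though the notes do not confirm that for this case.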