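Various errors encountered in the code
Continuing from the previous post, my notes on learning the PyTorch version of YOLOv3 and training it on my own data!
Problems I ran into along the way (I downloaded quite a few versions of the project...)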
- Errors in the .cfg-file version
- Errors in the .yaml-file version
- 1. yaml file error: AttributeError: 'str' object has no attribute 'get'
- 2. UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position - : illegal multibyte sequence
- 3. TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
- 4. RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
- 5. detect.py finds no objects at all in the images, even with the original yolov3.pt weights
- Recorded on 2021/3/24
- 6. OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
- Recorded on 2021/4/1
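Errors in the .cfg-file version
1. OSError: The paging file is too small for this operation to complete; BrokenPipeError; Error loading caffe2_detectron_ops_gpu.dll
OSError: The paging file is too small for this operation to complete.
BrokenPipeError: [Errno 32] Broken pipe
Error loading "D:\Anaconda3\envs\py36\lib\site-packages\torch\lib\caffe2_detectron_ops_gpu.dll" or one of its dependencies.
Fix: set num_workers to 0.
Change it where train.py takes its arguments; if there is no such argument, change it where the dataloader is created earlier in the code.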
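A minimal sketch of the num_workers change, assuming the script exposes it as a command-line option. The `--workers` and `--batch-size` flag names and the `build_parser` / `loader_kwargs` helpers are hypothetical; your copy of train.py may name them differently or hard-code the value where the dataloader is built.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the argument section of train.py; the real
    # script may use other option names or not expose this setting at all.
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-size", type=int, default=16)
    # 0 means batches are loaded in the main process, with no worker subprocesses.
    parser.add_argument("--workers", type=int, default=0)
    return parser


def loader_kwargs(opt):
    # Keyword arguments intended for torch.utils.data.DataLoader(dataset, **kwargs).
    return {"batch_size": opt.batch_size, "num_workers": opt.workers, "shuffle": True}


opt = build_parser().parse_args([])
print(loader_kwargs(opt))
```

If the script has no such option, the same idea applies directly at the DataLoader call: pass num_workers=0 there.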
2. RuntimeError: CUDA out of memory.
For example: RuntimeError: CUDA out of memory. Tried to allocate 1.04 GiB (GPU 0; 4.00 GiB total capacity; 86.63 MiB already allocated; 2.52 GiB free; 94.00 MiB reserved in total by PyTorch)
There is not enough GPU memory: reduce the training batch-size, close some other processes, or restart the computer.
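Lowering the batch size by hand can also be automated. The sketch below is only an illustration, not something from the project: `largest_batch_that_fits` and `fake_step` are hypothetical names, and `fake_step` stands in for one real training step, pretending that only 4 images fit in memory.

```python
def largest_batch_that_fits(run_step, start=16, minimum=1):
    """Halve the batch size until run_step(batch_size) stops raising an OOM error."""
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise  # unrelated error, do not hide it
            batch_size //= 2
    raise RuntimeError("even the minimum batch size ran out of memory")


def fake_step(batch_size):
    # Hypothetical stand-in for one training step; pretends only 4 images fit.
    if batch_size > 4:
        raise RuntimeError("CUDA out of memory.")


print(largest_batch_that_fits(fake_step, start=16))
```

With a real training step in place of `fake_step`, the returned value would be the batch size to pass to train.py.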
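3. Still unsolved: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Training on the CPU works, but passing --device 0 raises this error. I searched all over and could not fix it T T. Luckily the yaml version works for me.
Leaving this here for now.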