1. No labels found in E:\datasets\aerial_crack_labels\datasets_strange\score\train\labels\. See https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data
Cause: the labels path is derived from the path of the images folder, and that lookup failed.
Solution:
Define the function img2label_paths in the datasets.py script:
#--------------------------------------------------------------- function definition -----------------------------------------
import os  # needed for os.sep (skip if datasets.py already imports os)

def img2label_paths(img_paths):
    # Define label paths as a function of image paths
    sa, sb = os.sep + 'images' + os.sep, os.sep + 'labels' + os.sep  # /images/, /labels/ substrings
    return [sb.join(x.rsplit(sa, 1)).rsplit('.', 1)[0] + '.txt' for x in img_paths]
#------------------------------------------------------------------------------------------------------------------------
Comment out the original self.labels_files assignment and reassign self.labels_files from img2label_paths.
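A minimal sketch of that rewrite. The class name TinyDataset and the attribute img_files are hypothetical stand-ins, since the real dataset class is not shown here; labels_files follows the name used above. The example path is also made up.

```python
import os


def img2label_paths(img_paths):
    # Swap the last "/images/" path component for "/labels/" and the extension for ".txt".
    sa, sb = os.sep + 'images' + os.sep, os.sep + 'labels' + os.sep
    return [sb.join(x.rsplit(sa, 1)).rsplit('.', 1)[0] + '.txt' for x in img_paths]


class TinyDataset:
    # Hypothetical stand-in for the dataset class in datasets.py.
    def __init__(self, img_files):
        self.img_files = list(img_files)
        # self.labels_files = ...  # original assignment, commented out
        self.labels_files = img2label_paths(self.img_files)


# Build the example path with os.sep so the sketch behaves the same on Windows and Linux.
example_image = os.sep.join(['data', 'train', 'images', 'crack_001.jpg'])
dataset = TinyDataset([example_image])
print(dataset.labels_files[0])
```

Note that the function only replaces the last images component of each path, so the images and labels folders are expected to sit side by side.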
Re-run train.py and training proceeds normally.
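If the message still appears, it can help to check which images have no label file at the derived location. The helper below is a rough sketch for that check; find_missing_labels is a hypothetical name, not part of YOLOv5, and it assumes the same images/labels folder convention as img2label_paths.

```python
import os


def find_missing_labels(image_dir):
    # Return image paths under image_dir whose derived label file does not exist.
    sa, sb = os.sep + 'images' + os.sep, os.sep + 'labels' + os.sep
    missing = []
    for name in sorted(os.listdir(image_dir)):
        img_path = os.path.join(image_dir, name)
        # Same derivation as img2label_paths, applied to a single path.
        label_path = sb.join(img_path.rsplit(sa, 1)).rsplit('.', 1)[0] + '.txt'
        if not os.path.isfile(label_path):
            missing.append(img_path)
    return missing
```

Passing the images folder of the training set returns the images for which no .txt label was found; an empty list means every image has a label at the expected path.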
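2. Do not know how to handle these types to promote: {'DoubleTensor', 'FloatTensor'}
This error is raised every time the first epoch finishes.
It occurs in early versions of YOLOv5, and there are two ways to fix it: switch to a newer version, or modify the source code.
The error is located in the torch_utils.py script; the offending source is: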
def update(self, model):
    self.updates += 1
    d = self.decay(self.updates)
    with torch.no_grad():
        if type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel):
            msd, esd = model.module.state_dict(), self.ema.module.state_dict()
        else:
            msd, esd = model.state_dict(), self.ema.state_dict()
        for k, v in esd.items():
            if v.dtype.is_floating_point:
                v *= d
                v += (1. - d) * msd[k].detach()
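In equation form, the loop above applies an exponential moving average to every floating-point entry of the state dict. The symbols are introduced here only for illustration: $\theta^{\text{ema}}_k$ stands for the EMA entry `v`, $\theta_k$ for `msd[k]`, and $d$ for the value returned by `self.decay(self.updates)`.

```latex
\theta^{\text{ema}}_k \leftarrow d\,\theta^{\text{ema}}_k + (1 - d)\,\theta_k
```

Both steps in the code (`v *= d`, then `v += ...`) are in-place, which is what writes the result back into the EMA model.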
Add an if check inside the for loop; the modified source is:
def update(self, model):
    self.updates += 1
    d = self.decay(self.updates)
    with torch.no_grad():
        if type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel):
            msd, esd = model.module.state_dict(), self.ema.module.state_dict()
        else:
            msd, esd = model.state_dict(), self.ema.state_dict()
        for k, v in esd.items():
            if v.dtype.is_floating_point:
                # ------------------------ newly added code ------------------------
                if v.dtype != msd[k].dtype:
                    v = v.to(msd[k].dtype)
                # ------------------------ end of added code -----------------------
                v *= d
                v += (1. - d) * msd[k].detach()
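The code now runs!
However, mAP and the other metrics no longer update. The likely cause is the added if block: v.to(...) returns a new tensor when the dtypes differ, so the in-place updates that follow no longer touch the EMA weights. Solution:
Add the following to that script: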
def is_parallel(model):
    # Returns True if model is of type DP or DDP
    return type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)

def de_parallel(model):
    # De-parallelize a model: returns single-GPU model if model is of type DP or DDP
    return model.module if is_parallel(model) else model
Then rewrite the update function:
def update(self, model):
    # Update EMA parameters
    with torch.no_grad():
        self.updates += 1
        d = self.decay(self.updates)
        msd = de_parallel(model).state_dict()  # model state_dict
        for k, v in self.ema.state_dict().items():
            if v.dtype.is_floating_point:
                v *= d
                v += (1 - d) * msd[k].detach()
After that it runs perfectly. (Sigh, this bug held me up for many days.)
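Why the intermediate patch left the metrics frozen can be illustrated without PyTorch. The sketch below is a plain-Python analogy built on the standard-library array module, with typecode 'f' standing in for FloatTensor and 'd' for DoubleTensor; both function names are hypothetical. It assumes the cause was the rebinding `v = v.to(...)`, which in PyTorch returns a new tensor whenever the dtype actually changes.

```python
from array import array


def update_with_rebinding(ema, model, decay):
    # Mirrors the intermediate patch: converting the type creates a NEW buffer.
    for k, v in ema.items():
        if v.typecode != model[k].typecode:
            v = array(model[k].typecode, list(v))  # v no longer refers to ema[k]
        for i in range(len(v)):
            v[i] = decay * v[i] + (1 - decay) * model[k][i]


def update_in_place(ema, model, decay):
    # Mirrors an in-place update: every write goes to the buffer stored in ema.
    for k, v in ema.items():
        for i in range(len(v)):
            v[i] = decay * v[i] + (1 - decay) * model[k][i]


model_state = {'w': array('d', [3.0, 4.0])}  # "double" model weights

ema_a = {'w': array('f', [1.0, 2.0])}        # "float" EMA copy
update_with_rebinding(ema_a, model_state, 0.5)

ema_b = {'w': array('f', [1.0, 2.0])}
update_in_place(ema_b, model_state, 0.5)

print(list(ema_a['w']))  # [1.0, 2.0]: the stored EMA values never changed
print(list(ema_b['w']))  # [2.0, 3.0]: moved halfway toward the model weights
```

In the rebinding version the arithmetic is done correctly, but on a temporary copy that is discarded at the end of each iteration, so the stored EMA values stay at their initial state.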
3. TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Traceback (most recent call last):
  File "train.py", line 776, in <module>
    train(hyp)
  File "train.py", line 438, in train
    results, maps, times = test.test(opt.data,
  File "E:\pruning-yolov5\mobile-yolov5-pruning-distillation-master\test.py", line 219, in test
    output_to_target(
  File "E:\pruning-yolov5\mobile-yolov5-pruning-distillation-master\utils\utils.py", line 979, in output_to_target
    return np.array(targets)
  File "D:\anaconda\conda\envs\pytorch1.7\lib\site-packages\torch\tensor.py", line 621, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
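Following the hint in the message, first move the tensor data to the CPU:
In torch\tensor.py (the last frame of the traceback), change self.numpy() to self.cpu().numpy().
That resolves the error.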
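Editing torch\tensor.py changes the installed PyTorch package itself. A less invasive variant of the same idea is to call .cpu() on the tensors before they reach np.array, for example inside output_to_target. The sketch below is only an illustration of that pattern: to_host is a hypothetical helper, and FakeCudaTensor is a made-up stand-in so the example runs without PyTorch or a GPU. The only real API it relies on is that torch.Tensor provides a .cpu() method.

```python
def to_host(value):
    # Call .cpu() when the object offers it (torch.Tensor does); pass others through.
    cpu = getattr(value, 'cpu', None)
    return cpu() if callable(cpu) else value


class FakeCudaTensor:
    # Hypothetical stand-in for a tensor living on cuda:0.
    def __init__(self, data, device='cuda:0'):
        self.data, self.device = data, device

    def cpu(self):
        # Like Tensor.cpu(), return a copy that lives in host memory.
        return FakeCudaTensor(self.data, device='cpu')


targets = [FakeCudaTensor([0.1, 0.2]), 3, 0.5]
host_targets = [to_host(t) for t in targets]
print([getattr(t, 'device', 'n/a') for t in host_targets])  # ['cpu', 'n/a', 'n/a']
```

With real tensors, the helper would be applied to the entries of targets before np.array(targets) is called; whether that alone is enough depends on how targets is built in utils.py, which is not shown here.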