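1. UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 176: illegal multibyte sequence
This error occurs because, when origincar.yaml is read for the yaml module, its contents are decoded with the system default encoding (GBK in this case), and the file contains byte sequences that are not valid in that encoding. One way to solve this is to explicitly specify the file encoding as utf-8 when opening the file.
A simpler workaround is to remove all comments from the yaml file and run the script again; after that the error no longer appears.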
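As a minimal sketch of the explicit-encoding approach, the snippet below reads a YAML-like text file as UTF-8 regardless of the platform default. The helper name `read_text` and the sample file contents are assumptions made for illustration; in train.py the equivalent change would be passing `encoding="utf-8"` to the `open()` call that reads the data file.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the platform default."""
    with open(path, encoding=encoding) as f:
        return f.read()


# Hypothetical stand-in for origincar.yaml: a UTF-8 file with a non-ASCII comment.
sample = "# 类别数量\nnc: 2\n"
with tempfile.NamedTemporaryFile("w", suffix=".yaml", encoding="utf-8", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

text = read_text(path)  # decoded as UTF-8, independent of the system locale
os.remove(path)         # clean up the temporary file
```

With this approach the comments can stay in the file, since the decoder no longer depends on the Windows locale.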
2. RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
The full error message is as follows:
Traceback (most recent call last):
  File "D:\pycharm\yolov5v2.0\train.py", line 469, in <module>
    train(hyp, tb_writer, opt, device)
  File "D:\pycharm\yolov5v2.0\train.py", line 291, in train
    loss, loss_items = compute_loss(pred, targets.to(device), model)  # scaled by batch_size
  File "D:\pycharm\yolov5v2.0\utils\utils.py", line 443, in compute_loss
    tcls, tbox, indices, anchors = build_targets(p, targets, model)  # targets
  File "D:\pycharm\yolov5v2.0\utils\utils.py", line 532, in build_targets
    a, t = at[j], t.repeat(na, 1, 1)[j]  # filter
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
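This error occurs because some of the tensors used in build_targets are on different devices (CPU and GPU). Judging from the traceback, the indexed tensor at is on the CPU, while the index mask j is derived from targets, which was moved to the GPU with targets.to(device). To fix the error, make sure all the tensors involved are on the same device.
Fix: modify the build_targets function. The key changes are:
- Get the device of the targets tensor at the start of the function (device = targets.device).
- Move the related tensors to that device:
  - The anchors tensor is moved to device in every loop iteration.
  - When computing gain, the torch.tensor call is given the device=device argument.
  - The at tensor is created on device.
The modified function: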
def build_targets(p, targets, model):
    # Build targets for compute_loss(), input targets(image, class, x, y, w, h)
    device = targets.device
    det = model.module.model[-1] if isinstance(model, (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)) else model.model[-1]  # Detect() module
    na, nt = det.na, targets.shape[0]  # number of anchors, targets
    tcls, tbox, indices, anch = [], [], [], []
    gain = torch.ones(6, device=device)  # normalized to gridspace gain
    off = torch.tensor([[1, 0], [0, 1], [-1, 0], [0, -1]], device=device).float()  # overlap offsets
    at = torch.arange(na, device=device).view(na, 1).repeat(1, nt)  # anchor tensor, same as .repeat_interleave(nt)
    g = 0.5  # offset
    style = 'rect4'
    for i in range(det.nl):
        anchors = det.anchors[i].to(device)  # Move anchors to the same device
        gain[2:] = torch.tensor(p[i].shape, device=device)[[3, 2, 3, 2]]  # xyxy gain
        # Match targets to anchors
        a, t, offsets = [], targets * gain, 0
        if nt:
            r = t[None, :, 4:6] / anchors[:, None]  # wh ratio
            j = torch.max(r, 1. / r).max(2)[0] < model.hyp['anchor_t']  # compare
            # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t']  # iou(3,n) = wh_iou(anchors(3,2), gwh(n,2))
            a, t = at[j], t.repeat(na, 1, 1)[j]  # filter
            # overlaps
            gxy = t[:, 2:4]  # grid xy
            z = torch.zeros_like(gxy)
            if style == 'rect2':
                j, k = ((gxy % 1. < g) & (gxy > 1.)).T
                a, t = torch.cat((a, a[j], a[k]), 0), torch.cat((t, t[j], t[k]), 0)
                offsets = torch.cat((z, z[j] + off[0], z[k] + off[1]), 0) * g
            elif style == 'rect4':
                j, k = ((gxy % 1. < g) & (gxy > 1.)).T
                l, m = ((gxy % 1. > (1 - g)) & (gxy < (gain[[2, 3]] - 1.))).T
                a, t = torch.cat((a, a[j], a[k], a[l], a[m]), 0), torch.cat((t, t[j], t[k], t[l], t[m]), 0)
                offsets = torch.cat((z, z[j] + off[0], z[k] + off[1], z[l] + off[2], z[m] + off[3]), 0) * g
        # Define
        b, c = t[:, :2].long().T  # image, class
        gxy = t[:, 2:4]  # grid xy
        gwh = t[:, 4:6]  # grid wh
        gij = (gxy - offsets).long()
        gi, gj = gij.T  # grid xy indices
        # Append
        indices.append((b, a, gj, gi))  # image, anchor, grid indices
        tbox.append(torch.cat((gxy - gij, gwh), 1))  # box
        anch.append(anchors[a])  # anchors
        tcls.append(c)  # class
    return tcls, tbox, indices, anch
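The core of the fix can be isolated in a few lines. The sketch below is not taken from the YOLOv5 code; the sizes `na` and `nt` and the random tensors are toy values assumed for illustration. It only shows the pattern of creating the index tensor on the same device as `targets`, and it falls back to the CPU when no GPU is present.

```python
import torch

# Use the GPU when one is present; the sketch also runs on a CPU-only machine.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

na, nt = 3, 4                                # toy sizes, not taken from a real model
targets = torch.rand(nt, 6, device=device)   # stand-in for the label tensor

# Create the anchor-index tensor on the same device as targets ...
at = torch.arange(na, device=targets.device).view(na, 1).repeat(1, nt)
# ... so that a boolean mask living on that device can index it.
j = torch.rand(na, nt, device=targets.device) > 0.5
a = at[j]

print(a.device == targets.device)
```

On a GPU machine, dropping `device=targets.device` from the `torch.arange` call would leave `at` on the CPU while `j` is on the GPU, which is the situation the error message in this section describes.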
3. TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
The full error message is as follows:
Traceback (most recent call last):
  File "D:\pycharm\yolov5v2.0\train.py", line 469, in <module>
    train(hyp, tb_writer, opt, device)
  File "D:\pycharm\yolov5v2.0\train.py", line 340, in train
    results, maps, times = test.test(opt.data,
  File "D:\pycharm\yolov5v2.0\test.py", line 176, in test
    plot_images(img, output_to_target(output, width, height), paths, str(f), names)  # predictions
  File "D:\pycharm\yolov5v2.0\utils\utils.py", line 914, in output_to_target
    return np.array(targets)
  File "D:\Anaconda3\envs\yolov5\lib\site-packages\torch\_tensor.py", line 956, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
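This error is caused by trying to convert a tensor that lives on a CUDA device directly into a NumPy array. The tensor must first be moved from the CUDA device to the CPU and only then converted, which is done by calling its .cpu() method.
Fix: modify the output_to_target function: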
def output_to_target(output, width, height):
    # Convert model output to target format [batch_id, class_id, x, y, w, h, conf]
    targets = []
    if isinstance(output, torch.Tensor):
        output = output.cpu().numpy()
    if isinstance(output, np.ndarray):
        output = [output]  # wrap a single NumPy array in a list so all cases are handled the same way
    for i, o in enumerate(output):
        if o is not None:
            if isinstance(o, torch.Tensor):
                o = o.cpu().numpy()  # make sure the tensor is converted to a NumPy array
            for pred in o:
                box = pred[:4]
                w = (box[2] - box[0]) / width
                h = (box[3] - box[1]) / height
                x = box[0] / width + w / 2
                y = box[1] / height + h / 2
                conf = pred[4]
                cls = int(pred[5])
                targets.append([i, cls, x, y, w, h, conf])
    return np.array(targets)
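The function first checks whether output is a tensor and, if so, converts it to a NumPy array. A single NumPy array is wrapped in a list so that all cases are handled the same way. The function then iterates over the elements of output and converts any element that is still a tensor to a NumPy array. Finally, each converted target is appended to the targets list, which is returned as a NumPy array.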
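The box arithmetic inside the loop can be checked on its own with plain numbers. The following is a small sketch, assuming the first four values of each prediction are pixel corner coordinates (x1, y1, x2, y2), as the subtraction in the function implies; the helper name `xyxy_to_norm_xywh` and the example box are made up for illustration.

```python
def xyxy_to_norm_xywh(box, width, height):
    """Convert a pixel-space (x1, y1, x2, y2) box to normalized (x_center, y_center, w, h)."""
    x1, y1, x2, y2 = box
    w = (x2 - x1) / width      # box width as a fraction of the image width
    h = (y2 - y1) / height     # box height as a fraction of the image height
    x = x1 / width + w / 2     # normalized x of the box center
    y = y1 / height + h / 2    # normalized y of the box center
    return x, y, w, h


# Hypothetical 200x200 pixel box in a 640x480 image.
x, y, w, h = xyxy_to_norm_xywh((100, 50, 300, 250), width=640, height=480)
print(x, y, w, h)
```

For this box the normalized width is 200/640 = 0.3125 and the normalized height is 200/480, about 0.4167, with the center at roughly (0.3125, 0.3125).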
Check whether the GPU is available:
python -c "import torch; print(torch.cuda.is_available())"
If this prints True, the GPU is available.
In train.py, you can then change the default of the --device argument from '' to '0':
parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
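To see what the changed default does, the argument can be parsed in isolation. This is only a sketch: `build_parser` is a hypothetical helper that contains just the one option, not the full parser from train.py.

```python
import argparse


def build_parser():
    """Build a parser with only the --device option (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Same option as in train.py, with the default changed from '' to '0'.
    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    return parser


default_opt = build_parser().parse_args([])                 # no flag given: default is used
cpu_opt = build_parser().parse_args(['--device', 'cpu'])    # explicit override on the command line
print(default_opt.device, cpu_opt.device)
```

The default only applies when the flag is omitted, so passing --device cpu on the command line still selects the CPU.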