Training Your Own Dataset with Pytorch_YOLOv4

yanyanxsdxx

Last modified 2024-09-25 16:33:22


Tags: pytorch, YOLO, artificial intelligence

First published 2024-09-25 16:22:55

Article link: https://blog.csdn.net/yanyanxsdxx/article/details/142526728


Building the dataset (same format as YOLOv5):

Each image file and its label file must be stored at paths related as follows:

```
../coco/images/train2017/000000109622.jpg # image
../coco/labels/train2017/000000109622.txt # label
```
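The path relationship above can be sketched in a few lines of Python. This is only an illustration: the helper name `label_path_for` is hypothetical and not part of the repository, and it assumes the label path is obtained by swapping the `images` directory for `labels` and the file extension for `.txt`, as in the two example paths.

```python
from pathlib import Path


def label_path_for(image_path):
    """Map an image path to its label path (hypothetical helper).

    Replaces the last "images" directory component with "labels" and
    the file extension with ".txt". Raises ValueError if the path has
    no "images" component.
    """
    parts = list(Path(image_path).parts)
    # Index of the last occurrence of "images" in the path components
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return Path(*parts).with_suffix(".txt")


print(label_path_for("../coco/images/train2017/000000109622.jpg").as_posix())
```

Running a check like this over the whole dataset before training is a cheap way to catch images that have no matching label file.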

Three additional txt files must also be generated: train.txt, val.txt, and test.txt.

This is val.txt.
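The listing itself is not reproduced here, so the following is only a rough sketch of how the three files could be produced. It assumes each file holds one image path per line (the usual convention in YOLOv5-style repositories) and that a random 80/10/10 split is acceptable. The function name `write_split_files`, the ratios, and the directory arguments are all illustrative assumptions, not taken from the original code.

```python
import random
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_split_files(image_dir, out_dir, ratios=(0.8, 0.1, 0.1), seed=0):
    """Write train.txt, val.txt and test.txt with one image path per line.

    Hypothetical helper: the split ratios and the file layout are assumptions.
    Returns the number of paths written to each file.
    """
    images = sorted(
        p for p in Path(image_dir).rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES
    )
    random.Random(seed).shuffle(images)  # deterministic shuffle for a given seed

    n_train = int(len(images) * ratios[0])
    n_val = int(len(images) * ratios[1])
    splits = {
        "train.txt": images[:n_train],
        "val.txt": images[n_train:n_train + n_val],
        "test.txt": images[n_train + n_val:],  # the remainder goes to test
    }

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, paths in splits.items():
        text = "".join(f"{p.as_posix()}\n" for p in paths)
        (out_dir / name).write_text(text, encoding="utf-8")
    return {name: len(paths) for name, paths in splits.items()}
```

If the images are already separated into per-split folders such as `train2017`, it may be simpler to list each folder into its own file instead of splitting at random.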

Downloading the pretrained model:

Baidu link: https://pan.baidu.com/s/1nyQlH-GHrmddCEkuv-VmAg

Extraction code: 78bg

Code:

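This version was implemented by **Chien-Yao Wang**, the second author of YOLOv4 and the first author of CSPNet. It is also worth noting that both YOLOv4 and YOLOv5 use CSPNet. This PyTorch version of YOLOv4 is built on top of the ultralytics YOLOv3 implementation, which is probably the strongest PyTorch reproduction of YOLOv3: https://github.com/ultralytics/yolov3. We will use this version of YOLOv4 to train on our own dataset and walk through the code changes, training, and testing in detail.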

Problems encountered

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/thop/vision/basic_hooks.py", line 69, in count_normalization
    m.total_ops += flops
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1131, in _call_impl
    hook_result = hook(self, input, result)
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1128, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/models.py", line 298, in forward_once
    x = module(x)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/models.py", line 244, in forward
    return self.forward_once(x)
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/train.py", line 261, in train
    pred = model(imgs)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/train.py", line 409, in <module>
    train()  # train normally
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/runpy.py", line 194, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

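This happens because train.py only handles the multi-GPU and CPU cases; there is no branch for a single GPU, so the model is never moved to the GPU. Change the code to: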

if device.type != 'cpu' and torch.cuda.device_count() > 1 and torch.distributed.is_available():
    # Multi-GPU: wrap the model in DistributedDataParallel
    dist.init_process_group(backend='nccl',  # distributed backend
                            init_method='tcp://127.0.0.1:9999',  # distributed training init method
                            world_size=1,  # number of nodes for distributed training
                            rank=0)  # distributed training node rank
    model = torch.nn.parallel.DistributedDataParallel(model, find_unused_parameters=True)
    model.yolo_layers = model.module.yolo_layers  # move yolo layer indices to top level
else:
    # Single GPU or CPU: move the model to the current device
    model = model.to(device)
    # No wrapper here, so model.yolo_layers is already at the top level
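To make it easier to see which branch runs on which machine, the condition can be isolated as a small pure function. This is only a sketch of the decision logic: `choose_branch` and its string return values are hypothetical names, and its three arguments stand in for `device.type`, `torch.cuda.device_count()`, and `torch.distributed.is_available()`.

```python
def choose_branch(device_type, gpu_count, dist_available):
    """Return which branch of the if/else would run (hypothetical helper).

    "ddp"       -> the DistributedDataParallel branch
    "to_device" -> the else branch that calls model.to(device)
    """
    if device_type != "cpu" and gpu_count > 1 and dist_available:
        return "ddp"
    return "to_device"


# A single-GPU machine falls through to the else branch
print(choose_branch("cuda", 1, True))
# Two or more GPUs with torch.distributed available take the DDP branch
print(choose_branch("cuda", 2, True))
# CPU-only runs also take the else branch
print(choose_branch("cpu", 0, True))
```

With the else branch in place, both the single-GPU and the CPU cases end with the model on `device`.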

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
shape '[8, 3, 10, 20, 20]' is invalid for input of size 816000
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/models.py", line 197, in forward
    p = p.view(bs, self.na, self.no, self.ny, self.nx).permute(0, 1, 3, 4, 2).contiguous()  # prediction
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/models.py", line 296, in forward_once
    yolo_out.append(module(x, out))
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/models.py", line 244, in forward
    return self.forward_once(x)
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/train.py", line 268, in train
    pred = model(imgs)
  File "/home/user-lbyjh/Pytorch_YOLO-v4-master/train.py", line 416, in <module>
    train()  # train normally
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user-lbyjh/anaconda3/envs/yjhtorch16/lib/python3.8/runpy.py", line 194, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
RuntimeError: shape '[8, 3, 10, 20, 20]' is invalid for input of size 816000

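To fix this, edit the .cfg file: in the [convolutional] block immediately before each [yolo] layer, set filters to match your dataset. The value must equal (5 + classes) * 3, which is 30 for a dataset with 5 classes.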

[convolutional]
size=1
stride=1
pad=1
# filters=(5+classes)*3
filters=30
activation=linear
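The numbers in the error message can be checked against this formula. The sketch below uses two hypothetical helpers, `yolo_filters` and `channels_from_numel`, which are not part of the repository. It works backwards from the input size 816000 to the number of channels the convolution actually produced. The conclusion that the cfg was still at the 80-class COCO default is an inference from the arithmetic, not something stated in the traceback.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """filters for the convolution feeding a [yolo] layer: (5 + classes) * anchors."""
    return (5 + num_classes) * anchors_per_scale


def channels_from_numel(numel, batch, ny, nx):
    """Recover the channel count of a (batch, C, ny, nx) tensor from its element count."""
    return numel // (batch * ny * nx)


# The view expected shape [8, 3, 10, 20, 20]: 3 anchors, 10 = 5 + classes, so 5 classes
expected_channels = yolo_filters(5)
# The tensor that actually arrived had 816000 elements for batch 8 on a 20x20 grid
actual_channels = channels_from_numel(816000, batch=8, ny=20, nx=20)

print(expected_channels)  # 30
print(actual_channels)    # 255
print(yolo_filters(80))   # 255, the value for the 80-class COCO default
```

So the convolution was still emitting 255 channels while the [yolo] layer expected 30, which is exactly the mismatch that setting filters=30 resolves.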
