Building a YOLOv5 Object Detection Platform with PyTorch: Problems Encountered While Training on the PASCAL VOC Dataset

V-mlik

Last modified 2024-06-04 17:45:54

Tags: programming languages, object detection, pytorch, python, deep learning

First published 2022-11-02 19:37:57

Article link: https://blog.csdn.net/m0_61820721/article/details/127633408

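Preface:

This article is based on a Bilibili tutorial video I followed. It collects every problem I ran into the first time I trained on the PASCAL VOC dataset and gives the fix for each one.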

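Errors when installing the libraries required by the yolov5 project (this happens when the project is cloned with git; a manually downloaded copy does not trigger it)

After running pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple, I found that opencv failed to install.

Installing it separately with the command below, which specifies the mirror, resolved it: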

pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple

Adapted from this article:

http://t.csdn.cn/tGPto

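When I cloned the yolov5 project with git, the files I got did not match the ones in the tutorial (possibly my own mistake), so the project would not run. I therefore downloaded it manually instead.

With the manually downloaded yolov5 project, the error from the first item no longer appeared when installing the libraries.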

The test command fails to run in the PyCharm terminal, with the result shown in the figure

Test command:

python detect.py --source ./inference/images/ --weights weights/yolov5s.pt --conf 0.4

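After entering the command, not only did it fail to run, it was also using the CPU rather than the GPU.

Click the second-to-last blue link in the error output to open upsampling.py, find the code at the commented spot shown in the figure, and change it to the uncommented version. The command then runs successfully.

The code after the change: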

def forward(self, input: Tensor) -> Tensor:
        return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)

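However, this did not fix the problem of the CPU being used instead of the GPU. Running the same command inside the Anaconda virtual environment did use the GPU; the exact reason is unknown.

Note that the upsampling.py problem exists separately in PyCharm and in Anaconda: the Anaconda environment has its own upsampling.py, which needs the same change before the command runs properly.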
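
One way to compare the two environments is to run the same check in both the PyCharm terminal and the Anaconda prompt. The sketch below is only an illustration (the helper name `describe_runtime` is made up here): it reports which Python interpreter is running and, if PyTorch can be imported, whether it sees a CUDA device.

```python
import sys


def describe_runtime():
    """Report the running interpreter and whether PyTorch can see a GPU."""
    info = {"python": sys.executable, "torch": None, "cuda_available": None}
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return info
    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

If the two terminals print different interpreter paths, PyCharm may be pointing at a different environment, which would be one possible explanation for the CPU/GPU difference.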

Problems encountered when training with yolov5

Training command:

python train.py --data data/voc-new.yaml --cfg models/yolov5s-voc.yaml --weights weights/yolov5s.pt --batch-size 16 --epochs 200

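yolov5 error: RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place

Open yolo.py under models and, following the error message, find the code shown in the figure below at line 149.

Change the code as shown in the figure below.

Adapted from this article:
http://t.csdn.cn/0QhOp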
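
As an illustration of this kind of change: a fix commonly reported for this error is to run the in-place update with autograd disabled. Whether that matches the exact edit in the figure is an assumption. The sketch below shows the idea on a stand-alone parameter; `add_to_column` is a hypothetical helper, not code from yolo.py.

```python
try:
    import torch
except ImportError:  # lets the file be imported on machines without PyTorch
    torch = None


def add_to_column(param, rows, column, delta):
    """Add delta in place to one column of a reshaped leaf parameter.

    The update runs under torch.no_grad(), so autograd does not reject
    the in-place write to a view of a leaf tensor that requires grad.
    """
    with torch.no_grad():
        param.view(rows, -1)[:, column].add_(delta)
    return param


if __name__ == "__main__" and torch is not None:
    bias = torch.nn.Parameter(torch.zeros(6))
    add_to_column(bias, rows=3, column=0, delta=1.5)
    print(bias.view(3, -1))
```

The parameter still has `requires_grad=True` afterwards; only the one-off initialization step is hidden from autograd.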

Fixing OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

In other words:

More than one copy of the OpenMP runtime (libiomp5md.dll) has been loaded into the same process. The message itself describes setting the environment variable KMP_DUPLICATE_LIB_OK=TRUE as an unsafe, unsupported workaround that may cause crashes or silently incorrect results.

Here, find "F:\Anaconda\envs\<environment name>\Library\bin\libiomp5md.dll" and delete it.

Adapted from this article: http://t.csdn.cn/uyvCy
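
Before deleting anything, it can help to see how many copies of the DLL the environment actually contains. The sketch below is an illustration under that assumption (`find_openmp_runtimes` is a hypothetical helper): it walks an environment directory and lists every file named libiomp5md.dll.

```python
import sys
from pathlib import Path


def find_openmp_runtimes(env_root, name="libiomp5md.dll"):
    """Return every file called `name` under env_root, sorted by path."""
    root = Path(env_root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob(name) if p.is_file())


if __name__ == "__main__":
    # sys.prefix is the root directory of the currently active environment.
    for path in find_openmp_runtimes(sys.prefix):
        print(path)
```

Renaming the extra copy instead of deleting it keeps the change reversible.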

Fixing "TypeError: can't convert cuda:0 device type tensor to numpy. ......"

The error is shown below:

Traceback (most recent call last):
  File "train.py", line 469, in <module>
    train(hyp, tb_writer, opt, device)
  File "train.py", line 347, in train
    save_dir=log_dir)
  File "/home/xxx/Detection/test.py", line 176, in test
    plot_images(img, output_to_target(output, width, height), paths, str(f), names)  # predictions
  File "/home/xxx/Detection/utils/utils.py", line 914, in output_to_target
    return np.array(targets)
  File "/home/xxx/anaconda3/envs/pytorch1.5/lib/python3.7/site-packages/torch/tensor.py", line 492, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

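Cause:

1. The tensor must first be moved to the CPU, because NumPy is CPU-only.

2. NumPy cannot read a CUDA tensor directly. To convert a CUDA tensor to NumPy, first convert it to a CPU tensor and then convert that to a NumPy array.

Solution:

Following the last line of the traceback, find the code below in tensor.py. The commented-out line is the original code; replace it with the uncommented line.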

Adapted from the following articles:

http://t.csdn.cn/Nlxsn http://t.csdn.cn/GbJTc
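
Editing tensor.py changes the installed PyTorch package itself. An alternative, sketched here as an assumption rather than the approach used above, is to move the data to host memory at the call site before NumPy sees it. `to_host` is a hypothetical helper; it relies only on the tensor methods `detach()`, `cpu()`, and `tolist()`.

```python
def to_host(value):
    """Recursively replace tensor-like objects with plain Python values.

    Anything exposing detach() and cpu() is treated as a tensor: it is
    detached from autograd, copied to host memory, and turned into a
    Python number or nested list. Lists and tuples are walked element by
    element; every other value is returned unchanged.
    """
    if hasattr(value, "detach") and hasattr(value, "cpu"):
        return value.detach().cpu().tolist()
    if isinstance(value, (list, tuple)):
        return [to_host(item) for item in value]
    return value


if __name__ == "__main__":
    # Plain Python data passes through untouched (tuples become lists).
    print(to_host([[0, 1.5], (2, 3)]))
```

With such a helper, the failing line in the traceback, `np.array(targets)`, could become `np.array(to_host(targets))`.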

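Fixing OSError: [WinError 1455] The paging file is too small for this operation to complete

Cause: the model is too large and the paging file (virtual memory) allocated by the system is too small, so training cannot proceed.

Solution:

http://t.csdn.cn/IO3bW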

Conclusion:

Those are all the problems I ran into. After fixing them, training started successfully.
