(TL;DR: switch to an older version) Fixing P, R and other metrics being 0 when training a dataset with yolov5, 2022.2.24

迅羽的轻语

Last modified 2022-02-25 19:08:46

Category: Small tutorials for my own reference. Tags: pytorch

First published 2022-02-24 23:51:12

Article link: https://blog.csdn.net/qq_51253844/article/details/123123432

(Python 3.6, CUDA 11.3, torch 1.10.2 -> Python 3.9, CUDA 10.2, torch 1.9.0)

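I had originally installed PyTorch with the command generated on the Start Locally | PyTorch page, picking version 1.10.2 with CUDA 11.3.

[Screenshot: Start Locally selection for 1.10.2 with CUDA 11.3]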

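In the results.png of the trained model, plots such as precision and mAP were all either 0 or nan, and the confusion matrix was entirely FN. Training also printed this warning: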

UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

I then went to Previous PyTorch Versions | PyTorch and chose the older release 1.9.0. The pip install command was:

pip install torch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0

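(This is actually the OSX variant of the command.) The metrics went back to normal. However, print(torch.version.cuda) reported 10.2, while nvcc -V reported 11.3, and there was no 10.2 directory under usr/local, yet everything still worked, which was odd. The next day I got this error: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. After that it was completely dead.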
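The two version numbers are probably less contradictory than they look: `torch.version.cuda` reports the CUDA version the installed PyTorch build was compiled against (pip wheels ship their own CUDA runtime libraries), whereas `nvcc -V` reports the system-wide toolkit. A minimal sketch for printing both side by side follows. The helper names `parse_nvcc_release` and `report_cuda_versions` are made up for illustration, and the parser assumes `nvcc` prints its usual `release X.Y` line.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Return the 'X.Y' toolkit version found in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def report_cuda_versions():
    """Collect the system toolkit version and the version PyTorch was built with."""
    versions = {"nvcc": None, "torch_build": None}

    # System-wide toolkit, only if nvcc is on PATH.
    if shutil.which("nvcc"):
        result = subprocess.run(
            ["nvcc", "-V"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        versions["nvcc"] = parse_nvcc_release(result.stdout)

    # CUDA version of the installed PyTorch build, if torch can be imported.
    try:
        import torch
        versions["torch_build"] = torch.version.cuda
    except ImportError:
        pass

    return versions


if __name__ == "__main__":
    print(report_cuda_versions())
```

If the two values differ, that alone is not necessarily an error; what matters more is whether the GPU driver is new enough for the CUDA version the PyTorch build uses.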

Installing with pip (PyTorch 1.9.0, CUDA 11.3) then gave this warning: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:115.)
return torch._C._cuda_getDeviceCount() > 0
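Both messages above mention environment variables: `CUDA_LAUNCH_BLOCKING` in the kernel error and `CUDA_VISIBLE_DEVICES` in the initialization warning. They are read when CUDA is initialised, so if they are set from Python, that should happen before `torch` is imported. A small sketch, assuming a single GPU at index 0 (an assumption, adjust for your machine):

```python
import os

# Set these before importing torch, so they are in place when CUDA initialises.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # synchronous kernel launches, for debugging only
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # assumption: expose only GPU 0

try:
    import torch
    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("torch is not installed in this environment")
```

`CUDA_LAUNCH_BLOCKING=1` slows training down, so it is worth removing once the failing call has been located.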

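At this point the metrics were normal, but gpu_mem showed 0, which, given the "Setting the available devices to be zero" warning, suggests training had fallen back to the CPU.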

Finally I installed CUDA 10.2 + PyTorch 1.9.0 and found that import torch raised an error. Switching Python to 3.9 solved the problem!

The checks I use myself:

import torch
import torchvision

print(torch.__version__)  # PyTorch version
print(torchvision.__version__)  # torchvision version
print(torch.version.cuda)  # CUDA version this PyTorch build was compiled against
print(torch.backends.cudnn.version())  # cuDNN version
print(torch.cuda.is_available())  # False means no usable GPU was found
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # GPU model
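Since gpu_mem showing 0 was the only hint that a run was no longer using the GPU, it may help to fail fast before a long training job. A rough sketch, not part of yolov5; `gpu_preflight` is a hypothetical helper name:

```python
def gpu_preflight():
    """Return a list of problems found; an empty list means the GPU looks usable."""
    try:
        import torch
    except (ImportError, OSError) as exc:
        return ["cannot import torch: {}".format(exc)]

    if not torch.cuda.is_available():
        return ["torch {} (CUDA build {}) cannot see a GPU".format(
            torch.__version__, torch.version.cuda)]

    problems = []
    try:
        # Run one tiny computation on the GPU to catch runtime failures early.
        x = torch.ones(8, device="cuda")
        if float((x * 2).sum()) != 16.0:
            problems.append("GPU arithmetic returned an unexpected result")
    except RuntimeError as exc:
        problems.append("GPU test failed: {}".format(exc))
    return problems


if __name__ == "__main__":
    for issue in gpu_preflight():
        print("PROBLEM:", issue)
```

Running this right before training and stopping when the list is non-empty avoids discovering hours later that everything ran on the CPU.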

"P and R values are 0 when training (train) yolov5" (m0_59080342's blog, CSDN): the same situation.

"YOLOv5 object detection" (迷途小书童的Note): the tutorial I followed for setup.