UR5-Pick-and-Place-Simulation Debugging Notes (Part 2) - 2024.04.28

qq_36674060

Last modified: 2024-05-11 16:12:49

First published: 2024-04-28 16:54:37

Article link: https://blog.csdn.net/qq_36674060/article/details/138283874

Example being run: UR5-Pick-and-Place-Simulation

Command being debugged: rosrun vision lego-vision.py -show.
Once every error in this post is fixed, this command runs.

(TAMP) xjfeng@xjfeng:~/yolov5$ rosrun vision lego-vision.py -show
Loading model best.pt
YOLOv5 🚀 v7.0-305-g4456c953 Python-3.8.19 torch-2.3.0+cu121 CUDA:0 (NVIDIA GeForce GTX 1650, 3896MiB)

Fusing layers... 
python3: relocation error: /home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/lib/../../nvidia/cudnn/lib/libcudnn_cnn_infer.so.8: symbol _ZN5cudnn24cublasLtMatmulDescCreateEPP26cublasLtMatmulDescOpaque_t19cublasComputeType_t14cudaDataType_t version libcudnn_ops_infer.so.8 not defined in file libcudnn_ops_infer.so.8 with link time reference
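
A relocation error like this can appear when the dynamic loader picks up a cuDNN library from outside the pip-installed torch package (for example from a directory on LD_LIBRARY_PATH) and mixes it with the bundled one. The log does not confirm that this is what happened here. The sketch below only lists cuDNN shared libraries found in the LD_LIBRARY_PATH directories, so they can be compared with the bundled path in the error message. The function name find_cudnn_libs is made up for illustration.

```python
import glob
import os


def find_cudnn_libs(search_dirs):
    """Return sorted paths of libcudnn*.so* files found directly in search_dirs."""
    found = []
    for directory in search_dirs:
        if not directory or not os.path.isdir(directory):
            continue  # skip empty entries and directories that do not exist
        found.extend(glob.glob(os.path.join(directory, "libcudnn*.so*")))
    return sorted(set(found))


if __name__ == "__main__":
    # LD_LIBRARY_PATH is a colon-separated list of directories searched by the loader.
    dirs = os.environ.get("LD_LIBRARY_PATH", "").split(":")
    for path in find_cudnn_libs(dirs):
        print(path)
```

If this prints nothing, no cuDNN copy sits directly in those directories and the cause is more likely elsewhere.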

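One explanation found online is a cuDNN version mismatch. Checking the CUDA version that torch was built against, the local nvcc toolkit, and the driver shows three different versions (12.1, 10.2, and 12.0):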

(TAMP) xjfeng@xjfeng:~$ python 
Python 3.8.19 (default, Mar 20 2024, 19:58:24) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.version.cuda)
12.1
>>> exit()
(TAMP) xjfeng@xjfeng:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
(TAMP) xjfeng@xjfeng:~$ nvidia-smi
Sun Apr 28 17:53:23 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   48C    P8     2W /  50W |    629MiB /  4096MiB |     31%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
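
The three outputs above report different things: the CUDA version the torch wheel was built against, the locally installed toolkit (nvcc), and the highest CUDA version the driver supports. Here is a small sketch that extracts the last two from the command output so they can be logged next to torch.version.cuda. The helper names are hypothetical, and the regular expressions assume the output formats shown above.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc -V` output."""
    m = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(m.group(1)), int(m.group(2))) if m else None


def parse_driver_cuda(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of `nvidia-smi` output."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(m.group(1)), int(m.group(2))) if m else None


# Lines copied from the terminal output above.
nvcc_text = "Cuda compilation tools, release 10.2, V10.2.89"
smi_text = "| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |"

print(parse_nvcc_release(nvcc_text))  # (10, 2)
print(parse_driver_cuda(smi_text))    # (12, 0)
```

As far as I understand, pip wheels of PyTorch ship their own CUDA runtime libraries, so the nvcc version matters mainly when compiling extensions, while the driver has to be recent enough for the wheel's CUDA build.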

So downgrade PyTorch to 1.8.0:

(TAMP) xjfeng@xjfeng:~$ pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0

This produces a new error:

(TAMP) xjfeng@xjfeng:~/yolov5$ rosrun vision lego-vision.py -show
Loading model best.pt
YOLOv5 🚀 v7.0-305-g4456c953 Python-3.8.19 torch-1.8.0 CUDA:0 (NVIDIA GeForce GTX 1650, 3896MiB)

Fusing layers... 
Traceback (most recent call last):
  File "/home/xjfeng/yolov5/hubconf.py", line 50, in _create
    model = DetectMultiBackend(path, device=device, fuse=autoshape)  # detection model
  File "/home/xjfeng/yolov5/models/common.py", line 467, in __init__
    model = attempt_load(weights if isinstance(weights, list) else w, device=device, inplace=True, fuse=fuse)
  File "/home/xjfeng/yolov5/models/experimental.py", line 107, in attempt_load
    model.append(ckpt.fuse().eval() if fuse and hasattr(ckpt, "fuse") else ckpt.eval())  # model in eval mode
  File "/home/xjfeng/yolov5/models/yolo.py", line 192, in fuse
    m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
  File "/home/xjfeng/yolov5/utils/torch_utils.py", line 286, in fuse_conv_and_bn
    fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.shape))
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/xjfeng/yolov5/hubconf.py", line 65, in _create
    model = attempt_load(path, device=device, fuse=False)  # arbitrary model
  File "/home/xjfeng/yolov5/models/experimental.py", line 99, in attempt_load
    ckpt = (ckpt.get("ema") or ckpt["model"]).to(device).float()  # FP32 model
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/nn/modules/module.py", line 673, in to
    return self._apply(convert)
  File "/home/xjfeng/yolov5/models/yolo.py", line 206, in _apply
    self = super()._apply(fn)
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/nn/modules/module.py", line 409, in _apply
    param_applied = fn(param)
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/nn/modules/module.py", line 671, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: invalid device function

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/xjfeng/catkin_ws/src/vision/scripts/lego-vision.py", line 475, in <module>
    load_models()
  File "/home/xjfeng/catkin_ws/src/vision/scripts/lego-vision.py", line 465, in load_models
    model = torch.hub.load(path_yolo,'custom',path=weight, source='local')
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/hub.py", line 339, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages/torch/hub.py", line 368, in _load_local
    model = entry(*args, **kwargs)
  File "/home/xjfeng/yolov5/hubconf.py", line 88, in custom
    return _create(path, autoshape=autoshape, verbose=_verbose, device=device)
  File "/home/xjfeng/yolov5/hubconf.py", line 83, in _create
    raise Exception(s) from e
Exception: CUDA error: invalid device function. Cache may be out of date, try `force_reload=True` or see https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading for help.
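
One common cause of "invalid device function" is a binary that contains no kernels usable on the GPU's compute capability. The traceback does not prove that this is the cause here, so the sketch below is only a diagnostic. It assumes the usual CUDA compatibility rules: an sm_XY binary runs on a device of the same major version with an equal or higher minor version, and compute_XY PTX can be JIT-compiled for any equal or higher capability. The helper names are hypothetical, and torch.cuda.get_arch_list() is assumed to be available in the installed torch version.

```python
def arch_covered(capability, arch_list):
    """Return True if a device with `capability` (major, minor) can run code
    built for at least one entry of `arch_list` (e.g. ['sm_70', 'compute_70'])."""
    major, minor = capability
    for arch in arch_list:
        kind, _, num = arch.partition("_")
        if not num.isdigit():
            continue  # ignore entries this simple parser does not understand
        a_major, a_minor = divmod(int(num), 10)
        # Binary code: same major version, minor not newer than the device.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: forward-compatible with any equal or newer device.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report_current_device():
    """Print the capability of GPU 0, torch's arch list, and whether they match."""
    import torch  # imported lazily so arch_covered works without torch installed

    capability = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()
    print(capability, archs, arch_covered(capability, archs))
```

Calling report_current_device() inside the TAMP environment prints what the installed wheel was compiled for. A True result rules out only this one cause.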

According to posts online, the CUDA 11.1 build of torch 1.8.0 does not hit the error above, so install torch==1.8.0+cu111:

(TAMP) xjfeng@xjfeng:~/yolov5$ pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.8.0+cu111
  Downloading https://download.pytorch.org/whl/cu111/torch-1.8.0%2Bcu111-cp38-cp38-linux_x86_64.whl (1982.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 GB 1.3 MB/s eta 0:00:00
Collecting torchvision==0.9.0+cu111
  Downloading https://download.pytorch.org/whl/cu111/torchvision-0.9.0%2Bcu111-cp38-cp38-linux_x86_64.whl (17.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.6/17.6 MB 251.7 kB/s eta 0:00:00
Requirement already satisfied: torchaudio==0.8.0 in /home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages (0.8.0)
Requirement already satisfied: typing-extensions in /home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages (from torch==1.8.0+cu111) (4.11.0)
Requirement already satisfied: numpy in /home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages (from torch==1.8.0+cu111) (1.24.4)
Requirement already satisfied: pillow>=4.1.1 in /home/xjfeng/anaconda3/envs/TAMP/lib/python3.8/site-packages (from torchvision==0.9.0+cu111) (10.3.0)
Installing collected packages: torch, torchvision
  Attempting uninstall: torch
    Found existing installation: torch 1.8.0
    Uninstalling torch-1.8.0:
      Successfully uninstalled torch-1.8.0
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.9.0
    Uninstalling torchvision-0.9.0:
      Successfully uninstalled torchvision-0.9.0
Successfully installed torch-1.8.0+cu111 torchvision-0.9.0+cu111
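
The version string is the quickest way to tell the two installs apart: the earlier banner shows torch-1.8.0, while the CUDA 11.1 wheel reports torch-1.8.0+cu111. Below is a minimal sketch for checking the build tag in a script; split_build_tag is a hypothetical helper, and in practice the input would be torch.__version__.

```python
def split_build_tag(version):
    """Split a version like '1.8.0+cu111' into ('1.8.0', 'cu111').
    The tag is None when the version has no '+...' suffix."""
    base, _, tag = version.partition("+")
    return base, tag or None


print(split_build_tag("1.8.0+cu111"))  # ('1.8.0', 'cu111')
print(split_build_tag("1.8.0"))        # ('1.8.0', None)
```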

The install succeeds, and both models now load, but running rosrun vision lego-vision.py -show raises another error:

(TAMP) xjfeng@xjfeng:~/yolov5$ rosrun vision lego-vision.py -show
Loading model best.pt
YOLOv5 🚀 v7.0-305-g4456c953 Python-3.8.19 torch-1.8.0+cu111 CUDA:0 (NVIDIA GeForce GTX 1650, 3896MiB)

Fusing layers... 
Model summary: 213 layers, 7039792 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape... 
Loading model orientation.pt
YOLOv5 🚀 v7.0-305-g4456c953 Python-3.8.19 torch-1.8.0+cu111 CUDA:0 (NVIDIA GeForce GTX 1650, 3896MiB)

Fusing layers... 
Model summary: 213 layers, 7018216 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape... 
Starting Node Vision 1.0
Subscribing to camera images
Localization is starting.. 
Traceback (most recent call last):
  File "/home/xjfeng/catkin_ws/src/vision/scripts/lego-vision.py", line 477, in <module>
    start_node()
  File "/home/xjfeng/catkin_ws/src/vision/scripts/lego-vision.py", line 452, in start_node
    syncro = message_filters.TimeSynchronizer([rgb, depth], 1, reset=True)
TypeError: __init__() got an unexpected keyword argument 'reset'
// In other words: the installed TimeSynchronizer does not accept a 'reset' keyword argument.

Try removing the extra argument:

# Open /home/xjfeng/catkin_ws/src/vision/scripts/lego-vision.py and go to line 452.
# Original line:
# syncro = message_filters.TimeSynchronizer([rgb, depth], 1, reset=True)
# With reset=True removed:
syncro = message_filters.TimeSynchronizer([rgb, depth], 1)
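
Deleting the argument works for this install, but it hard-codes the script to the older message_filters signature. An alternative sketch, under the assumption that some message_filters releases accept reset and others do not, is to pass the keyword only when the constructor declares it. make_synchronizer is a hypothetical helper, not part of message_filters.

```python
import inspect


def make_synchronizer(sync_cls, subscribers, queue_size, **optional):
    """Construct sync_cls(subscribers, queue_size, ...), forwarding only the
    optional keyword arguments that its __init__ actually accepts."""
    params = inspect.signature(sync_cls.__init__).parameters
    accepts_any = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    supported = {
        name: value
        for name, value in optional.items()
        if accepts_any or name in params
    }
    return sync_cls(subscribers, queue_size, **supported)
```

In lego-vision.py this would be called as make_synchronizer(message_filters.TimeSynchronizer, [rgb, depth], 1, reset=True). As far as I can tell, reset controls whether the synchronizer clears its queues when ROS time jumps backwards, so dropping it should only matter in setups where the simulated clock is reset.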

Run rosrun vision lego-vision.py -show in the terminal again, and this time it works!

(TAMP) xjfeng@xjfeng:~/yolov5$ rosrun vision lego-vision.py -show
Loading model best.pt
YOLOv5 🚀 v7.0-305-g4456c953 Python-3.8.19 torch-1.8.0+cu111 CUDA:0 (NVIDIA GeForce GTX 1650, 3896MiB)

Fusing layers... 
Model summary: 213 layers, 7039792 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape... 
Loading model orientation.pt
YOLOv5 🚀 v7.0-305-g4456c953 Python-3.8.19 torch-1.8.0+cu111 CUDA:0 (NVIDIA GeForce GTX 1650, 3896MiB)

Fusing layers... 
Model summary: 213 layers, 7018216 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape... 
Starting Node Vision 1.0
Subscribing to camera images
Localization is starting.. 
(Waiting for images..)