yolov5_obb pitfalls

Environment:

Linux (Ubuntu 20.04)

CUDA 10.2

GPU: T4

Background:

I need yolov5_obb to detect objects with rotated bounding boxes.

Project: hukaixuan19970627/yolov5_obb: yolov5 + csl_label.(Oriented Object Detection)(Rotation Detection)(Rotated BBox), rotated object detection based on yolov5 (github.com)

——————————————————————————————————————

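Pitfall 1: CUDA is not fully configured. Both nvcc -V and nvidia-smi print the expected output, but running setup.py fails.

Steps to reproduce:

Follow the project's install.md until you reach the command below, which fails:


python setup.py develop  # or "pip install -v -e ."

Error output: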


$ python setup.py develop
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py:770: UserWarning: Usage of dash-separated 'index-url' will not be supported in future versions. Please use the underscore name 'index_url' instead
  warnings.warn(
running develop
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running egg_info
writing nms_rotated.egg-info/PKG-INFO
writing dependency_links to nms_rotated.egg-info/dependency_links.txt
writing top-level names to nms_rotated.egg-info/top_level.txt
reading manifest file 'nms_rotated.egg-info/SOURCES.txt'
writing manifest file 'nms_rotated.egg-info/SOURCES.txt'
running build_ext
error: [Errno 2] No such file or directory: ':/usr/local/cuda/bin/nvcc'

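Fix:

First confirm that CUDA is installed correctly:


nvcc -V

If it is, run the following directly in the shell:


export CUDA_HOME=/usr/local/cuda

Source: "/usr/local/cuda/bin/nvcc: No such file or directory" error, qq_39031960's blog (CSDN)

Addendum:

I later found that the error came back every time I entered the environment. The permanent fix is to open ~/.bashrc (vim ~/.bashrc) and add this line, or correct it if it is already there. (If you are not familiar with vim, search for "linux vim".)

In my case the file previously contained:


export CUDA_HOME=$CUDA_HOME:/usr/local/cuda

With CUDA_HOME initially unset, this expands to ":/usr/local/cuda", which is where the leading colon in the error path comes from. Changing it to the plain export CUDA_HOME=/usr/local/cuda form fixed it.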
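The leading colon in the error path can be reproduced with a few lines of Python. This is only a sketch: it assumes the build locates nvcc by joining CUDA_HOME with bin/nvcc, and the helper names `expand_export` and `nvcc_path` are made up for illustration.

```python
import posixpath


def expand_export(previous_value: str, appended: str) -> str:
    """Mimic the shell expansion of `export CUDA_HOME=$CUDA_HOME:<appended>`."""
    return f"{previous_value}:{appended}"


def nvcc_path(cuda_home: str) -> str:
    """Join CUDA_HOME with bin/nvcc (assumed to be how the build finds nvcc)."""
    return posixpath.join(cuda_home, "bin", "nvcc")


# CUDA_HOME was unset, so $CUDA_HOME expands to an empty string.
broken = expand_export("", "/usr/local/cuda")
print(nvcc_path(broken))             # :/usr/local/cuda/bin/nvcc
print(nvcc_path("/usr/local/cuda"))  # /usr/local/cuda/bin/nvcc
```

The first printed path matches the one in the error message, which is why nvcc -V can work (it is found through PATH) while setup.py still fails.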

——————————————————————

Pitfall 2: the g++ version is too new

Same steps as before; this appeared once the CUDA problem was fixed.

Error output:


$ python setup.py develop 
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py:770: UserWarning: Usage of dash-separated 'index-url' will not be supported in future versions. Please use the underscore name 'index_url' instead
  warnings.warn(
running develop
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running egg_info
writing nms_rotated.egg-info/PKG-INFO
writing dependency_links to nms_rotated.egg-info/dependency_links.txt
writing top-level names to nms_rotated.egg-info/top_level.txt
reading manifest file 'nms_rotated.egg-info/SOURCES.txt'
writing manifest file 'nms_rotated.egg-info/SOURCES.txt'
running build_ext
Traceback (most recent call last):
  File "/data/yolov5_obb/utils/nms_rotated/setup.py", line 38, in <module>
    setup(
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py", line 1208, in run_command
    super().run_command(command)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/develop.py", line 114, in install_for_development
    self.run_command('build_ext')
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py", line 1208, in run_command
    super().run_command(command)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
    self.build_extensions()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 434, in build_extensions
    self._check_cuda_version(compiler_name, compiler_version)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 836, in _check_cuda_version
    raise RuntimeError(
RuntimeError: The current installed version of g++ (9.4.0) is greater than the maximum required version by CUDA 10.2 (8.0.0). Please make sure to use an adequate version of g++ (>=5.0.0, <=8.0.0).

Install gcc-7 / g++-7:


sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
sudo apt install g++-7 -y

Set it up so that the gcc and g++ symbolic links point to the newly installed version:


sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 \
                         --slave /usr/bin/g++ g++ /usr/bin/g++-7 
sudo update-alternatives --config gcc
gcc --version
g++ --version

# Run this one if you want **all** the toolchain programs (with the triplet names) to also point to gcc-7.
# For example, this is needed if building Debian packages.
# If you are already root (e.g. inside a docker image), remove the "sudo" below.
ls -la /usr/bin/ | grep -oP "[\S]*(gcc|g\+\+)(-[a-z]+)*[\s]" | xargs sudo bash -c 'for link in ${@:1}; do ln -s -f "/usr/bin/${link}-${0}" "/usr/bin/${link}"; done' 7

(I did not run the last line of the code above.)

Reference: Installing gcc-7 & g++-7 in Ubuntu 16.04LTS Xenial (github.com)
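After switching compilers it can help to check the version against the bounds quoted in the error message before rerunning setup.py. The sketch below compares a version string literally against those bounds (>=5.0.0, <=8.0.0); PyTorch's own check may differ in detail, and the function names are hypothetical.

```python
import re

# Bounds copied from the RuntimeError message for CUDA 10.2.
MIN_GXX = (5, 0, 0)
MAX_GXX = (8, 0, 0)


def parse_gxx_version(version_output: str) -> tuple:
    """Extract the first x.y.z version number from `g++ --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.groups())


def gxx_within_bounds(version_output: str) -> bool:
    """True if the parsed version lies inside the quoted bounds."""
    return MIN_GXX <= parse_gxx_version(version_output) <= MAX_GXX


print(gxx_within_bounds("g++ 9.4.0"))  # False: the version from the error
print(gxx_within_bounds("g++ 7.5.0"))  # True
```

To check the live compiler, the text printed by g++ --version can be passed in, for example the stdout of subprocess.run(["g++", "--version"], capture_output=True, text=True).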

————————————

Pitfall 3: after fixing that, setup.py fails again because the ninja -v command fails

Error output:


/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/include/ATen/Context.h(25): warning: attribute "__visibility__" does not apply here

/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/include/ATen/Context.h(25): warning: attribute "__visibility__" does not apply here

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
    subprocess.run(
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/yolov5_obb/utils/nms_rotated/setup.py", line 38, in <module>
    setup(
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py", line 1208, in run_command
    super().run_command(command)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/develop.py", line 114, in install_for_development
    self.run_command('build_ext')
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py", line 1208, in run_command
    super().run_command(command)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
    self.build_extensions()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
    build_ext.build_extensions(self)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 468, in build_extensions
    self._build_extensions_serial()
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 494, in _build_extensions_serial
    self.build_extension(ext)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 549, in build_extension
    objects = self.compiler.compile(
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

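Running ninja -v by hand also reports an error, although this particular message only says that there is no build.ninja in the current directory: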


# ninja -v
ninja: error: loading 'build.ninja': No such file or directory

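From what I found, the command being invoked may be the problem, so I edited the command that checks the ninja version directly in the file that raised the error (torch/utils/cpp_extension.py in the traceback above). The edit was shown in a screenshot in the original post; the ninja version string that appears in the build output of the next pitfall (1.11.1.git.kitware.jobserver-1) is consistent with the ['ninja', '--version'] workaround suggested in the issue referenced below. Note that with this change ninja only prints its version and compiles nothing, which is what leads to the next pitfall. PS: in vim you can search for 'ninja'; the call appears in only one place.

Reference: subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1 · Issue #2360 · pytorch/vision (github.com)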
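When ninja stops with "subcommand failed", the actual compiler error is usually buried further up among the warnings. A small filter like the one below can pull it out of a saved build log. This is an illustrative sketch: `compiler_errors` is a hypothetical helper, and the "fatal error" line in the sample is borrowed from the title of the yolov5_obb issue cited in the next section, not from my own log.

```python
import re

# Match a standalone "error:" marker, but not Python exception names
# such as "RuntimeError:" where letters precede "Error:".
ERROR_MARKER = re.compile(r"(?<![A-Za-z])error:", re.IGNORECASE)


def compiler_errors(build_log: str) -> list:
    """Return stripped lines that contain a standalone 'error:' marker."""
    return [
        line.strip()
        for line in build_log.splitlines()
        if ERROR_MARKER.search(line)
    ]


sample_log = """\
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign
src/poly_nms_cuda.cu:4:10: fatal error: THC/THC.h: No such file or directory
ninja: build stopped: subcommand failed.
RuntimeError: Error compiling objects for extension
"""

for line in compiler_errors(sample_log):
    print(line)
```

Only the "fatal error" line is printed for this sample; the warning, the ninja summary, and the Python exception line are skipped.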

————————————

Pitfall 4: running setup.py again, this time g++ reports an error


# python setup.py develop
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/dist.py:770: UserWarning: Usage of dash-separated 'index-url' will not be supported in future versions. Please use the underscore name 'index_url' instead
  warnings.warn(
running develop
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running egg_info
writing nms_rotated.egg-info/PKG-INFO
writing dependency_links to nms_rotated.egg-info/dependency_links.txt
writing top-level names to nms_rotated.egg-info/top_level.txt
reading manifest file 'nms_rotated.egg-info/SOURCES.txt'
writing manifest file 'nms_rotated.egg-info/SOURCES.txt'
running build_ext
building '.nms_rotated_ext' extension
Emitting ninja build file /data/yolov5_obb/utils/nms_rotated/build/temp.linux-x86_64-cpython-39/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.11.1.git.kitware.jobserver-1
creating build/lib.linux-x86_64-cpython-39
g++ -pthread -B /data/anaconda3/envs/yolov5_obb/compiler_compat -shared -Wl,-rpath,/data/anaconda3/envs/yolov5_obb/lib -Wl,-rpath-link,/data/anaconda3/envs/yolov5_obb/lib -L/data/anaconda3/envs/yolov5_obb/lib -L/data/anaconda3/envs/yolov5_obb/lib -Wl,-rpath,/data/anaconda3/envs/yolov5_obb/lib -Wl,-rpath-link,/data/anaconda3/envs/yolov5_obb/lib -L/data/anaconda3/envs/yolov5_obb/lib /data/yolov5_obb/utils/nms_rotated/build/temp.linux-x86_64-cpython-39/src/nms_rotated_cpu.o /data/yolov5_obb/utils/nms_rotated/build/temp.linux-x86_64-cpython-39/src/nms_rotated_cuda.o /data/yolov5_obb/utils/nms_rotated/build/temp.linux-x86_64-cpython-39/src/nms_rotated_ext.o /data/yolov5_obb/utils/nms_rotated/build/temp.linux-x86_64-cpython-39/src/poly_nms_cuda.o -L/data/anaconda3/envs/yolov5_obb/lib/python3.9/site-packages/torch/lib -L/usr/local/cuda/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-cpython-39/nms_rotated_ext.cpython-39-x86_64-linux-gnu.so
g++: error: /data/yolov5_obb/utils/nms_rotated/build/temp.linux-x86_64-cpython-39/src/poly_nms_cuda.o: No such file or directory
error: command '/usr/bin/g++' failed with exit code 1

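One suggested fix is to downgrade torch (for example, CUDA 11.4 needs to be paired with torch 1.12). See the issue: [Torch1.11 error] src/poly_nms_cuda.cu:4:10: fatal error: THC/THC.h: No such file or directory · Issue #408 · hukaixuan19970627/yolov5_obb (github.com)

Since the project itself is built on yolov5, which only requires torch>=1.7, it is quite likely that I simply ended up with torch 1.12 when installing torch.

Reinstall torch 1.10; conda automatically downgrades the related packages to matching versions.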


conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=10.2 -c pytorch

Output:


## Package Plan ##

  environment location: /data/anaconda3/envs/yolov5_obb

  added / updated specs:
    - cudatoolkit=10.2
    - pytorch==1.10.1
    - torchaudio==0.10.1
    - torchvision==0.11.2


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pytorch-1.10.1             |py3.9_cuda10.2_cudnn7.6.5_0       768.4 MB  pytorch
    torchaudio-0.10.1          |       py39_cu102         4.5 MB  pytorch
    torchvision-0.11.2         |       py39_cu102         8.7 MB  pytorch
    ------------------------------------------------------------
                                           Total:       781.5 MB

The following NEW packages will be INSTALLED:

  libuv              pkgs/main/linux-64::libuv-1.44.2-h5eee18b_0 

The following packages will be DOWNGRADED:

  pytorch                1.12.1-py3.9_cuda10.2_cudnn7.6.5_0 --> 1.10.1-py3.9_cuda10.2_cudnn7.6.5_0 
  torchaudio                              0.12.1-py39_cu102 --> 0.10.1-py39_cu102 
  torchvision                             0.13.1-py39_cu102 --> 0.11.2-py39_cu102 

References:

PyTorch's guide to choosing a version: Previous PyTorch Versions | PyTorch

This project's requirements file: yolov5_obb/requirements.txt at master · hukaixuan19970627/yolov5_obb (github.com)
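A quick version check before building can catch this mismatch early. The sketch below assumes, based on the title of the issue cited above, that THC/THC.h is no longer available from torch 1.11 onward; `version_tuple` and `has_thc_headers` are hypothetical helper names.

```python
def version_tuple(version: str) -> tuple:
    """Turn '1.12.1' or '1.10.1+cu102' into a comparable (major, minor) tuple."""
    core = version.split("+")[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def has_thc_headers(torch_version: str) -> bool:
    """Assumption from the cited issue: THC/THC.h is missing from torch 1.11 onward."""
    return version_tuple(torch_version) < (1, 11)


print(has_thc_headers("1.12.1"))  # False: the version I had installed
print(has_thc_headers("1.10.1"))  # True: the version after the downgrade
```

In a real environment the installed version can be passed in as torch.__version__.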

Finally!

Running


python setup.py develop

no longer reports any error! (The screenshot in the original post showed only the last part of the output.)

————————————————————————

Pitfall when running the training code: ImportError: /data/yolov5_obb/utils/nms_rotated/nms_rotated_ext.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN3c1015SmallVectorBaseIjE8grow_podEPvmm

Run the training code from the yolov5_obb directory:


python train.py \
  --weights 'weights/yolov5n_s_m_l_x.pt' \
  --data 'data/yolov5obb_demo_split.yaml' \
  --hyp 'data/hyps/obb/hyp.finetune_dota.yaml' \
  --epochs 10 \
  --batch-size 2 \
  --img 1024 \
  --device 0

Error output:


Traceback (most recent call last):
  File "/data/yolov5_obb/train.py", line 34, in <module>
    import val  # for end-of-epoch mAP
  File "/data/yolov5_obb/val.py", line 28, in <module>
    from models.common import DetectMultiBackend
  File "/data/yolov5_obb/models/common.py", line 23, in <module>
    from utils.datasets import exif_transpose, letterbox
  File "/data/yolov5_obb/utils/datasets.py", line 28, in <module>
    from utils.augmentations import Albumentations, augment_hsv, copy_paste, letterbox, mixup, random_perspective
  File "/data/yolov5_obb/utils/augmentations.py", line 12, in <module>
    from utils.general import LOGGER, check_version, colorstr, resample_segments, segment2box
  File "/data/yolov5_obb/utils/general.py", line 35, in <module>
    from utils.nms_rotated import obb_nms
  File "/data/yolov5_obb/utils/nms_rotated/__init__.py", line 1, in <module>
    from .nms_rotated_wrapper import obb_nms, poly_nms
  File "/data/yolov5_obb/utils/nms_rotated/nms_rotated_wrapper.py", line 4, in <module>
    from . import nms_rotated_ext
ImportError: /data/yolov5_obb/utils/nms_rotated/nms_rotated_ext.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN3c1015SmallVectorBaseIjE8grow_podEPvmm

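Looking through the project's issues, the original author replied that this means nms was not compiled properly.

Deleting only the nms_rotated_ext.cpython-39-x86_64-linux-gnu.so file and running the install again still did not work.

Reference: yolov5_obb/utils/nms_rotated/nms_rotated_ext.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN3c1015SmallVectorBaseIjE8grow_podEPvmm · Issue #261 · hukaixuan19970627/yolov5_obb · GitHub

What worked: delete both nms_rotated_ext.cpython-39-x86_64-linux-gnu.so and the build directory under /utils/nms_rotated

Run python setup.py develop again

Then run train again: solved!

Root-cause analysis: if build is not deleted, setup.py skips the ninja build step and reuses what is already there. As far as I understand, that step is where the extension is actually compiled for this Linux system. The leftover object files were most likely compiled against the previously installed torch 1.12.1, which would explain why the .so referenced a c10 symbol that torch 1.10.1 does not provide.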

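(The original post showed two screenshots here, partly redacted to be on the safe side: the setup.py output without deleting build, and the output after deleting build.)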
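The cleanup can be scripted so that no stale artifact is left behind before rebuilding. This is a minimal sketch, assuming the compiled files sit directly under utils/nms_rotated as in the paths above; `clean_extension` is a hypothetical helper, not part of the project.

```python
import shutil
from pathlib import Path


def clean_extension(ext_dir: str) -> list:
    """Delete the build directory and nms_rotated_ext*.so files under ext_dir.

    Returns the names of the entries that were removed.
    """
    root = Path(ext_dir)
    removed = []
    build_dir = root / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        removed.append(build_dir.name)
    for shared_object in sorted(root.glob("nms_rotated_ext*.so")):
        shared_object.unlink()
        removed.append(shared_object.name)
    return removed
```

Calling clean_extension("utils/nms_rotated") from the yolov5_obb directory and then rerunning python setup.py develop forces a full recompile against the currently installed torch.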

————————————————————————

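Pitfall when running the test case (continuing from the previous problem)

After running the training code again, it complained that the pretrained model file was missing.

Find the corresponding .pt model file on the official site and download it.

After that everything runs normally!!!

Good luck, everyone~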
