Deploying RandLA-Net for inference on an NVIDIA Jetson TX2 board

吉良吉影想要平静的生活

Last modified: 2024-03-04 15:18:52

Tags: ubuntu, deep learning, c++

First published: 2024-02-28 11:10:29

Article link: https://blog.csdn.net/Tyrol29/article/details/136340568

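A recent project called for it, so I got hold of a TX2 board to experiment with. This post records the pitfalls I ran into.
Device: NVIDIA Jetson TX2
OS: Ubuntu 18.04
Problem 1:
Running
sudo apt-get update
printed the following message:
N: Skipping acquire of configured file 'main/binary-amd64/Packages' as repository 'https://repo.download.nvidia.com/jetson/common r32.5 InRelease' doesn't support architecture 'amd64'
Solution: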

sudo dpkg --remove-architecture amd64

Then run sudo apt-get update again and the message is gone.
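Before removing an architecture it can help to confirm that it is actually registered as a foreign architecture. Below is a minimal sketch of that check. The helper name is_registered is my own invention for illustration; the text it parses is what dpkg --print-foreign-architectures prints (one architecture per line).

```python
def is_registered(foreign_output, arch):
    """Return True if `arch` appears in the output of
    `dpkg --print-foreign-architectures` (one architecture per line)."""
    registered = {line.strip() for line in foreign_output.splitlines()}
    return arch in registered


if __name__ == "__main__":
    # Hypothetical sample: on the board this string would be captured from
    #   dpkg --print-foreign-architectures
    sample_output = "amd64\n"
    if is_registered(sample_output, "amd64"):
        print("sudo dpkg --remove-architecture amd64")
    else:
        print("amd64 is not registered, nothing to remove")
```

Note that dpkg refuses to remove an architecture while packages of that architecture are still installed, so those would have to be removed first.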

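Problem 2:
Installing some libraries failed with:
E: Unable to locate package
Solution:
Switch the apt sources.
Pick a source that matches the system architecture (arm64 on the TX2) and replace the existing entries with it.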
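To illustrate what "matching the architecture" means: for Ubuntu, arm64 packages are served from the ubuntu-ports tree rather than the main archive that amd64 machines use. The sketch below generates sources.list entries accordingly. The function name and the default component list are my own choices, and the official hosts are used only as an example; a regional mirror works the same way as long as it carries ubuntu-ports.

```python
# Architectures that Ubuntu serves from the "ports" tree instead of the main archive.
PORTS_ARCHS = {"arm64", "armhf", "ppc64el", "s390x", "riscv64"}


def sources_list(arch, codename, components="main restricted universe multiverse"):
    """Build sources.list lines for the given dpkg architecture and release codename."""
    if arch in PORTS_ARCHS:
        base = "http://ports.ubuntu.com/ubuntu-ports"
    else:
        base = "http://archive.ubuntu.com/ubuntu"
    suites = [codename, f"{codename}-updates", f"{codename}-security", f"{codename}-backports"]
    return [f"deb {base} {suite} {components}" for suite in suites]


if __name__ == "__main__":
    # Ubuntu 18.04 has the codename "bionic"; the TX2 reports "arm64"
    # from `dpkg --print-architecture`.
    print("\n".join(sources_list("arm64", "bionic")))
```

The output would go into /etc/apt/sources.list (after backing up the original), followed by sudo apt-get update.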

Problem 3:
Compiling PCL 1.8.0 failed with the following errors:

/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp: In member function ‘virtual void pcl::VLPGrabber::toPointClouds(pcl::HDLGrabber::HDLDataPacket*)’:
/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp:177:21: error: ‘boost::math’ has not been declared
       if (! (boost::math::isnan (xyz.x) || boost::math::isnan (xyz.y) || boost::math::isnan (xyz.z)))
                     ^~~~
/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp:177:51: error: ‘boost::math’ has not been declared
       if (! (boost::math::isnan (xyz.x) || boost::math::isnan (xyz.y) || boost::math::isnan (xyz.z)))
                                                   ^~~~
/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp:177:81: error: ‘boost::math’ has not been declared
       if (! (boost::math::isnan (xyz.x) || boost::math::isnan (xyz.y) || boost::math::isnan (xyz.z)))
                                                                                 ^~~~
/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp:187:26: error: ‘boost::math’ has not been declared
             && ! (boost::math::isnan (dual_xyz.x) || boost::math::isnan (dual_xyz.y) || boost::math::isnan (dual_xyz.z)))
                          ^~~~
/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp:187:61: error: ‘boost::math’ has not been declared
             && ! (boost::math::isnan (dual_xyz.x) || boost::math::isnan (dual_xyz.y) || boost::math::isnan (dual_xyz.z)))
                                                             ^~~~
/home/ko/3rdparty/pcl-pcl-1.8.0/io/src/vlp_grabber.cpp:187:96: error: ‘boost::math’ has not been declared
             && ! (boost::math::isnan (dual_xyz.x) || boost::math::isnan (dual_xyz.y) || boost::math::isnan (dual_xyz.z)))
                                                                                                ^~~~
io/CMakeFiles/pcl_io.dir/build.make:374: recipe for target 'io/CMakeFiles/pcl_io.dir/src/vlp_grabber.cpp.o' failed
make[2]: *** [io/CMakeFiles/pcl_io.dir/src/vlp_grabber.cpp.o] Error 1
CMakeFiles/Makefile2:232: recipe for target 'io/CMakeFiles/pcl_io.dir/all' failed
make[1]: *** [io/CMakeFiles/pcl_io.dir/all] Error 2
Makefile:162: recipe for target 'all' failed
make: *** [all] Error 2

The compiler cannot find the function boost::math::isnan. Add the following include to the failing file, vlp_grabber.cpp:

#include <boost/math/special_functions/fpclassify.hpp>

This resolves the error.
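If the PCL source tree gets unpacked again (after a reflash, for example), the same patch can be applied by script instead of by hand. This is a rough sketch, not part of PCL: ensure_include is a hypothetical helper that inserts the header after the last existing #include line and does nothing if the header is already there.

```python
INCLUDE = "#include <boost/math/special_functions/fpclassify.hpp>"


def ensure_include(source, include_line=INCLUDE):
    """Return `source` with `include_line` added after the last #include, if missing."""
    lines = source.splitlines()
    if any(line.strip() == include_line for line in lines):
        return source  # already patched, leave the text untouched
    include_positions = [i for i, line in enumerate(lines)
                         if line.lstrip().startswith("#include")]
    insert_at = include_positions[-1] + 1 if include_positions else 0
    lines.insert(insert_at, include_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = '#include <pcl/io/vlp_grabber.h>\n\nint main() { return 0; }\n'
    print(ensure_include(demo))
```

To patch the real file, read io/src/vlp_grabber.cpp, pass its text through ensure_include, and write the result back. Placing the line after the last #include is a simplification; if that include sits inside a conditional block, the line is better added by hand near the top of the file.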

Problem 4:
I installed cuDNN and TensorRT to mirror the environment that already works on my Windows machine, where the versions are:

cudnn-windows-x86_64-8.6.0.163_cuda11-archive

TensorRT-8.6.1.6.Windows10.x86_64.cuda-11.8

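Because the TX2 has CUDA 10.2, I downloaded the cuDNN and TensorRT builds for that CUDA version.
cmake and make both went through without problems, but the step in my code that converts the ONNX model to an engine crashed: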

found GPU
device 0: NVIDIA Tegra X2
  computing power: 6.2
found file: /home/ko/workspace/Kpowerline/model/RandlanetPowerline.onnx
need to generate trtmodel
 consume 15 mins about
building : /home/ko/workspace/Kpowerline/model/PowerLine.trtmodel
Segmentation fault (core dumped)

After some searching I learned that the TX2 already ships with cuDNN and TensorRT, with the headers in the following directory:

/usr/include/aarch64-linux-gnu
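One way to check which TensorRT version is preinstalled is to read the version macros from the TensorRT headers. The sketch below parses NV_TENSORRT_MAJOR, NV_TENSORRT_MINOR and NV_TENSORRT_PATCH out of header text. I am assuming the macros live in NvInferVersion.h inside the directory above, which should be verified on the board; the helper name tensorrt_version is my own.

```python
import re


def tensorrt_version(header_text):
    """Extract (major, minor, patch) from the NV_TENSORRT_* macros, or None if absent."""
    found = dict(re.findall(
        r"#define\s+NV_TENSORRT_(MAJOR|MINOR|PATCH)\s+(\d+)", header_text))
    if not {"MAJOR", "MINOR", "PATCH"} <= found.keys():
        return None
    return tuple(int(found[key]) for key in ("MAJOR", "MINOR", "PATCH"))


if __name__ == "__main__":
    # Assumed location; adjust if the header is elsewhere on your system.
    path = "/usr/include/aarch64-linux-gnu/NvInferVersion.h"
    try:
        with open(path) as header:
            print(tensorrt_version(header.read()))
    except FileNotFoundError:
        print("header not found:", path)
```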

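So I removed the environment I had installed myself and ran the program again. It still failed, this time with a different error: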

found GPU
device 0: NVIDIA Tegra X2
  computing power: 6.2
found file: /home/ko/workspace/Kpowerline/model/RandlanetPowerline.onnx
need to generate trtmodel
 consume 15 mins about
building : /home/ko/workspace/Kpowerline/model/PowerLine.trtmodel
/home/ko/workspace/Kpowerline/model/RandlanetPowerline.onnx exist
warning: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: axes
Aborted (core dumped)

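Searching turned up a fix in the article "【模型转换】onnx转tensorrt报错:Attribute not found: axes" ("[Model conversion] ONNX to TensorRT error: Attribute not found: axes").
My guess is that this TensorRT version is too old and does not support some operations, so the ONNX file needs to be modified. The server that generates the ONNX file is currently being repaired, though, so I cannot deal with this for now.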
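One plausible cause, which I have not confirmed for this model: from ONNX opset 13 onward, operators such as Squeeze, Unsqueeze and ReduceSum receive axes as an input instead of an attribute, and an older parser that still looks for the attribute would fail in exactly this way. The sketch below flags the operator types in a model that no longer carry an axes attribute at a given opset. The function name is mine, and the table lists only the operators I am sure about.

```python
# Opset version at which `axes` changed from an attribute to an input.
AXES_BECAME_INPUT = {
    "Squeeze": 13,
    "Unsqueeze": 13,
    "ReduceSum": 13,
    "ReduceMean": 18,
    "ReduceMax": 18,
    "ReduceMin": 18,
    "ReduceProd": 18,
}


def ops_without_axes_attribute(opset, op_types):
    """Return the sorted op types that have no `axes` attribute at this opset."""
    return sorted({op for op in op_types
                   if op in AXES_BECAME_INPUT and opset >= AXES_BECAME_INPUT[op]})


if __name__ == "__main__":
    # Hypothetical op list; with the onnx package installed, the real values could
    # be read from onnx.load(path).opset_import and onnx.load(path).graph.node.
    print(ops_without_axes_attribute(13, ["Gather", "Squeeze", "Unsqueeze", "ReduceMean"]))
```

If the model does turn out to use opset 13 or later with any of these operators, re-exporting it at a lower opset that the parser supports would be the first thing to try once the server is back.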

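Update
The network I need to deploy, RandLA-Net, was originally written in TensorFlow. The deployment options are:
1. The TensorFlow C++ shared library
2. Convert the model to ONNX and run it with ONNX Runtime
3. Convert the model to ONNX and run it with TensorRT

Yesterday I tried building the TensorFlow C++ shared library on the board, but the last step kept failing because of network problems. On top of that, the TensorFlow version I had been using is 2.6.0, which officially requires CUDA 11.0 or later, so whether it would run at all is still unknown.
Next I tried installing the ONNX Runtime library, following an article on the topic. This approach has definitely worked for other people, but the whl file I downloaded yesterday refused to install for one reason or another.
Today I saw that JetPack 4.6.3 supports TensorRT 8.2, so I reflashed the board. After all, if TensorRT works directly, I do not have to change any code.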

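A new Monday. The reflash is done, I have reinstalled LASlib, PCL and the other libraries, and started converting the ONNX file to a trtmodel, which produced the following error:
Problem 5: Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[layers/BatchGather/Reshape_2...layers/BatchGather/GatherV2]}.)
I had seen this error before. Back then it seemed to be a version issue: upgrading TensorRT from 8.4 to 8.6 made it go away. This time the version is pinned at 8.2, so upgrading is not an option. Online sources say that increasing the workspace size fixes it, but raising it to 4096 MB made no difference. While testing I found by accident that lowering the network's max batch size does help: I had set it to 4, and after changing it to 1, build_model completed successfully.
I also found by accident that a workspace size that is either too large or too small triggers this error; only 1024 works, and I do not know why...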
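The trial and error above can be organized as a small search over build settings. This is only a sketch of the bookkeeping: try_build is a hypothetical callback that would wrap the actual engine build and return True on success, and I am assuming the workspace values are in MiB and converted to bytes before being handed to the builder.

```python
from itertools import product

MIB = 1 << 20  # bytes per MiB


def candidate_settings(batch_sizes=(4, 2, 1), workspaces_mib=(4096, 2048, 1024, 512)):
    """Yield every (max batch size, workspace size) combination, largest batch first."""
    for batch, workspace in product(batch_sizes, workspaces_mib):
        yield {"max_batch_size": batch, "workspace_bytes": workspace * MIB}


def first_working(settings, try_build):
    """Return the first setting for which try_build(setting) is truthy, else None."""
    for setting in settings:
        if try_build(setting):
            return setting
    return None


if __name__ == "__main__":
    # Stand-in for a real build, mimicking what I observed on the board:
    # only batch size 1 with a 1024 MiB workspace succeeds.
    def fake_build(setting):
        return (setting["max_batch_size"] == 1
                and setting["workspace_bytes"] == 1024 * MIB)

    print(first_working(candidate_settings(), fake_build))
```

In practice each build takes a long time on the TX2 (the program's own log estimates about 15 minutes), and a failed build may abort the process rather than return cleanly, so running each attempt in a separate process is probably safer.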
