The Jetson family of devices (Nano, TX2, AGX Xavier) all use mobile CPUs on the ARM aarch64 architecture, which makes them orphans in the tooling ecosystem: even Anaconda does not run on them.
libtorch is PyTorch's C++ API; for deployment inference it is noticeably faster than the equivalent Python code. On Intel or AMD (x86) CPUs you can download prebuilt libraries straight from the official site and wire them up with CMake easily. Those prebuilt .so files cannot be linked on Jetson, however, so normally you would have to build them yourself.
Fortunately NVIDIA publishes PyTorch install files for Jetson, and they already include the libraries and link files that libtorch needs, so there is nothing to download and compile by hand.
Installing PyTorch and libtorch
When installing PyTorch, be sure to follow the official instructions and install from the provided wheel.
Below is the installation for PyTorch 1.3. libtorch is a tool under rapid, active development, and the API differs considerably between PyTorch versions, so install the latest release (currently 1.3): the functions are cleaner and a pointer-overflow bug from earlier versions has been fixed.
For Python 3.6:

```shell
wget https://nvidia.box.com/shared/static/phqe92v26cbhqjohwtvxorrwnmrnfx1o.whl -O torch-1.3.0-cp36-cp36m-linux_aarch64.whl
pip3 install numpy torch-1.3.0-cp36-cp36m-linux_aarch64.whl
```
(Prefix the commands with sudo if you hit permission errors.)
If the download is slow, fetch the .whl on a local machine first (through a proxy if necessary) and then copy it over to the board.
After a successful install you will find the familiar libtorch.so and a series of other .so files under ~/.local/lib/python3.6/site-packages/torch/lib
or /usr/local/lib/python3.6/dist-packages/torch/lib
(for an install done as root); these are the libraries we will link against shortly.
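As a quick sanity check (assuming the wheel installed correctly for this Python interpreter), you can locate that lib folder programmatically instead of guessing between the two paths:

```python
# Sanity check: find the torch lib folder and list the shared libraries in it.
# Assumes a working PyTorch install for this Python interpreter.
import os
import torch

libdir = os.path.join(os.path.dirname(torch.__file__), "lib")
sos = sorted(f for f in os.listdir(libdir) if f.endswith(".so"))
print(libdir)
print(sos)
```

If libtorch.so and its companions show up in the listing, the wheel install is good to link against.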
## Building and installing OpenCV 4.0.0 and opencv_contrib 4.0.0
First, write install_opencv4.0.0_Nano.sh:
```shell
#!/bin/bash
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <Install Folder>"
    exit 1
fi
folder="$1"

echo "** Install requirements"
sudo apt-get update
sudo apt-get install -y build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
sudo apt-get install -y python2.7-dev python3.6-dev python-dev python-numpy python3-numpy
sudo apt-get install -y libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
sudo apt-get install -y libv4l-dev v4l-utils qv4l2 v4l2ucp
sudo apt-get install -y curl
sudo apt-get update

echo "** Download opencv-4.0.0"
cd "$folder"
curl -L https://github.com/opencv/opencv/archive/4.0.0.zip -o opencv-4.0.0.zip
curl -L https://github.com/opencv/opencv_contrib/archive/4.0.0.zip -o opencv_contrib-4.0.0.zip
unzip opencv-4.0.0.zip
unzip opencv_contrib-4.0.0.zip
cd opencv-4.0.0/

echo "** Building..."
mkdir release
cd release/
# CUDA_ARCH_BIN="5.3" targets the Nano; use 6.2 for TX2 or 7.2 for AGX Xavier
cmake -D WITH_CUDA=ON -D CUDA_ARCH_BIN="5.3" -D CUDA_ARCH_PTX="" \
      -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.0.0/modules \
      -D WITH_GSTREAMER=ON -D WITH_LIBV4L=ON \
      -D BUILD_opencv_python2=ON -D BUILD_opencv_python3=ON \
      -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF \
      -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
make -j3
sudo make install
sudo apt-get install -y python-opencv python3-opencv
echo "** Installed opencv-4.0.0 successfully"
```
Then run the script:

```shell
./install_opencv4.0.0_Nano.sh [folder you want to install OpenCV in]
```
Configuring the environment
First create a project directory containing the .cpp/.h files you want to compile. Inside it, create a folder named libtorch, and copy into it the lib folder produced by the install above (~/.local/lib/python3.6/site-packages/torch/lib
or /usr/local/lib/python3.6/dist-packages/torch/lib).
Next, write CMakeLists.txt:
```cmake
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(ImageSeg) # project name, pick whatever you like
include_directories(/usr/local/include)
include_directories(/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include)
include_directories(/usr/local/lib/python3.6/dist-packages/torch/include)
link_directories(/usr/local/lib)
link_directories(/usr/local/lib/python3.6/dist-packages/torch/lib)
set(CMAKE_PREFIX_PATH /usr/local/lib/python3.6/dist-packages/torch)
set(Boost_USE_MULTITHREADED ON)
set(Torch_DIR /usr/local/lib/python3.6/dist-packages/torch)
find_package(Torch REQUIRED)
find_package(OpenCV REQUIRED)
add_executable(ImageSeg /data/libtorch/run.cpp model.cpp) # the .cpp files to compile
target_link_libraries(ImageSeg "${TORCH_LIBRARIES}")
target_link_libraries(ImageSeg "${OpenCV_LIBRARIES}")
set_property(TARGET ImageSeg PROPERTY CXX_STANDARD 11)
```
Code and compilation
Below is my project code, a simple semantic segmentation model. mm.pt
is the model exported from PyTorch on the Python side; you can see an example here.
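For orientation, this is a hedged sketch of how a model like mm.pt can be exported with torch.jit.trace. TinySeg is a hypothetical stand-in for the real segmentation network (which is not shown here); the tuple return mirrors what the C++ code below unpacks:

```python
# Hedged sketch: exporting a TorchScript model the way mm.pt could have been
# produced. TinySeg is a hypothetical stand-in for the real network.
import torch

class TinySeg(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, kernel_size=1)

    def forward(self, x):
        # return a tuple, matching the toTuple()->elements()[0] call in the C++ code
        return (self.conv(x),)

example = torch.rand(1, 3, 512, 512)
traced = torch.jit.trace(TinySeg(), example)
traced.save("mm.pt")

# reload to confirm the archive loads the way torch::jit::load will
loaded = torch.jit.load("mm.pt")
out = loaded(example)
print(out[0].shape)
```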
run.cpp:
```cpp
#include <iostream>
#include <opencv2/opencv.hpp>
#include <string>
#include <torch/script.h>
#include "model.h"

int main(int argc, const char* argv[]) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " <image>" << std::endl;
        return 1;
    }
    std::string image = argv[1];
    torch::jit::script::Module module = torch::jit::load("/data/libtorch/mm.pt");
    // Predict
    std::string ok = infer(image, module);
    return 0;
}
```
model.h:
```cpp
#ifndef INFER_H // include guard: avoids declaring the function more than once
#define INFER_H
#include <torch/torch.h>
#include <torch/script.h>
#include <iostream>
#include <vector>
#include <string>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/opencv.hpp>

std::string infer(
    std::string,
    torch::jit::script::Module);
#endif
```
model.cpp:
```cpp
#include "model.h"
#include <cstring> // std::memcpy
#include <ctime>

std::string infer(std::string filedir, torch::jit::script::Module module)
{
    // move the model to the GPU
    module.to(at::kCUDA);

    cv::Mat image = cv::imread(filedir);
    std::cout << image.rows << " " << image.cols << " " << image.channels() << std::endl;
    cv::cvtColor(image, image, cv::COLOR_BGR2RGB);
    cv::Mat img_float;
    image.convertTo(img_float, CV_32F, 1.0 / 255);
    cv::resize(img_float, img_float, cv::Size(512, 512));

    // cv::Mat to tensor; this interface changed considerably from earlier versions
    auto img_tensor = torch::from_blob(img_float.data, {1, 512, 512, 3}).permute({0, 3, 1, 2}).to(torch::kCUDA);

    clock_t start = clock(); // start timing
    torch::Tensor output = module.forward({img_tensor}).toTuple()->elements()[0].toTensor();
    // ->elements()[0] because my model returns a tuple (image,) on the Python side
    clock_t stop = clock(); // stop timing
    double endtime = (double)(stop - start) / CLOCKS_PER_SEC;
    std::cout << "Total time:" << endtime * 1000 << "ms" << std::endl; // in ms

    auto out_tensor = output.argmax(1).squeeze();
    out_tensor = out_tensor.mul(126).to(torch::kU8).to(torch::kCPU);
    cv::Mat resultImg(512, 512, CV_8U);
    // element size is sizeof(uint8_t); sizeof(torch::kU8) would take the size of the enum
    std::memcpy((void *)resultImg.data, out_tensor.data_ptr(), out_tensor.numel() * sizeof(uint8_t));
    // cv::Size takes (width, height), i.e. (cols, rows)
    cv::resize(resultImg, resultImg, cv::Size(image.cols, image.rows));
    cv::imwrite("result.jpg", resultImg); // save the segmentation result
    return "OK";
}
```
Then build:

```shell
mkdir build
cd build
cmake .. && make
```

Inference takes roughly 3 s per image.
Pitfalls
- At one point I had added the needed lib and bin paths to CMakeLists.txt by hand and commented out find_package(Torch REQUIRED), and kept getting a very long error: undefined reference to torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&). Keep find_package(Torch REQUIRED) in.
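A minimal core that avoids this error looks like the following (a sketch only; paths assume the dist-packages install location used earlier):

```cmake
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(Example)
# Point CMake at the torch install. find_package(Torch REQUIRED) must stay:
# it defines TORCH_LIBRARIES and the required compile flags, which
# hand-listing lib and bin paths does not replace.
set(CMAKE_PREFIX_PATH /usr/local/lib/python3.6/dist-packages/torch)
find_package(Torch REQUIRED)
add_executable(Example main.cpp)
target_link_libraries(Example "${TORCH_LIBRARIES}")
set_property(TARGET Example PROPERTY CXX_STANDARD 11)
```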
- When building a project that uses libtorch, make sure there is a libtorch folder in the same directory as the build folder, containing the lib folder we copied out earlier.
- Many interfaces changed in libtorch 1.3, and remember: once you have the result tensor, call to(torch::kCPU) on it before saving the image.