Building and Deploying libtorch on the Jetson Series
Installation
- Check the device's CUDA version (output of `nvcc --version`):

```text
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_28_22:34:44_PST_2021
Cuda compilation tools, release 10.2, V10.2.300
Build cuda_10.2_r440.TC440_70.29663091_0
```
- The prebuilt packages from the official download page (Start Locally | PyTorch) cannot be used on the Jetson series; installation fails with a version mismatch, so you need a build made for Jetson.
- Find the build matching your JetPack version on the NVIDIA forum: [PyTorch for Jetson - version 1.9.0 now available].
- Install the dependencies:

```shell
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
```
- Install PyTorch from the downloaded wheel:

```shell
sudo pip3 install torch-1.9.0-cp36-cp36m-linux_aarch64.whl
```
- Check the installed version:

```shell
pip3 list | grep torch
```
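As an optional sanity check from Python, you can confirm that the wheel imports and sees the GPU (this is just a verification snippet, not part of the install):

```python
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # On a Jetson this should report the integrated GPU.
    print("device:", torch.cuda.get_device_name(0))
```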
Model Conversion
- Convert a PyTorch model to a TorchScript model for libtorch (.pth -> .pt):
```python
import torch

def convert_model(model, path_model, path_out, img_tensor, running_mode='gpu'):
    if running_mode == 'gpu' and torch.cuda.is_available():
        print("gpu")
        device = torch.device("cuda:0")
        model = model.cuda(device)  # model.to(device)
        model.load_state_dict(torch.load(path_model), strict=False)
        img_tensor = img_tensor.to(device)
    else:
        print("cpu")
        model.load_state_dict(torch.load(path_model, map_location='cpu'), strict=False)
    model.eval()
    traced_script_module = torch.jit.trace(model, img_tensor)
    traced_script_module.save(path_out)
```
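A minimal end-to-end sketch of calling this function, using a hypothetical toy module (`TinyNet`, the file names, and the 70×120 input size are placeholders; the function is repeated so the snippet runs standalone):

```python
import torch
import torch.nn as nn

def convert_model(model, path_model, path_out, img_tensor, running_mode='gpu'):
    # Same logic as above.
    if running_mode == 'gpu' and torch.cuda.is_available():
        device = torch.device("cuda:0")
        model = model.cuda(device)
        model.load_state_dict(torch.load(path_model), strict=False)
        img_tensor = img_tensor.to(device)
    else:
        model.load_state_dict(torch.load(path_model, map_location='cpu'), strict=False)
    model.eval()
    torch.jit.trace(model, img_tensor).save(path_out)

# Hypothetical toy model standing in for the real network.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, x):
        return self.conv(x).relu().mean(dim=(2, 3))

model = TinyNet()
torch.save(model.state_dict(), "tiny.pth")  # stand-in for the trained .pth file
dummy = torch.rand(1, 3, 70, 120)           # B, C, H, W -- matches the C++ input size below
convert_model(model, "tiny.pth", "tiny.pt", dummy, running_mode='cpu')

# Reload the exported TorchScript file to confirm the trace works.
print(torch.jit.load("tiny.pt")(dummy).shape)
```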
Testing
Test environment
- jetson nano 4G
- torch 1.10.0
- torchvision 0.11
- JetPack 4.6
- Ubuntu 18.04
- cuda 10.2
Makefile
```makefile
TARGET   := demo
SOURCE   := main.cpp
OBJS     := main.o
CXX      := g++
LIBS     := -lc10 -lc10_cuda -ltorch_cuda -ltorch -lshm -ltorch_cpu
LDFLAGS  := -L/usr/local/lib/python3.6/dist-packages/torch/lib
DEFINES  :=
INCLUDE  := -I./ -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include
CFLAGS   := -g -Wall -O3 $(DEFINES) $(INCLUDE) -fPIC
PKGS     := opencv4
LIBS     += `pkg-config --libs $(PKGS)`
CFLAGS   += `pkg-config --cflags $(PKGS)`
CXXFLAGS := $(CFLAGS) -DHAVE_CONFIG_H -std=c++14 -Wunused-function -Wunused-variable -Wfatal-errors
# Keep libtorch_cuda linked even though no symbol references it directly.
CXXFLAGS += -Wl,--no-as-needed -ltorch_cuda

.PHONY : everything objs clean veryclean rebuild

everything : $(TARGET)

all : $(TARGET)

objs : $(OBJS)

rebuild : veryclean everything

clean :
	rm -rf $(OBJS)
	rm -rf *.o

veryclean : clean
	rm -rf $(TARGET)

%.o : %.cpp
	$(CXX) $(CXXFLAGS) -c $< -o $@

$(TARGET) : $(OBJS)
	$(CXX) $(CXXFLAGS) -o $@ $(OBJS) $(LDFLAGS) $(LIBS)
```
main.cpp
```cpp
#include <iostream>
#include <opencv2/opencv.hpp>
#include <torch/torch.h>
#include <torch/script.h>

int main(){
    cv::Mat img, img_tmp, img_float;
    const char *model_path = "test.pt";
    torch::jit::script::Module module = torch::jit::load(model_path);
    module.to(at::kCUDA);

    const char *img_path = "1.jpg";
    img = cv::imread(img_path, -1);
    if(img.empty()){
        printf("error open img file:[%s]\n", img_path);
        return -1;
    }

    cv::cvtColor(img, img_tmp, cv::COLOR_BGR2RGB);
    cv::resize(img_tmp, img_tmp, cv::Size(120, 70));
    img_tmp.convertTo(img_float, CV_32F, 1.0 / 255);  // -> (0, 1)

    // batchsize, rows, cols, channels: B, H, W, C
    torch::Tensor img_tensor = torch::from_blob(img_float.data, {1, img_float.rows, img_float.cols, img_float.channels()}, torch::kFloat32);
    img_tensor = img_tensor.permute({0, 3, 1, 2});  // -> B, C, H, W

    float mean_[] = {0.5, 0.5, 0.5};
    float std_[] = {0.5, 0.5, 0.5};
    for(int i = 0; i < 3; i++){  // normalize -> (-1, 1)
        img_tensor[0][i] = img_tensor[0][i].sub_(mean_[i]).div_(std_[i]);
    }

    torch::Tensor img_tensor_cuda = img_tensor.cuda();
    torch::Tensor result = module.forward({img_tensor_cuda}).toTensor();
    auto max_result = result.max(1, true);
    auto max_ind = std::get<1>(max_result).item<int64_t>();  // index of the max score
    std::cout << max_ind << std::endl;

    std::cout << "CUDA: " << torch::cuda::is_available() << std::endl;
    std::cout << "CUDNN: " << torch::cuda::cudnn_is_available() << std::endl;
    std::cout << "GPU(s): " << torch::cuda::device_count() << std::endl;
    return 0;
}
```
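The normalization loop above maps each channel from [0, 1] to (-1, 1), since (x - 0.5) / 0.5 = 2x - 1. A quick check of the arithmetic in plain Python (the helper function is illustrative, not part of the pipeline):

```python
# Mirror of the C++ preprocessing arithmetic: scale by 1/255, then (x - mean) / std.
def normalize_pixel(p, mean=0.5, std=0.5):
    x = p / 255.0            # convertTo(..., CV_32F, 1.0 / 255) -> (0, 1)
    return (x - mean) / std  # sub_(mean).div_(std) -> (-1, 1)

print(normalize_pixel(0))    # -1.0
print(normalize_pixel(255))  # 1.0
```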
Test results
- Tested with 100 images of 128 × 72 pixels.
| | CPU | Memory | GPU | Time (s) |
|---|---|---|---|---|
| libtorch | 50% | 62.6% | 46% | 24.021 |
| pytorch | 390% | 55.2% | 50% | 32.384 |
what(): PyTorch is not linked with support for cuda devices
- See "Error: 'PyTorch is not linked with support for cuda devices'" - C++ - PyTorch Forums.
- Tell the linker not to drop the `torch_cuda` library even though no symbol references it directly, by adding:

```makefile
CXXFLAGS += -Wl,--no-as-needed -ltorch_cuda
```
conversion to non-scalar type torch::jit::load("model.pt")
- See "conversion to non-scalar type torch::jit::load(\"model.pt\")" · Issue #22382 · pytorch/pytorch · GitHub.
- The return type of `torch::jit::load` changed and is no longer a pointer:

```cpp
std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(model_path);
```

- It must be changed to:

```cpp
torch::jit::script::Module module = torch::jit::load(model_path);
```
cannot open shared object file: No such file or directory
- The build succeeds, but the binary fails at runtime.
- Add the libtorch lib directory to the loader search path:

```shell
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.6/dist-packages/torch/lib
```