Compiling and Deploying libtorch on the Jetson Series


Installation

  • Check the device's CUDA version (nvcc -V)
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_28_22:34:44_PST_2021
Cuda compilation tools, release 10.2, V10.2.300
Build cuda_10.2_r440.TC440_70.29663091_0
sudo apt-get update 
sudo apt-get upgrade
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip3 install Cython
  • Install PyTorch (prebuilt aarch64 wheel for Jetson)
sudo pip3 install torch-1.9.0-cp36-cp36m-linux_aarch64.whl
  • Check the installed version
pip3 list | grep torch
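
A quick sanity check (a minimal sketch; the device name printed depends on the Jetson model) is to confirm from Python that the installed wheel was built with CUDA support and can see the GPU:

# verify that torch is importable and CUDA is usable on the Jetson
import torch

print(torch.__version__)              # installed torch version
print(torch.cuda.is_available())      # should print True once JetPack's CUDA is set up
print(torch.cuda.get_device_name(0))  # name of the Jetson's integrated GPU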

Model Conversion

  • Convert a PyTorch model (.pth) to a libtorch TorchScript model (.pt)
import torch

def convert_model(model, path_model, path_out, img_tensor, running_mode='gpu'):
    """Load weights from path_model, trace the model with img_tensor, and save the TorchScript module to path_out."""
    if running_mode == 'gpu' and torch.cuda.is_available():
        print("gpu")
        device = torch.device("cuda:0")
        model = model.cuda(device)      # equivalent to model.to(device)
        model.load_state_dict(torch.load(path_model), strict=False)
        img_tensor = img_tensor.to(device)
    else:
        print("cpu")
        # device = torch.device('cpu')
        model.load_state_dict(torch.load(path_model, map_location='cpu'), strict=False)
    model.eval()
    traced_script_module = torch.jit.trace(model, img_tensor)
    traced_script_module.save(path_out)
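
A minimal sketch of calling this function, assuming a torchvision ResNet-18 as a placeholder network and a checkpoint named model.pth (both names are illustrative, not from the original post):

import torch
import torchvision

model = torchvision.models.resnet18(num_classes=10)   # placeholder classifier
dummy_input = torch.randn(1, 3, 70, 120)              # B, C, H, W; matches the 120x70 resize used in main.cpp below
convert_model(model, 'model.pth', 'test.pt', dummy_input, running_mode='gpu')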

Testing

Test environment

  • jetson nano 4G
  • torch 1.10.0
  • torchvision 0.11
  • JetPack 4.6
  • ubuntu 18.04
  • cuda 10.2

Makefile

TARGET  := demo
SOURCE := main.cpp
OBJS := main.o
CXX      := g++
LIBS    := -lc10 -lc10_cuda -ltorch_cuda -ltorch -lshm -ltorch_cpu
LDFLAGS := -L/usr/local/lib/python3.6/dist-packages/torch/lib
DEFINES := 
INCLUDE := -I./ -I/usr/local/lib/python3.6/dist-packages/torch/include -I/usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include
CFLAGS  := -g -Wall -O3 $(DEFINES) $(INCLUDE) -fPIC
PKGS	:= opencv4
LIBS	+= `pkg-config --libs $(PKGS)`
CFLAGS	+= `pkg-config --cflags $(PKGS)`
CXXFLAGS:= $(CFLAGS) -DHAVE_CONFIG_H -std=c++14 -Wunused-function -Wunused-variable -Wfatal-errors
CXXFLAGS += -Wl,--no-as-needed -ltorch_cuda

.PHONY : everything objs clean veryclean rebuild
everything : $(TARGET)
all : $(TARGET) 
objs : $(OBJS) 
rebuild: veryclean everything
 
clean :
	rm -rf $(OBJS)
	rm -rf *.o
veryclean : clean
	rm -fr $(TARGET)
%.o : %.cpp
	$(CXX) $(CXXFLAGS) -c $< -o $@
$(TARGET) : $(OBJS)
	$(CXX) $(CXXFLAGS) -o $@ $(OBJS) $(LDFLAGS) $(LIBS)
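
With this Makefile in the same directory as main.cpp, the build and run steps would look like the following (the LD_LIBRARY_PATH export is explained in the troubleshooting notes below; the torch paths assume the pip-installed wheel from the installation section):

make
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.6/dist-packages/torch/lib
./demo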

main.cpp

#include <iostream>
#include <opencv2/opencv.hpp>
#include <torch/torch.h>
#include <torch/script.h>

int main(){
    cv::Mat img, img_tmp, img_float;

    const char *model_path = "test.pt";

    torch::jit::script::Module module = torch::jit::load(model_path);
    module.to(at::kCUDA);

    const char *img_path = "1.jpg";
    img = cv::imread(img_path, -1);
    if(img.empty()){
        printf("error open img file:[%s]\n", img_path);
        return -1;
    }

    cv::cvtColor(img, img_tmp, cv::COLOR_BGR2RGB);
    cv::resize(img_tmp, img_tmp, cv::Size(120, 70));
    img_tmp.convertTo(img_float, CV_32F, 1.0 / 255); // ->(0,1)
    // batchsize rows, cols, channels,   B,H,W,C
    torch::Tensor img_tensor = torch::from_blob(img_float.data, {1, img_float.rows, img_float.cols, img_float.channels()}, torch::kFloat32);
    img_tensor = img_tensor.permute({0, 3, 1, 2});  // ->B, C, H, W

    float mean_[] = {0.5, 0.5, 0.5};
    float std_[] = {0.5, 0.5, 0.5};
    for(int i = 0; i < 3; i++){ // normalize->(-1,1)
        img_tensor[0][i] = img_tensor[0][i].sub_(mean_[i]).div_(std_[i]);
    }
    torch::Tensor img_tensor_cuda = img_tensor.cuda();
    torch::Tensor result = module.forward({img_tensor_cuda}).toTensor();

    auto max_result = result.max(1, true);
    auto max_ind = std::get<1>(max_result).item<float>();
    std::cout << max_ind << std::endl;

    std::cout << "CUDA:   " << torch::cuda::is_available() << std::endl;
    std::cout << "CUDNN:  " << torch::cuda::cudnn_is_available() << std::endl;
    std::cout << "GPU(s): " << torch::cuda::device_count() << std::endl;

    return 0;
}

Test results

  • Tested with 100 images of 128 × 72

              CPU     Memory   GPU    Time (s)
  libtorch    50%     62.6%    46%    24.021
  pytorch     390%    55.2%    50%    32.384

Common errors

what(): PyTorch is not linked with support for cuda devices

  • The linker drops libtorch_cuda unless it is forced in; add the following to the Makefile (already included above):
CXXFLAGS += -Wl,--no-as-needed -ltorch_cuda

conversion to non-scalar type when calling torch::jit::load("model.pt")

  • In recent libtorch versions torch::jit::load returns the module by value rather than a shared_ptr, so
std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(model_path);
  • needs to be changed to
torch::jit::script::Module module = torch::jit::load(model_path);

cannot open shared object file: No such file or directory

  • The program compiles but fails to run
  • Add libtorch's lib directory to the dynamic loader path:
   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/python3.6/dist-packages/torch/lib

Reference links

Start Locally | PyTorch

C++ — PyTorch 1.10.1 documentation

Library API — PyTorch master documentation
