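I. Environment Setup
Jetson-series accelerators such as the Nano and Xavier come with TensorRT already installed after flashing. You can check that it is installed with the following command: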
dpkg -l | grep TensorRT
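If it is installed, the command lists the TensorRT packages.
However, import tensorrt may fail at first inside a conda environment, because the environment cannot see the system-wide TensorRT Python module.
First, find where tensorrt.so is located: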
sudo find / -iname "tensorrt.so"
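On my machine it is at /usr/lib/python3.6/dist-packages/tensorrt/tensorrt.so.
Then change into the site-packages directory of your own conda environment (mine is /usr/local/archiconda3/envs/pytorch/lib/python3.6/site-packages) and create a symbolic link: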
ln -s /usr/lib/python3.6/dist-packages/tensorrt/tensorrt.so tensorrt.so
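The same link can also be created from Python, which is handy if you set up several environments. This is only a minimal sketch: link_tensorrt is a hypothetical helper name, and the two paths in the usage comment are the ones from this post, so adjust them to your own machine.

```python
import os

def link_tensorrt(so_path, site_packages):
    """Create site_packages/<basename of so_path> as a symlink to so_path."""
    link_path = os.path.join(site_packages, os.path.basename(so_path))
    if os.path.islink(link_path) or os.path.exists(link_path):
        return link_path  # leave an existing file or link untouched
    os.symlink(so_path, link_path)
    return link_path

# Usage with the paths from this post (assumed, change them for your setup):
# link_tensorrt(
#     "/usr/lib/python3.6/dist-packages/tensorrt/tensorrt.so",
#     "/usr/local/archiconda3/envs/pytorch/lib/python3.6/site-packages",
# )
```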
After this, import tensorrt works inside the conda environment.
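A quick way to confirm this from inside the activated environment is sketched below. module_available is a hypothetical helper name; it only checks whether the interpreter can find the module, and the version is printed only when the import actually succeeds.

```python
import importlib.util

def module_available(name):
    """Return True if `name` can be found on this interpreter's import path."""
    return importlib.util.find_spec(name) is not None

if module_available("tensorrt"):
    import tensorrt
    print("tensorrt", tensorrt.__version__)
else:
    print("tensorrt is not visible from this environment")
```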
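The conversion step also needs the trtexec binary. It took me a long time to find it; it lives in /usr/src/tensorrt/bin.
It still cannot be called directly by name, so add its directory to PATH in your .bashrc file: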
export PATH=/usr/src/tensorrt/bin:$PATH
Remember to run source ~/.bashrc afterwards so the change takes effect in the current shell.
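If you launch trtexec from a Python script rather than from the shell, the same PATH change can be applied to the running process only. The sketch below assumes the /usr/src/tensorrt/bin location used above; prepend_to_path is a hypothetical helper name.

```python
import os
import shutil

def prepend_to_path(directory, path_value):
    """Return path_value with directory placed first, without duplicating it."""
    parts = [p for p in path_value.split(os.pathsep) if p and p != directory]
    return os.pathsep.join([directory] + parts)

# Affects only this Python process and its children, not your shell.
os.environ["PATH"] = prepend_to_path(
    "/usr/src/tensorrt/bin", os.environ.get("PATH", "")
)
print(shutil.which("trtexec"))  # None if the binary is not in that directory
```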
The environment is now basically ready!
II. File Conversion
Run the following in the directory that contains the .onnx file:
.onnx to .trt
import os
import sys

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
model_path = 'XXX.onnx'
engine_file_path = 'XXX.trt'

# The ONNX parser needs a network created with the explicit-batch flag.
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)

# These builder attributes and build_cuda_engine() belong to the older builder
# API of the TensorRT release used here; newer releases use a builder config.
builder.max_workspace_size = 1 << 28  # 256 MiB of workspace
builder.max_batch_size = 1

if not os.path.exists(model_path):
    print('ONNX file {} not found.'.format(model_path))
    sys.exit(1)

print('Loading ONNX file from path {}...'.format(model_path))
with open(model_path, 'rb') as model:
    print('Beginning ONNX file parsing')
    if not parser.parse(model.read()):
        print('ERROR: Failed to parse the ONNX file.')
        for error in range(parser.num_errors):
            print(parser.get_error(error))
        sys.exit(1)

# Without the next two lines, the generated engine was None.
last_layer = network.get_layer(network.num_layers - 1)
network.mark_output(last_layer.get_output(0))

network.get_input(0).shape = [1, 3, 680, 680]  # change this to your own input size
print('Completed parsing of ONNX file')

engine = builder.build_cuda_engine(network)
if engine is None:
    print('ERROR: Failed to build the engine.')
    sys.exit(1)

with open(engine_file_path, 'wb') as f:
    f.write(engine.serialize())
print('save trt success!!')
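To check that the saved file is usable, you can read it back with the TensorRT runtime. This is a minimal sketch rather than part of the conversion itself: load_engine is a hypothetical helper name, and tensorrt is imported inside the function so that the missing-file check works even where TensorRT is not installed.

```python
import os

def load_engine(engine_file_path):
    """Deserialize an engine file that was written with engine.serialize()."""
    if not os.path.isfile(engine_file_path):
        raise FileNotFoundError(engine_file_path)
    import tensorrt as trt  # lazy import, only needed once the file exists
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(engine_file_path, "rb") as f:
        # Returns None if the bytes cannot be deserialized on this device.
        return runtime.deserialize_cuda_engine(f.read())

# Usage on the Jetson, after the conversion above has run:
# engine = load_engine("XXX.trt")
# print(engine is not None)
```

Keep in mind that an engine file is normally only usable with the TensorRT version and GPU it was built with, so run this check on the same device.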
.onnx to .engine
trtexec --onnx=XXX.onnx --saveEngine=XXX.trt
You can append --int8 or --fp16 to the command to specify the precision.
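If you convert several models, the command line can be assembled in Python. The sketch below uses only the flags mentioned in this post (--onnx, --saveEngine, --fp16, --int8); build_trtexec_cmd is a hypothetical helper name, and trtexec is assumed to be on PATH as configured earlier.

```python
import shlex

def build_trtexec_cmd(onnx_path, engine_path, precision=None):
    """Build the trtexec argument list; precision is None, "fp16" or "int8"."""
    if precision not in (None, "fp16", "int8"):
        raise ValueError("precision must be None, 'fp16' or 'int8'")
    cmd = ["trtexec", "--onnx=" + onnx_path, "--saveEngine=" + engine_path]
    if precision is not None:
        cmd.append("--" + precision)
    return cmd

cmd = build_trtexec_cmd("XXX.onnx", "XXX.trt", precision="fp16")
print(" ".join(shlex.quote(arg) for arg in cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```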