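This post documents converting a PyTorch model to ONNX, converting the ONNX model to a TensorRT engine with onnx-tensorrt, and then running forward inference with the resulting TensorRT file.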
1. pytorch2onnx
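PyTorch supports ONNX well: an ONNX file can be exported directly with torch.onnx.export.
- pytorch2onnx: converts a PyTorch model to ONNX. For simplicity this uses resnet18 from torchvision.
- test_pytorch2onnx: checks whether the PyTorch model and the ONNX model produce the same output, which tells us whether the conversion succeeded. This needs onnxruntime, which can be installed with pip:
pip install onnx # optional
pip install onnxruntime
PS: Classic models can usually be converted with torch.onnx.export directly. For custom models, ONNX may not support every op, in which case you have to write the conversion for that op yourself. I will keep adding notes on this later.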
import torch
import numpy as np
import torchvision
import onnxruntime

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

def pytorch2onnx(output_onnx='resnet18.onnx'):
    torch_model = torchvision.models.resnet18(pretrained=True)
    torch_model.eval()  # the model must be in inference mode, otherwise the test results will not match
    # Input to the model
    x = torch.randn(1, 3, 224, 224, requires_grad=True)
    # Export the model
    torch.onnx.export(torch_model,                # model being run
                      x,                          # model input (or a tuple for multiple inputs)
                      output_onnx,                # where to save the model (can be a file or file-like object)
                      verbose=True,
                      export_params=True,         # store the trained parameter weights inside the model file
                      opset_version=10,           # the ONNX opset version to export the model to
                      do_constant_folding=True,   # whether to execute constant folding for optimization
                      input_names=['input'],      # the model's input names
                      output_names=['output'],    # the model's output names
                      # dynamic_axes={'input': {0: 'batch_size'},   # variable length axes
                      #               'output': {0: 'batch_size'}}
                      )

def test_pytorch2onnx(output_onnx='resnet18.onnx'):
    x = torch.randn(1, 3, 224, 224, requires_grad=True)
    torch_model = torchvision.models.resnet18(pretrained=True)
    torch_model.eval()  # the model must be in inference mode, otherwise the test results will not match
    torch_out = torch_model(x)
    ort_session = onnxruntime.InferenceSession(output_onnx)
    # compute ONNX Runtime output prediction
    ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
    ort_outs = ort_session.run(None, ort_inputs)
    # compare ONNX Runtime and PyTorch results
    np.testing.assert_allclose(to_numpy(torch_out), ort_outs[0], rtol=1e-03, atol=1e-05)
    print("Exported model has been tested with ONNXRuntime, and the result looks good!")
    return (to_numpy(torch_out), ort_outs[0])

if __name__ == '__main__':
    pytorch2onnx(output_onnx='resnet18.onnx')
    torch_out, ort_outs = test_pytorch2onnx(output_onnx='resnet18.onnx')
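The check in test_pytorch2onnx relies on np.testing.assert_allclose, which passes when |actual - desired| <= atol + rtol * |desired| holds for every element. The following is a minimal sketch of that criterion. The helper name within_tolerance and the sample arrays are made up for illustration and are not part of the original script.

```python
import numpy as np

def within_tolerance(actual, desired, rtol=1e-3, atol=1e-5):
    """Element-wise criterion applied by np.testing.assert_allclose."""
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    return np.abs(actual - desired) <= atol + rtol * np.abs(desired)

# Hypothetical values standing in for the PyTorch and ONNX Runtime outputs.
torch_like = np.array([1.0000, -2.5000, 0.0])
ort_like = np.array([1.0005, -2.5000, 0.0])

mask = within_tolerance(torch_like, ort_like)
print(mask.all())

# The same arrays pass the real check with the tolerances used in this post.
np.testing.assert_allclose(torch_like, ort_like, rtol=1e-3, atol=1e-5)
```

Because the relative term scales with the magnitude of the reference output, large logits are allowed a larger absolute difference than values near zero, where only atol applies.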
2. Installing onnx2trt (onnx-tensorrt)
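- System: Ubuntu 18.04, TensorRT 6.0.1.5, CUDA 10.1, cuDNN 7.6.5
- I downloaded the onnx-tensorrt version that matches TensorRT 6.0.x.x
Following the official onnx-tensorrt steps, clone and build it locally: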
git clone -b 6.0 --single-branch https://github.com/onnx/onnx-tensorrt.git
cd onnx-tensorrt
git submodule init
git submodule update # fetch the submodules
mkdir build
cd build
cmake .. -DTENSORRT_ROOT=<tensorrt_install_dir>
or
cmake .. -DTENSORRT_ROOT=<tensorrt_install_dir> -DGPU_ARCHS="61"
e.g.: cmake .. -DTENSORRT_ROOT=/home/cym/programfiles/TensorRT-6.x.x.x -DGPU_ARCHS="61"
make -j8
sudo make install
After installation, run the onnx2trt command to confirm that it works.
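If you prefer to do this check from a script, the following is a minimal sketch that only looks for the executable on PATH. The helper name find_executable is made up for illustration, and finding the binary does not prove it was built against the right TensorRT version.

```python
import shutil

def find_executable(name="onnx2trt"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)

path = find_executable()
if path is None:
    print("onnx2trt not found on PATH; check the 'sudo make install' step")
else:
    print("onnx2trt found at", path)
```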
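Error 1
Many tutorials leave out the submodule step, which leads to this error:
/path/onnx-tensorrt/third_party/onnx does not contain a CMakeLists.txt file.
To fix it, run the following inside the onnx-tensorrt directory:
git submodule init
git submodule update # fetch the submodules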
3. Converting ONNX to TensorRT (usage from the official GitHub README)
Executable usage
ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable:
onnx2trt my_model.onnx -o my_engine.trt
ONNX models can also be converted to human-readable text:
onnx2trt my_model.onnx -t my_model.onnx.txt
See more usage information by running:
onnx2trt -h
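To run the conversion from Python, for example as part of the export script, one option is to build the command line and pass it to subprocess. This is a sketch that uses only the -o and -t flags quoted above. The helper name build_onnx2trt_cmd and the resnet18 file names are assumptions for illustration.

```python
import shlex
import subprocess

def build_onnx2trt_cmd(onnx_path, engine_path=None, text_path=None):
    """Assemble an onnx2trt command line as an argument list."""
    cmd = ["onnx2trt", onnx_path]
    if engine_path:
        cmd += ["-o", engine_path]   # serialized TensorRT engine
    if text_path:
        cmd += ["-t", text_path]     # human-readable text dump
    return cmd

cmd = build_onnx2trt_cmd("resnet18.onnx", engine_path="resnet18.trt")
print(" ".join(shlex.quote(part) for part in cmd))

# Uncomment to actually run the conversion (requires onnx2trt to be installed):
# subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.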
4. Forward inference with TensorRT (Python)
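- This depends on {TensorRT install directory}/samples/python/common.py, so make sure it can be imported. The simplest way is to copy common.py into your working directory.
- Load the .trt file and use it for forward inference.
- Check that the PyTorch model and the serialized TRT model produce the same output. (Make sure both get the same input and the same preprocessing.)
Steps:
- Load the engine.
- Allocate buffers for the inputs and outputs.
- Copy the data to be inferred into inputs.
- Run inference and read the outputs.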
import torch
import torchvision
import tensorrt as trt
import common
import numpy as np
import cv2

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

def pad2square_cv2(image):
    # Pad the shorter side with zeros so the image becomes square
    h, w, c = image.shape
    dim_diff = np.abs(h - w)
    pad1, pad2 = dim_diff // 2, dim_diff - dim_diff // 2
    if h <= w:
        image = cv2.copyMakeBorder(image, pad1, pad2, 0, 0, cv2.BORDER_CONSTANT, value=0)
    else:
        image = cv2.copyMakeBorder(image, 0, 0, pad1, pad2, cv2.BORDER_CONSTANT, value=0)
    return image

def get_sample(img_path='./data/pics/test.jpg'):
    img = cv2.imread(img_path)
    img = pad2square_cv2(img)
    img = img / 255
    img = cv2.resize(img, (224, 224))
    img = img.transpose((2, 0, 1))  # HWC -> CHW
    print(img.shape)
    img = np.reshape(img, (-1,)).astype(np.float32)  # flatten to match the host buffer
    return img

def load_engine(trt_path):
    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
    # Deserialize the engine
    with open(trt_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        return runtime.deserialize_cuda_engine(f.read())

trt_engine = './resnet18.trt'
engine = load_engine(trt_engine)  # load the engine
inputs, outputs, bindings, stream = common.allocate_buffers(engine)  # allocate input/output buffers
# img = get_sample(img_path='/home/cym/PycharmProjects/pytorch2tensorrt/000000.jpg')
img_origin = torch.randn(1, 3, 224, 224)
img = to_numpy(img_origin).reshape(-1)  # convert to a flat NumPy array before copying
with engine.create_execution_context() as context:
    np.copyto(inputs[0].host, img)  # copy the data to be inferred into inputs
    # Run inference and read the outputs
    res = common.do_inference(context, bindings=bindings, inputs=inputs,
                              outputs=outputs, stream=stream)

model = torchvision.models.resnet18(pretrained=True)
model.eval()
torch_out = model(img_origin)
print(torch_out[0].sum(), res[0].sum())
np.testing.assert_allclose(to_numpy(torch_out[0]), res[0], rtol=1e-03, atol=1e-05)
print("Serialized engine has been tested with TensorRT, and the result looks good!")
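Since the comparison is only meaningful when both models see exactly the same data, it can help to derive the PyTorch input and the flat TensorRT host-buffer input from one array. The following is a minimal NumPy-only sketch of that idea. The helper name prepare_inputs and the random batch are assumptions for illustration and are not part of the script above.

```python
import numpy as np

def prepare_inputs(batch_nchw):
    """Return (4-D float32 array, flat view of the same memory)."""
    nchw = np.ascontiguousarray(batch_nchw, dtype=np.float32)
    flat = nchw.reshape(-1)  # a view on a contiguous array, no copy
    return nchw, flat

# Hypothetical input batch in NCHW layout.
rng = np.random.default_rng(0)
batch = rng.standard_normal((1, 3, 224, 224))

nchw, flat = prepare_inputs(batch)
print(nchw.shape, flat.shape, flat.dtype)
```

With this, nchw can be handed to PyTorch via torch.from_numpy(nchw) and flat can be copied into the engine with np.copyto(inputs[0].host, flat), so any mismatch in the outputs comes from the models rather than from the inputs.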
To be continued...