Notes on converting ONNX models exported after PTQ/QAT into TensorRT INT8 models
1. An ONNX model that contains QDQ (QuantizeLinear/DequantizeLinear) nodes cannot be built as a pure FP16-precision TRT model; it can only be used to build an INT8-precision TRT model.
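For reference, the QDQ nodes typically come from fake-quantize modules inserted during PTQ/QAT. Below is a minimal sketch of the export-time setup, assuming NVIDIA's pytorch-quantization toolkit was used (the toolkit names are an assumption, not from this note; build_model() is a hypothetical placeholder):

    # Sketch only: assumes NVIDIA's pytorch-quantization toolkit is installed.
    import torch
    from pytorch_quantization import quant_modules
    from pytorch_quantization import nn as quant_nn

    # Replace nn.Conv2d / nn.Linear etc. with quantized counterparts so that
    # every layer carries a TensorQuantizer (call before constructing the model).
    quant_modules.initialize()

    model = build_model()   # hypothetical: however the PTQ/QAT model is built
    # ... run PTQ calibration or QAT fine-tuning here ...

    # Make TensorQuantizer emit torch.fake_quantize_* ops during tracing, so the
    # ONNX exporter writes QuantizeLinear/DequantizeLinear (QDQ) nodes.
    quant_nn.TensorQuantizer.use_fb_fake_quant = True
    model.eval()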
2. ONNX export error message:
export failure: Exporting the operator fake_quantize_per_channel_affine to ONNX opset version 12 is not supported. Support for this operator was added in version 13, try exporting with this version
To export the explicitly quantized (fake-quantize) layers, the export must use opset_version=13:
torch.onnx.export(
    model.cpu() if dynamic_shape else model,   # --dynamic only compatible with cpu
    im.cpu() if dynamic_shape else im,
    onnx_filename,
    verbose=False,
    opset_version=13,                          # QDQ nodes are only supported from opset 13 onwards
    training=torch.onnx.TrainingMode.EVAL,     # optional; the TRT model also builds without it
    do_constant_folding=True,                  # optional; the default is already True
    input_names=['images'],
    output_names=['output'],
    dynamic_axes={
        'images': {0: 'batch'},
        'output': {0: 'batch'}                 # shape(1,25200,85)
    } if dynamic_shape else None)
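A quick way to confirm that the QDQ nodes actually made it into the exported graph is to count the QuantizeLinear/DequantizeLinear ops; a minimal sketch using the onnx package (the file name matches the script below, adjust as needed):

    import onnx
    from collections import Counter

    m = onnx.load('yolov5n_ptq_detect_dynamic_notDoConstant.onnx')
    ops = Counter(node.op_type for node in m.graph.node)
    # A correctly exported PTQ/QAT model should report non-zero counts here.
    print('QuantizeLinear:', ops['QuantizeLinear'])
    print('DequantizeLinear:', ops['DequantizeLinear'])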
- Conversion script:
#!/bin/bash
export CUDA_VISIBLE_DEVICES=0
# Make the TensorRT, CUDA and cuDNN libraries visible to trtexec.
export LD_LIBRARY_PATH=\
/mnt/TRT/TensorRT-8.6.0.12/targets/x86_64-linux-gnu/lib:\
/usr/local/cuda/lib64:\
/mnt/TRT/public/lean_trt8.2_cudnn8.2_protobuf11.4_cuda_11.0-11.5/cudnn8.2.2.26/lib:\
$LD_LIBRARY_PATH

current_dir=$(cd "$(dirname "$0")"; pwd)
echo "${current_dir}"

dst="/mnt/TRT/TensorRT-8.6.0.12/bin"
cd "${dst}"

# --buildOnly: build the engine without running inference.
# --int8 enables INT8 for the QDQ layers; --fp16 additionally allows
# non-quantized layers to fall back to FP16 kernels.
./trtexec \
    --onnx=${current_dir}/yolov5_onnx/yolov5n_ptq_detect_dynamic_notDoConstant.onnx \
    --saveEngine=${current_dir}/yolov5_onnx/yolov5n_ptq_detect_dynamic_notDoConstant.trt \
    --buildOnly \
    --minShapes=images:1x3x672x672 \
    --optShapes=images:4x3x672x672 \
    --maxShapes=images:8x3x672x672 \
    --int8 \
    --fp16
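To sanity-check the built engine from Python, a minimal sketch that deserializes it and prints the binding shapes, assuming the TensorRT 8.x Python bindings matching the trtexec above are installed (dynamic dims show up as -1):

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.INFO)
    with open('yolov5n_ptq_detect_dynamic_notDoConstant.trt', 'rb') as f, \
         trt.Runtime(logger) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    # List input/output bindings and their (possibly dynamic) shapes.
    for i in range(engine.num_bindings):
        print(engine.get_binding_name(i), engine.get_binding_shape(i))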