TensorRT: ONNX-to-engine conversion fails with "Assertion failed: dims.nbDims == 4 || dims.nbDims == 5"

修炼之路

Article link: https://blog.csdn.net/sinat_29957455/article/details/119954294

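While converting the ONNX model glintr100.onnx to an engine with TensorRT 7.2.2.3, the build failed; the messages included a warning that INT64 weights are not natively supported and an error that dimension inference failed. Switching to the nvcr.io/nvidia/tensorrt:21.03-py3 Docker image resolved the conversion. The conversion command sets the minimum, optimal, and maximum input shapes, the workspace size, and the precision.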

Error message

Converting the ONNX model to an engine with TensorRT failed with the following output:

[08/27/2021-15:27:08] [I] === Model Options ===
[08/27/2021-15:27:08] [I] Format: ONNX
[08/27/2021-15:27:08] [I] Model: glintr100.onnx
[08/27/2021-15:27:08] [I] Output:
[08/27/2021-15:27:08] [I] === Build Options ===
[08/27/2021-15:27:08] [I] Max batch: explicit
[08/27/2021-15:27:08] [I] Workspace: 2048 MiB
[08/27/2021-15:27:08] [I] minTiming: 1
[08/27/2021-15:27:08] [I] avgTiming: 8
[08/27/2021-15:27:08] [I] Precision: FP32+FP16
[08/27/2021-15:27:08] [I] Calibration: 
[08/27/2021-15:27:08] [I] Refit: Disabled
[08/27/2021-15:27:08] [I] Safe mode: Disabled
[08/27/2021-15:27:08] [I] Save engine: glintr100.onnx_dynamic.engine
[08/27/2021-15:27:08] [I] Load engine: 
[08/27/2021-15:27:08] [I] Builder Cache: Enabled
[08/27/2021-15:27:08] [I] NVTX verbosity: 0
[08/27/2021-15:27:08] [I] Tactic sources: Using default tactic sources
[08/27/2021-15:27:08] [I] Input(s)s format: fp32:CHW
[08/27/2021-15:27:08] [I] Output(s)s format: fp32:CHW
[08/27/2021-15:27:08] [I] Input build shape: input=1x3x112x112+4x3x112x112+8x3x112x112
[08/27/2021-15:27:08] [I] Input calibration shapes: model
[08/27/2021-15:27:08] [I] === System Options ===
[08/27/2021-15:27:08] [I] Device: 0
[08/27/2021-15:27:08] [I] DLACore: 
[08/27/2021-15:27:08] [I] Plugins:
[08/27/2021-15:27:08] [I] === Inference Options ===
[08/27/2021-15:27:08] [I] Batch: Explicit
[08/27/2021-15:27:08] [I] Input inference shape: input=4x3x112x112
[08/27/2021-15:27:08] [I] Iterations: 10
[08/27/2021-15:27:08] [I] Duration: 3s (+ 200ms warm up)
[08/27/2021-15:27:08] [I] Sleep time: 0ms
[08/27/2021-15:27:08] [I] Streams: 1
[08/27/2021-15:27:08] [I] ExposeDMA: Disabled
[08/27/2021-15:27:08] [I] Data transfers: Enabled
[08/27/2021-15:27:08] [I] Spin-wait: Disabled
[08/27/2021-15:27:08] [I] Multithreading: Disabled
[08/27/2021-15:27:08] [I] CUDA Graph: Disabled
[08/27/2021-15:27:08] [I] Separate profiling: Disabled
[08/27/2021-15:27:08] [I] Skip inference: Disabled
[08/27/2021-15:27:08] [I] Inputs:
[08/27/2021-15:27:08] [I] === Reporting Options ===
[08/27/2021-15:27:08] [I] Verbose: Disabled
[08/27/2021-15:27:08] [I] Averages: 10 inferences
[08/27/2021-15:27:08] [I] Percentile: 99
[08/27/2021-15:27:08] [I] Dump refittable layers:Disabled
[08/27/2021-15:27:08] [I] Dump output: Disabled
[08/27/2021-15:27:08] [I] Profile: Disabled
[08/27/2021-15:27:08] [I] Export timing to JSON file: 
[08/27/2021-15:27:08] [I] Export output to JSON file: 
[08/27/2021-15:27:08] [I] Export profile to JSON file: 
[08/27/2021-15:27:08] [I] 
[08/27/2021-15:27:08] [I] === Device Information ===
[08/27/2021-15:27:08] [I] Selected Device: GeForce RTX 3090
[08/27/2021-15:27:08] [I] Compute Capability: 8.6
[08/27/2021-15:27:08] [I] SMs: 82
[08/27/2021-15:27:08] [I] Compute Clock Rate: 1.725 GHz
[08/27/2021-15:27:08] [I] Device Global Memory: 24265 MiB
[08/27/2021-15:27:08] [I] Shared Memory per SM: 100 KiB
[08/27/2021-15:27:08] [I] Memory Bus Width: 384 bits (ECC disabled)
[08/27/2021-15:27:08] [I] Memory Clock Rate: 9.751 GHz
[08/27/2021-15:27:08] [I] 
----------------------------------------------------------------
Input filename:   glintr100.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    pytorch
Producer version: 1.7
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[08/27/2021-15:27:10] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/27/2021-15:27:10] [E] [TRT] (Unnamed Layer* 369) [Shuffle]: at most one dimension may be inferred
ERROR: onnx2trt_utils.cpp:1517 In function scaleHelper:
[8] Assertion failed: dims.nbDims == 4 || dims.nbDims == 5
[08/27/2021-15:27:10] [E] Failed to parse onnx file
[08/27/2021-15:27:10] [E] Parsing model failed
[08/27/2021-15:27:10] [E] Engine creation failed
[08/27/2021-15:27:10] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # trtexec --onnx=glintr100.onnx --minShapes=input:1x3x112x112 --optShapes=input:4x3x112x112 --maxShapes=input:8x3x112x112 --workspace=2048 --saveEngine=glintr100.onnx_dynamic.engine --fp16
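
Most of the output above is trtexec echoing its options; the lines that matter are the warning and error entries near the end. A minimal sketch for pulling those out of a saved log, assuming the output was redirected to a file. The file name `trtexec.log` and the function `trt_log_errors` are hypothetical and not part of trtexec:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only warnings, errors, and failed assertions
# from a saved trtexec log. The patterns follow the log format shown above.
trt_log_errors() {
    local log_file="$1"
    grep -E '\[(W|E)\]|^ERROR:|Assertion failed' "$log_file"
}

# Usage (assumes the output was captured with: trtexec ... 2>&1 | tee trtexec.log)
# trt_log_errors trtexec.log
```

For the log in this post, the filter would leave the INT64 warning, the Shuffle error, the scaleHelper assertion, and the four "failed" lines that follow it.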

The conversion command was:

trtexec --onnx=glintr100.onnx --minShapes=input:1x3x112x112 --optShapes=input:4x3x112x112 --maxShapes=input:8x3x112x112 --workspace=2048 --saveEngine=glintr100.onnx_dynamic.engine --fp16
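
The three shape flags define the optimization profile for the dynamic batch dimension: the smallest, the typical, and the largest input the engine must accept. A small sketch that assembles the same command from variables, so the batch range can be changed in one place. The function name `build_trtexec_cmd` and its parameters are illustrative, and only flags that already appear in the command above are used:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a trtexec command for a model with one dynamic-batch input.
# Arguments: onnx file, input tensor name, CxHxW shape, min/opt/max batch sizes.
build_trtexec_cmd() {
    local onnx="$1" input="$2" chw="$3" min_b="$4" opt_b="$5" max_b="$6"
    printf 'trtexec --onnx=%s --minShapes=%s:%sx%s --optShapes=%s:%sx%s --maxShapes=%s:%sx%s --workspace=2048 --saveEngine=%s_dynamic.engine --fp16\n' \
        "$onnx" "$input" "$min_b" "$chw" "$input" "$opt_b" "$chw" "$input" "$max_b" "$chw" "$onnx"
}

# Prints the command used in this post:
build_trtexec_cmd glintr100.onnx input 3x112x112 1 4 8
```

The function only prints the command; piping its output to `bash` (or wrapping the call in `eval`) would run it.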

Environment

TensorRT version: 7.2.2.3
OS: Ubuntu 16.04

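Solution

The failure is tied to the TensorRT version, so switching to a different build resolves it. The 7.2.2.3 build was the TensorRT package I had downloaded from the NVIDIA website. Since my TensorRT has to match my triton-server version anyway, I pulled the image for the corresponding release from NVIDIA's registry: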

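# Pull the TensorRT image
docker pull nvcr.io/nvidia/tensorrt:21.03-py3
# Start a container, bind-mounting the host directory ./model at /model
# (a bare name such as "model" would be treated as a named Docker volume)
docker run --gpus all -it --rm -v "$(pwd)/model":/model nvcr.io/nvidia/tensorrt:21.03-py3
# Inside the container, convert the ONNX model to an engine
cd /model
trtexec --onnx=glintr100.onnx --minShapes=input:1x3x112x112 --optShapes=input:4x3x112x112 --maxShapes=input:8x3x112x112 --workspace=2048 --saveEngine=glintr100.onnx_dynamic.engine --fp16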

With the TensorRT shipped in this image, the conversion succeeds.
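
A serialized engine is tied to the TensorRT version that built it, which is why the build image is chosen to match the serving side. One way to keep the two in step is to derive both image references from a single release tag. This is only a sketch: the `tritonserver` image name and the helper `ngc_image` are assumptions for illustration, since the post itself names only the TensorRT image:

```shell
#!/usr/bin/env bash
# Assumption: Triton is served from the NGC image with the same release tag
# as the TensorRT image used to build the engine.
NGC_RELEASE="21.03"

# Hypothetical helper: print the NGC image reference for a given repository name.
ngc_image() {
    printf 'nvcr.io/nvidia/%s:%s-py3\n' "$1" "$NGC_RELEASE"
}

ngc_image tensorrt       # image used to build the engine
ngc_image tritonserver   # image assumed to serve the engine
```

Changing `NGC_RELEASE` then updates both references together, for example in `docker pull "$(ngc_image tensorrt)"`.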
