[Solution] Converting onnx to TensorRT (.plan, .trt) fails with Error[10]: Could not find any implementation, or with OutOfMemory

Latest recommended article published on 2024-05-31 14:47:12

多恩Stone

Article tags: algorithm operations, python

Article link: https://blog.csdn.net/weixin_44212848/article/details/137286847

While converting an .onnx model with dynamic input and output shapes to a .plan engine using the bash command below, the build failed with Error[10]: Could not find any implementation for node {ForeignNode[...]}.

trtexec --onnx=swinir_real_sr_large_model_dynamic_sim_folded.onnx --saveEngine=model-folded.plan --timingCacheFile=model-folded.cache --minShapes=input:1x3x36x36 --optShapes=input:2x3x512x512 --maxShapes=input:3x3x1024x1024 --verbose

The full error output is:

[04/02/2024-06:02:45] [E] Error[10]: Could not find any implementation for node {ForeignNode[(Unnamed Layer* 89) [Constant] + (Unnamed Layer* 90) [Shuffle].../layers.0/patch_unembed/Transpose + /layers.0/patch_unembed/Reshape]}.
[04/02/2024-06:02:45] [E] Error[10]: [optimizer.cpp::computeCosts::3869] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[(Unnamed Layer* 89) [Constant] + (Unnamed Layer* 90) [Shuffle].../layers.0/patch_unembed/Transpose + /layers.0/patch_unembed/Reshape]}.)
[04/02/2024-06:02:45] [E] Engine could not be created from network
[04/02/2024-06:02:45] [V] [TRT] Deleting timing cache: 5 entries, served 0 hits since creation.
[04/02/2024-06:02:45] [E] Building engine failed
[04/02/2024-06:02:45] [E] Failed to create engine from model or file.
[04/02/2024-06:02:45] [E] Engine set up failed

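After trying several approaches, I found that the real problem was running out of GPU memory. The error message, however, claims that a node is unsupported (Could not find any implementation for node), which is extremely misleading.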
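To get a rough feel for how much the shape profile matters, the sketch below computes the size of the input tensor alone for the min, opt, and max shapes from the trtexec command above. The helper name shape_bytes is made up for this illustration. The numbers are only a lower bound: the intermediate activations and the scratch space TensorRT needs while timing tactics are not counted, and for a large model they are likely to be far bigger than the input itself.

```python
from math import prod


def shape_bytes(shape: str, bytes_per_element: int = 4) -> int:
    """Size in bytes of one tensor given a shape string such as '3x3x1024x1024'.

    bytes_per_element is 4 for fp32 and 2 for fp16.
    """
    dims = [int(d) for d in shape.split("x")]
    return prod(dims) * bytes_per_element


# Shapes copied from the --minShapes / --optShapes / --maxShapes arguments.
profiles = {"min": "1x3x36x36", "opt": "2x3x512x512", "max": "3x3x1024x1024"}

for name, shape in profiles.items():
    fp32_mib = shape_bytes(shape, 4) / 2**20
    fp16_mib = shape_bytes(shape, 2) / 2**20
    print(f"{name}: {shape} -> {fp32_mib:.2f} MiB (fp32), {fp16_mib:.2f} MiB (fp16)")
```

At the max shape the input holds six times as many elements as at the opt shape (36 MiB versus 6 MiB in fp32), so every activation that scales with the input grows by at least that factor.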

Attempt 1: polygraphy surgeon

Use polygraphy surgeon to fold some of the constants in input_model.onnx, producing folded_model.onnx:

polygraphy surgeon sanitize --fold-constants input_model.onnx  -o folded_model.onnx

Source: https://github.com/NVIDIA/TensorRT/issues/3357

Attempt 2: onnx simplifier

Use onnx-simplifier to simplify the onnx model, turning input_onnx_model.onnx into output_onnx_model.onnx:

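# Install it first
pip3 install -U pip && pip3 install onnxsim
# Or install it under its other package name
pip install onnx-simplifier
# Simplify input_onnx_model.onnx into output_onnx_model.onnx
onnxsim input_onnx_model.onnx output_onnx_model.onnx

On success it prints a summary like the one below. For a large model this can take a while.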

Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃                    ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add                │ 1955           │ 547              │
│ Cast               │ 2760           │ 0                │
│ Concat             │ 1913           │ 404              │
│ Constant           │ 25917          │ 834              │
│ ConstantOfShape    │ 2485           │ 18               │
│ Conv               │ 36             │ 36               │
│ Div                │ 381            │ 231              │
│ Equal              │ 2052           │ 19               │
│ Erf                │ 54             │ 54               │
│ Expand             │ 2430           │ 85               │
│ Gather             │ 2877           │ 601              │
│ LayerNormalization │ 110            │ 110              │
│ LeakyRelu          │ 24             │ 24               │
│ MatMul             │ 324            │ 324              │
│ Mod                │ 4              │ 4                │
│ Mul                │ 2110           │ 165              │
│ Not                │ 54             │ 2                │
│ Pad                │ 1              │ 1                │
│ Range              │ 1944           │ 64               │
│ Reshape            │ 2776           │ 605              │
│ Resize             │ 2              │ 2                │
│ ScatterND          │ 486            │ 17               │
│ Shape              │ 6613           │ 236              │
│ Slice              │ 2335           │ 271              │
│ Softmax            │ 54             │ 54               │
│ Sub                │ 58             │ 5                │
│ Transpose          │ 345            │ 293              │
│ Unsqueeze          │ 4194           │ 472              │
│ Where              │ 2052           │ 21               │
│ Model Size         │ 125.4MiB       │ 114.7MiB         │
└────────────────────┴────────────────┴──────────────────┘

Official documentation: https://pypi.org/project/onnx-simplifier/

Final solution

Reduce the model's GPU memory footprint!

Reduce the maximum input size and batch size: there is no shortcut here; you have to experiment until the build fits in your GPU memory.
Lower the model precision: fp16 or int8 is usually enough. The relevant trtexec flags are:
--noTF32 Disable tf32 precision (default is to enable tf32, in addition to fp32)
--fp16 Enable fp16 precision, in addition to fp32 (default = disabled)
--int8 Enable int8 precision, in addition to fp32 (default = disabled)
--fp8 Enable fp8 precision, in addition to fp32 (default = disabled)
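As a minimal sketch of both ideas combined, the Python snippet below assembles a trtexec command with a smaller shape profile and --fp16 enabled. The helper build_trtexec_cmd and the reduced opt and max shapes (1x3x256x256 and 1x3x512x512) are illustrative assumptions, not values from the original run; the shapes that actually fit depend on your GPU and have to be found by trial.

```python
import shlex


def build_trtexec_cmd(onnx_path, engine_path, min_shape, opt_shape, max_shape,
                      input_name="input", fp16=True):
    """Return the trtexec argument list for a dynamic-shape engine build."""
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        f"--minShapes={input_name}:{min_shape}",
        f"--optShapes={input_name}:{opt_shape}",
        f"--maxShapes={input_name}:{max_shape}",
    ]
    if fp16:
        # Allow fp16 kernels in addition to fp32 to lower memory use.
        cmd.append("--fp16")
    return cmd


cmd = build_trtexec_cmd(
    "swinir_real_sr_large_model_dynamic_sim_folded.onnx",
    "model-folded.plan",
    min_shape="1x3x36x36",
    opt_shape="1x3x256x256",  # illustrative value, smaller than the original 2x3x512x512
    max_shape="1x3x512x512",  # illustrative value, smaller than the original 3x3x1024x1024
)

# Print the command so it can be inspected or pasted into a shell.
print(shlex.join(cmd))
```

To run the build from Python instead of pasting the printed line, pass the list to subprocess.run(cmd, check=True). If the build still fails, shrink the max shape further before concluding that a node is truly unsupported.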

Some decent blog posts on model precision:

https://zhuanlan.zhihu.com/p/673708074
https://blog.csdn.net/szxcv9876/article/details/104589125
