Pitfalls when converting torch_scatter::scatter_max to ONNX and then to TensorRT

EEPI

Last modified 2024-07-01 17:42:45


Tags: artificial intelligence, deep learning, autonomous driving

First published 2024-06-27 14:02:08

Article link: https://blog.csdn.net/eepii/article/details/140012025


The official documentation offers a lot of useful advice: https://pytorch.ac.cn/docs/stable/onnx_torchscript.html

1 Exporting torch_scatter::scatter_max to ONNX

1.1 Error message

torch.onnx.errors.UnsupportedOperatorError: ONNX export failed on an operator with unrecognized namespace torch_scatter::scatter_max. If you are trying to export a custom operator, make sure you registered it with the right domain and version.

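1.2 Locating the source of the error

The graph neural network code never calls a scatter function explicitly, so the call must come from inside the GNN library and has to be tracked down.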
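
One way to track such calls down is to trace the model and list the operator kinds that appear in the TorchScript graph. The snippet below is a minimal sketch on a toy function (forward_fn is a hypothetical stand-in, not the author's model); with a real model that uses torch_scatter, the non-aten entries would be expected to include names such as torch_scatter::scatter_max, matching the namespace in the error above.

```python
import torch


def forward_fn(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for the real model's forward pass.
    return torch.relu(x) + 1.0


# Trace with an example input and collect the kind of every graph node.
traced = torch.jit.trace(forward_fn, torch.randn(3))
kinds = sorted({node.kind() for node in traced.graph.nodes()})

# Anything outside the built-in aten:: / prim:: namespaces is a custom operator
# that the ONNX exporter may not recognize.
custom_ops = [k for k in kinds if not k.startswith(("aten::", "prim::"))]

print("all op kinds:", kinds)
print("custom ops:", custom_ops)
```

For a traced nn.Module with submodules, traced.inlined_graph may be the more useful graph to scan, since calls into submodules are expanded there.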

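1.2.1 The aggregate function calls it

In a graph neural network, information is passed through the message -> aggregate -> update flow. aggregate combines the messages arriving from other nodes, and this step uses scatter.

1.2.2 The max_pool_x function calls it

After the graph neural network layers, the features pass through a max pooling layer, and max_pool_x also uses scatter internally.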
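
For reference, the per-group maximum that these calls compute can also be written with PyTorch's built-in Tensor.scatter_reduce. The following is only a sketch under assumptions: scatter_max_native is a hypothetical helper name, it returns the maxima only (torch_scatter's scatter_max also returns the argmax indices), and whether the ONNX exporter and TensorRT accept scatter_reduce depends on the PyTorch version, opset, and TensorRT version, so it should be verified on the target setup.

```python
import torch


def scatter_max_native(src: torch.Tensor, index: torch.Tensor, dim_size: int) -> torch.Tensor:
    """Per-group max along dim 0 using only built-in PyTorch ops (hypothetical helper)."""
    # Expand the 1-D group index so that it has the same shape as src.
    expanded = index.view(-1, *([1] * (src.dim() - 1))).expand_as(src)
    out = src.new_zeros((dim_size, *src.shape[1:]))
    # include_self=False: the initial zeros do not take part in the max,
    # so groups with only negative values keep their true maximum.
    return out.scatter_reduce(0, expanded, src, reduce="amax", include_self=False)


src = torch.tensor([[1.0, 5.0], [3.0, 2.0], [-4.0, -1.0]])
index = torch.tensor([0, 0, 1])  # rows 0 and 1 belong to group 0, row 2 to group 1
result = scatter_max_native(src, index, dim_size=2)
print(result)
```

Groups that receive no rows keep the initial value of zero in this sketch.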

2 Converting ONNX to TensorRT

2.1 No importer registered for op: NonZero

The ONNX model contains a NonZero operator. The operators in a model can be listed as follows:

import onnx
model_path = "xx.onnx"
model = onnx.load(model_path)
# method 1
for node in model.graph.node:
    print(f"Node Name: {node.name}")
    print(f"Op Type: {node.op_type}") # check NonZero op, etc
    print(f"Input(s): {node.input}")
    print(f"Output(s): {node.output}")
    print(f"Attributes: {node.attribute}")
    print("\n")
# method 2
print(onnx.helper.printable_graph(model.graph))

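This operator returns the indices of the non-zero values, so the length of its output varies with the data. TensorRT needs tensor sizes to be fixed ahead of time, so the operator cannot be used.
Note: the latest TensorRT versions do support it.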

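2.2 What introduces NonZero

There are three possible causes:

Tensor masks: idx = tensor_a > 0, followed by indexing with idx.
torch.where(condition). Note: the three-argument form torch.where(condition, a, b) is fine.
torch.nonzero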
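
The difference between these patterns and the three-argument torch.where can be seen from the output shapes. In this small sketch (the tensors a and b are made-up examples), the first three patterns produce outputs whose size depends on the values, while torch.where(condition, a, b) always has the shape of its inputs.

```python
import torch

a = torch.tensor([1.0, -2.0, 3.0, 0.0])
b = torch.tensor([0.0, -2.0, 0.0, 0.0])

# Data-dependent shapes: the size follows the number of matching elements.
masked_a = a[a > 0]              # boolean-mask indexing
masked_b = b[b > 0]
nz_a = torch.nonzero(a)          # one row per non-zero element
where_a = torch.where(a > 0)[0]  # one-argument where returns index tensors

# Fixed shape: the three-argument form keeps the shape of its inputs.
kept_a = torch.where(a > 0, a, torch.zeros_like(a))

print(masked_a.shape, masked_b.shape, nz_a.shape, where_a.shape, kept_a.shape)
```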

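2.3 How to check why TensorRT does not support the NonZero operator

The operators supported by each TensorRT version are listed on GitHub, in docs/operators.md:

https://github.com/onnx/onnx-tensorrt

It clearly shows that version 10.1 supports the NonZero operator, whereas the 8.4 branch does not.
Also, in TensorRT version names, EA stands for early access and GA for general availability. The GA releases are generally recommended.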

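2.4 Fixing NonZero

Choosing an approach

Replace the operator with a different computation.
Write a custom operator, i.e. implement NonZero in TensorRT. This process is very involved; see: https://blog.csdn.net/weixin_45878768/article/details/128149343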
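
As an illustration of the first option, a reduction over masked elements can often be rewritten so that every intermediate tensor keeps a fixed shape. The sketch below uses a masked maximum as a made-up example (both function names are hypothetical); it is not a general replacement for NonZero, only a pattern that applies when the selected elements are immediately reduced.

```python
import torch


def masked_max_dynamic(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Boolean-mask indexing: the intermediate x[mask] has a data-dependent size.
    return x[mask].max()


def masked_max_static(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    # Same result with fixed shapes: masked-out entries become -inf,
    # so they can never win the max.
    fill = torch.full_like(x, float("-inf"))
    return torch.where(mask, x, fill).max()


x = torch.tensor([0.5, -1.0, 2.0, 4.0])
mask = torch.tensor([True, True, True, False])

dynamic_result = masked_max_dynamic(x, mask)
static_result = masked_max_static(x, mask)
print(dynamic_result, static_result)
```

The value operands passed to torch.where here are floating point; only the condition is boolean.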

This version of TensorRT does not support BOOL types for Where operators.

3 BaseData cannot be exported with torch.jit.script

torch.jit.frontend.NotSupportedError: Compiled functions can't take variable number of arguments or use keyword-only arguments with defaults:
File "torch_geometric/data/data.py", line 100
    def __cat_dim__(self, key: str, value: Any, *args, **kwargs) -> Any:
                                                        ~~~~~~~ <--- HERE

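When the BaseData class is used, the data objects in torch_geometric do not meet the requirements for export, so the export fails.

MessagePassing: isinstance is not supported

Both sparse in torch_sparse and scatter in torch_scatter use the isinstance function, which ONNX export does not support: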

torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'prim::isinstance' to ONNX opset version 16 is not supported.
