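Here is the background. While accelerating the PointPillars model, I noticed that the detection head consists of many small operations, so the speedup there was not significant. In addition, NMS in 3D detection models is usually implemented separately as part of post-processing, and TensorRT does not directly support exporting 3D NMS. As a learning exercise, I merged the PointPillars detection head (single head) and the 3D NMS into one TensorRT plugin to get end-to-end inference. The final result is shown in the right-hand figure below: the custom NMS3D plugin contains the entire post-processing stage.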
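For readers who have not looked inside an NMS step before, the following is a minimal pure-Python sketch of greedy NMS. It is a simplification under stated assumptions, not the plugin's implementation: boxes are treated as axis-aligned bird's-eye-view rectangles `(x_center, y_center, dx, dy)`, whereas real 3D NMS usually computes IoU on rotated boxes. The names `bev_iou` and `greedy_nms`, the IoU threshold, and the `max_out` default are all hypothetical choices made for illustration.

```python
def bev_iou(a, b):
    """IoU of two axis-aligned BEV boxes given as (x_center, y_center, dx, dy).

    Assumption: rotation and height are ignored to keep the sketch short.
    """
    ax1, ax2 = a[0] - a[2] / 2, a[0] + a[2] / 2
    ay1, ay2 = a[1] - a[3] / 2, a[1] + a[3] / 2
    bx1, bx2 = b[0] - b[2] / 2, b[0] + b[2] / 2
    by1, by2 = b[1] - b[3] / 2, b[1] + b[3] / 2
    # Overlap extent along each axis, clamped at zero when the boxes are disjoint
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thresh=0.5, max_out=100):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(bev_iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
            if len(keep) == max_out:
                break
    return keep


boxes = [(0.0, 0.0, 2.0, 2.0), (0.1, 0.0, 2.0, 2.0), (10.0, 10.0, 2.0, 2.0)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box overlaps the first and is dropped
```

Logic of this kind is data-dependent and awkward to express as standard ONNX operators, which is one reason to fold it into a custom plugin.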
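How do we add an NMS3D node after the ONNX outputs?
This step means modifying the ONNX model, which can be done with ONNX GraphSurgeon, a small tool that ships with TensorRT. It can add or remove ONNX nodes, rename them, change dimensions, and so on. Installing ONNX GraphSurgeon is simple: install nvidia-pyindex first, then install onnx-graphsurgeon.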
pip install nvidia-pyindex
pip install onnx-graphsurgeon
Next comes the graph modification itself. I give two implementations here, for reference only.
Method 1:
import collections

import numpy as np
import onnx
import onnx_graphsurgeon as gs

# NOTE: anchor_sizes and anchor_bottom_heights are not defined in this snippet;
# they must be set beforehand from the model's anchor configuration.

# Here we'll register a function to do all the subgraph-replacement heavy-lifting.
# NOTE: Since registered functions are entirely reusable, it may be a good idea to
# refactor them into a separate module so you can use them across all your models.
@gs.Graph.register()
def add_nms3d(self, inputs, outputs):
    # Disconnect output nodes of all input tensors
    for inp in inputs:
        inp.outputs.clear()
    # Disconnect input nodes of all output tensors
    for out in outputs:
        out.inputs.clear()
    # Plugin attributes passed through to the NMS3D node
    attrs = collections.OrderedDict()
    attrs['anchor_sizes'] = anchor_sizes
    attrs['anchor_bottom_heights'] = anchor_bottom_heights
    # Insert the new node.
    return self.layer(op="NMS3D", inputs=inputs, outputs=outputs, name="nms3d", attrs=attrs)

def simplify_onnx():
    model = onnx.load("pointpillar_raw.onnx")
    graph = gs.import_onnx(model)
    tmap = graph.tensors()
    # The three detection-head tensors become the inputs of the NMS3D node
    inputs = [tmap['cls_preds'], tmap['box_preds'], tmap['dir_cls_preds']]
    outputs = [gs.Variable(name="nms3d_output", dtype=np.float32, shape=(1, 100, 9))]
    graph.add_nms3d(inputs, outputs)
    # The plugin output replaces the original graph outputs
    graph.outputs = outputs
    graph.cleanup()
    graph.toposort()
    # Export the modified graph back to an ONNX model before saving
    model_simplify = gs.export_onnx(graph)
    onnx.save(model_simplify, "pointpillar_simplify.onnx")
    print("export ok...")
Method 2:
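import collections

import onnx
import onnx_graphsurgeon as gs

# NOTE: anchor_sizes and anchor_bottom_heights are not defined in this snippet;
# they must be set beforehand from the model's anchor configuration.

def simplify_onnx():
    #model = onnx.load("pointpillar_raw.onnx")
    model = onnx.load("pointpillar_fcn_max_nchw_cudapp.onnx")
    # Remove all existing graph outputs
    while len(model.graph.output):
        model.graph.output.remove(model.graph.output[0])
    # Declare the plugin output as the only graph output
    model.graph.output.extend([
        onnx.helper.make_tensor_value_info('nms3d_output', onnx.TensorProto.FLOAT, [1, 100, 9]),
    ])
    # Plugin attributes passed through to the NMS3D node
    attrs = collections.OrderedDict()
    attrs['anchor_sizes'] = anchor_sizes
    attrs['anchor_bottom_heights'] = anchor_bottom_heights
    graph = gs.import_onnx(model)
    tmap = graph.tensors()
    inputs = [tmap['cls_preds'], tmap['box_preds'], tmap['dir_cls_preds']]
    outputs = [tmap['nms3d_output']]
    # Insert the NMS3D node between the head tensors and the new output
    nms3d_layer = graph.layer(op="NMS3D", inputs=inputs, outputs=outputs, name="nms3d", attrs=attrs)
    graph.cleanup()
    graph.toposort()
    onnx_module = gs.export_onnx(graph)
    onnx.save(onnx_module, "pointpillar_simplify.onnx")
    print("export ok...")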