Accelerating YOLOv3 with TensorRT by converting to ONNX

Versions

Python 3.6
TensorRT 5.0.2.6

Progress

First of all, here is a great introduction to TensorRT and how it works.

Float32

The official tutorial (sample) on how to accelerate YOLOv3 can be found in the TensorRT-5.0.2.6/samples/python/yolov3_onnx directory. It is easy to use; however, a few issues may need to be solved first.

  1. yolov3_to_onnx.py only works with Python 2
    A Python 2 environment can of course be set up for this, but if Python 3 is preferred, try adding this line after line 51:
remainder = remainder.decode("utf-8")
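The error presumably arises because the sample reads the cfg data as bytes, which the Python-2-era string parsing chokes on under Python 3. A minimal illustration (the b"..." content below is a made-up stand-in for the real cfg data):

```python
# Under Python 3, data read from a binary source is bytes, not str.
remainder = b"[net]\nbatch=64\n"  # hypothetical stand-in for the cfg chunk

# The Python-2-era code splits with a str separator, which fails on bytes:
try:
    remainder.split("\n")
    mixed_ok = True
except TypeError:
    mixed_ok = False  # bytes.split("\n") raises TypeError under Python 3

# The one-line fix from above: decode to str before parsing.
remainder = remainder.decode("utf-8")
lines = remainder.split("\n")
```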
Int8

In order to use Int8 inference, calibration is needed.
A general example on writing a calibrator can be found here.
In the function get_engine, set builder.int8_mode = True, initialize the calibrator as shown in the example above (call it int8_calibrator), then set builder.int8_calibrator = int8_calibrator.
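The calibrator's main job is to feed batches of pre-processed images until the calibration set is exhausted. The device-memory copy is TensorRT/PyCUDA-specific, but the host-side batching logic can be sketched in plain NumPy (the class and parameter names below are illustrative, not from the sample):

```python
import numpy as np

class CalibrationBatchStream:
    """Host-side batch feeder for an Int8 calibrator (illustrative sketch).

    images: iterable of pre-processed CHW float32 arrays, e.g. 3x608x608.
    A real trt.IInt8EntropyCalibrator subclass would copy each batch to
    device memory and return the device pointer from its get_batch().
    """

    def __init__(self, images, batch_size=8):
        self.images = list(images)
        self.batch_size = batch_size
        self.index = 0

    def next_batch(self):
        # Return None when the calibration set is exhausted, mirroring
        # the calibrator protocol's end-of-data signal.
        if self.index >= len(self.images):
            return None
        batch = self.images[self.index : self.index + self.batch_size]
        self.index += len(batch)
        # Contiguous float32 so the buffer can be copied to the GPU as-is.
        return np.ascontiguousarray(np.stack(batch), dtype=np.float32)

stream = CalibrationBatchStream(
    [np.zeros((3, 608, 608), dtype=np.float32) for _ in range(10)],
    batch_size=4,
)
```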

Issues:

  1. trt.infer.EntropyCalibrator doesn't exist
    The API changed in TensorRT 5. Replacing it with trt.IInt8EntropyCalibrator
    solves the issue.
  2. int(ptr): ptr can't be a PyCapsule
    ptr needs to be converted from a PyCapsule to an int. Refer to this link:
import ctypes

def convert_capsule_to_int(capsule):
    # Extract the raw pointer wrapped by the PyCapsule and return it as an int.
    ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
    ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
    return ctypes.pythonapi.PyCapsule_GetPointer(capsule, None)
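To check that this conversion round-trips, a capsule can be created from Python with the same ctypes machinery; this little test harness is mine, not part of the sample:

```python
import ctypes

# PyCapsule_New wraps a raw pointer value in a PyCapsule, letting us
# exercise the PyCapsule -> int conversion without TensorRT at all.
ctypes.pythonapi.PyCapsule_New.restype = ctypes.py_object
ctypes.pythonapi.PyCapsule_New.argtypes = [
    ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
capsule = ctypes.pythonapi.PyCapsule_New(0xDEAD, None, None)

# The same extraction as convert_capsule_to_int above.
ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [
    ctypes.py_object, ctypes.c_char_p]
ptr = ctypes.pythonapi.PyCapsule_GetPointer(capsule, None)
```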

Performance

On a GTX 1070 Ti with input size 608x608:
with a PyTorch implementation, the inference time per image is about 35 ms;
with TensorRT float32 inference, about 28 ms;
with TensorRT int8 inference, about 15 ms.
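Relative to the PyTorch baseline, these timings work out to roughly a 1.25x speedup for float32 and 2.33x for int8 (the dictionary keys below are mine, the latencies are the measurements above):

```python
# Per-image latencies (ms) measured above on a GTX 1070 Ti at 608x608.
latency_ms = {"pytorch": 35.0, "trt_fp32": 28.0, "trt_int8": 15.0}

# Throughput in frames per second, and speedup over the PyTorch baseline.
fps = {k: 1000.0 / v for k, v in latency_ms.items()}
speedup = {k: latency_ms["pytorch"] / v for k, v in latency_ms.items()}
```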

More Issues

With Int8, 100 images are used for calibration. However, the detection results degrade noticeably; that is, the detection accuracy drops a lot.
Further experiments are needed to see how to improve speed while keeping accuracy.

=====================================================================

Update

Calibration issue solved. The problem was that the data pre-processing must be kept the same for both calibration and detection.
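One way to guarantee this is to share a single pre-processing function between the calibrator and the detection pipeline, so the two can never diverge. The function below is an illustrative sketch, not the sample's actual pre-processing (the resize here is a pure-NumPy nearest-neighbour stand-in):

```python
import numpy as np

def preprocess(image, input_hw=(608, 608)):
    """Shared YOLOv3 pre-processing sketch: resize, scale to [0, 1],
    HWC -> NCHW float32. Using this ONE function for both calibration
    and detection keeps the int8 calibration statistics consistent
    with the inputs seen at inference time.
    """
    h, w = input_hw
    # Nearest-neighbour resize via integer index maps (illustrative only).
    ys = (np.arange(h) * image.shape[0] // h).astype(np.intp)
    xs = (np.arange(w) * image.shape[1] // w).astype(np.intp)
    resized = image[ys][:, xs]
    # Scale to [0, 1] and reorder HWC -> CHW.
    chw = resized.transpose(2, 0, 1).astype(np.float32) / 255.0
    return chw[np.newaxis]  # add the batch dimension -> 1x3xHxW
```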

More TODO

Test with different batch sizes;
Evaluate accuracy on datasets
