Source code: https://github.com/shouxieai/tensorRT_Pro
Reference blog: YOLOv7-PTQ quantized deployment
Required environment: TensorRT, CUDA, cuDNN, OpenCV, and Protobuf.
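Installing and configuring protobuf
The CMakeLists.txt of tensorRT_Pro notes that "protobuf requires a specific version". protobuf 3.11.x can be found and downloaded from the corresponding branch on GitHub: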
https://github.com/protocolbuffers/protobuf/blob/3.11.x/src/README.md
Install it following the steps in the README:
sudo apt-get install autoconf automake libtool curl make g++ unzip
cd protobuf-3.11.x
./autogen.sh
./configure
make
make check
sudo make install
sudo ldconfig # refresh shared library cache.
Once installation finishes, run protoc --version to check the installed version.
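Since the build expects the 3.11.x series, the version check can also be scripted. The following is a minimal sketch, assuming protoc is on PATH and prints a line of the form "libprotoc 3.11.4"; the helper names are hypothetical and not part of tensorRT_Pro.

```python
import re
import subprocess


def parse_protoc_version(output):
    """Extract (major, minor, patch) from text such as 'libprotoc 3.11.4'."""
    match = re.search(r"libprotoc\s+(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"unrecognized protoc output: {output!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_expected_protoc(output, expected=(3, 11)):
    """Return True when major.minor matches the expected release series."""
    return parse_protoc_version(output)[:2] == tuple(expected)


def installed_protoc_output():
    """Run `protoc --version` and return its stdout (requires protoc on PATH)."""
    result = subprocess.run(
        ["protoc", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling is_expected_protoc(installed_protoc_output()) after sudo ldconfig gives a quick yes/no answer before moving on to the tensorRT_Pro build.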
Installing and configuring tensorRT_Pro
In CMakeLists.txt, update the paths of the TensorRT, CUDA, cuDNN, OpenCV, and Protobuf libraries and set HAS_PYTHON to OFF, then build with cmake and make.
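With the model and calibration images ready, the YOLO inference code lives mainly in src/application/app_yolo.cpp. The main changes are:
- app_yolo.cpp line 177: change TRT::Mode to INT8 and change "yolov7" to "best"
- app_yolo.cpp line 25: add a voclabels array containing the class names of the VOC dataset
- app_yolo.cpp line 100: change cocolabels to voclabels
- app_yolo.cpp line 149: change "inference" to "calib_data" to specify the path of the calibration images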
After making these changes, run make yolo to generate the engine model.
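The calib_data directory referenced above has to be filled with calibration images before running make yolo. A minimal sketch of one way to do this is shown below, assuming the images are sampled at random from an existing dataset folder; the function name, the default count of 100, and the accepted file suffixes are illustrative assumptions, not values taken from tensorRT_Pro.

```python
import random
import shutil
from pathlib import Path

# Suffixes treated as images (an assumption; adjust to your dataset).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_calib_images(source_dir, calib_dir, count=100, seed=0):
    """Copy up to `count` randomly chosen images from source_dir into calib_dir."""
    source = Path(source_dir)
    target = Path(calib_dir)
    # Materialize the list first so files copied later are never re-scanned.
    images = sorted(
        p for p in source.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # A fixed seed keeps the selection reproducible between runs.
    random.Random(seed).shuffle(images)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, image in enumerate(images[:count]):
        destination = target / f"{index:05d}{image.suffix.lower()}"
        shutil.copy2(image, destination)
        copied.append(destination)
    return copied
```

For example, collect_calib_images("VOC/JPEGImages", "calib_data") would populate the directory named in app_yolo.cpp; the source path here is a placeholder.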
You may hit the error Assertion failed: scales.is_weights() && "Resize scales must be an initializer!", with output like the following:
$ make yolo
[ 18%] Built target plugin_list
[100%] Built target pro
[2024-02-21 08:18:56][info][app_yolo.cpp:134]:===================== test YoloV7 FP32 best0125 ==================================
[2024-02-21 08:18:56][info][trt_builder.cpp:474]:Compile FP32 Onnx Model 'best0125.onnx'.
[2024-02-21 08:19:03][warn][trt_builder.cpp:33]:NVInfer: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
[2024-02-21 08:19:04][error][trt_builder.cpp:30]:NVInfer: /app/tensorRT_Pro-main/src/tensorRT/onnx_parser/ModelImporter.cpp:736: While parsing node number 197 [Resize -> "onnx::Concat_395"]:
[2024-02-21 08:19:04][error][trt_builder.cpp:30]:NVInfer: /app/tensorRT_Pro-main/src/tensorRT/onnx_parser/ModelImporter.cpp:737: --- Begin node ---
[2024-02-21 08:19:04][error][trt_builder.cpp:30]:NVInfer: /app/tensorRT_Pro-main/src/tensorRT/onnx_parser/ModelImporter.cpp:738: input: "input.204"
input: "onnx::Resize_394"
input: "onnx::Resize_601"
output: "onnx::Concat_395"
name: "Resize_197"
op_type: "Resize"
attribute {
name: "coordinate_transformation_mode"
s: "asymmetric"
type: STRING
}
attribute {
name: "cubic_coeff_a"
f: -0.75
type: FLOAT
}
attribute {
name: "mode"
s: "nearest"
type: STRING
}
attribute {
name: "nearest_mode"
s: "floor"
type: STRING
}
[2024-02-21 08:19:04][error][trt_builder.cpp:30]:NVInfer: /app/tensorRT_Pro-main/src/tensorRT/onnx_parser/ModelImporter.cpp:739: --- End node ---
[2024-02-21 08:19:04][error][trt_builder.cpp:30]:NVInfer: /app/tensorRT_Pro-main/src/tensorRT/onnx_parser/ModelImporter.cpp:741: ERROR: /app/tensorRT_Pro-main/src/tensorRT/onnx_parser/builtin_op_importers.cpp:3500 In function importResize:
[8] Assertion failed: scales.is_weights() && "Resize scales must be an initializer!"
[2024-02-21 08:19:04][error][trt_builder.cpp:519]:Can not parse OnnX file: best0125.onnx
[2024-02-21 08:19:04][error][yolo.cpp:197]:Engine best0125.FP32.trtmodel load failed
[2024-02-21 08:19:04][error][app_yolo.cpp:54]:Engine is nullptr
[100%] Built target yolo
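This happens because the PyTorch version is too new, so the scales input of the exported Resize node is not an initializer. You can either run the following script to patch the ONNX model, or downgrade PyTorch to a version such as 1.8.2, 1.9, or 1.10.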
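import onnx

# Replace with the path of your exported model.
model = onnx.load("yolov5s.onnx")

def find_node_for_output(nodes, name):
    # Return the index and node that produces the tensor called `name`.
    for i, n in enumerate(nodes):
        if name in n.output:
            return i, n
    return None, None

nodes = model.graph.node
remove_nodes = []
for node in nodes:
    # input[2] of Resize is the scales tensor; skip nodes that lack it.
    if node.op_type == "Resize" and len(node.input) > 2:
        # If scales is produced by another node (assumed here to be an
        # Identity wrapping an initializer), bypass that node.
        idx, identity = find_node_for_output(nodes, node.input[2])
        if identity is not None:
            remove_nodes.append(idx)
            node.input[2] = identity.input[0]

# Delete from the highest index down so the remaining indices stay valid.
remove_nodes = sorted(remove_nodes, reverse=True)
for i in remove_nodes:
    del nodes[i]

onnx.save(model, "output.onnx")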
In my case, the export succeeded after downgrading torch to 1.8.2. The fix comes from https://github.com/shouxieai/tensorRT_Pro/issues/134
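To see whether an exported model is likely to trigger this assertion before building the engine, the Resize nodes can be inspected directly. The sketch below assumes the opset 11+ input layout, where the third input of Resize is scales, and it is deliberately conservative: it flags every scales input that is not listed among the graph initializers, without verifying how the TensorRT parser treats each kind of producer node. The function names are hypothetical.

```python
def find_non_initializer_resize_scales(graph):
    """Return (node name, scales input name) for each Resize node whose
    scales input is not stored as a graph initializer."""
    initializer_names = {init.name for init in graph.initializer}
    flagged = []
    for node in graph.node:
        if node.op_type != "Resize" or len(node.input) < 3:
            continue
        scales_name = node.input[2]
        # An empty name means the optional scales input is not used.
        if scales_name and scales_name not in initializer_names:
            flagged.append((node.name, scales_name))
    return flagged


def check_model(path):
    """Load an ONNX file and report its candidate Resize nodes."""
    import onnx  # imported lazily so the helper above has no hard dependency
    return find_non_initializer_resize_scales(onnx.load(path).graph)
```

For the model in the log above, check_model("best0125.onnx") would be expected to list Resize_197 with the input onnx::Resize_601; an empty list after patching or re-exporting suggests that the scales inputs are now initializers.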