TensorRT INT8 Quantization of YOLO Models in Python

AI大权

Last modified 2025-01-23 14:38:57

Category: Computer Vision. Tags: YOLO, python, model quantization, TensorRT

First published 2025-01-23 14:31:33

Article link: https://blog.csdn.net/old_power/article/details/145323069

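TensorRT is NVIDIA's high-performance deep learning inference library. It supports INT8 quantization, which can speed up model inference. The steps below show how to quantize a YOLO model (such as YOLOv5, YOLOv8, or YOLOv11) to INT8 with TensorRT.

Steps for TensorRT INT8 Quantization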

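1. Preparation

Environment requirements:
- Install CUDA and cuDNN.
- Install TensorRT (use a TensorRT version that matches your CUDA version).
- Install PyTorch and ONNX (used for model conversion).

Install the dependencies: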

pip install torch torchvision onnx onnxruntime tensorrt
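Because a mismatched TensorRT or CUDA build is a common source of trouble, it can help to record which versions actually got installed. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and it assumes every package was installed with pip under the names used in the command above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    required = ["torch", "torchvision", "onnx", "onnxruntime", "tensorrt"]
    for name, version in installed_versions(required).items():
        print(f"{name}: {version or 'not installed'}")
```

Any package reported as "not installed" should be fixed before moving on to the export step.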

Calibration dataset:
- Prepare a small calibration dataset (100 to 1000 images is usually enough) for TensorRT's INT8 calibration.
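One way to build such a calibration set is to draw a fixed-size random sample from an existing image folder. The sketch below assumes the images sit under a single directory tree. The function name `sample_calibration_images`, the default of 500 images, and the list of file extensions are illustrative choices, not part of TensorRT.

```python
import random
from pathlib import Path

# File extensions treated as images (an assumption; adjust to your dataset).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def sample_calibration_images(image_dir, num_images=500, seed=0):
    """Return a reproducible random sample of image paths under image_dir."""
    candidates = sorted(
        p for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not candidates:
        raise FileNotFoundError(f"no images found under {image_dir}")
    # A seeded generator makes the selection repeatable between runs.
    rng = random.Random(seed)
    return rng.sample(candidates, min(num_images, len(candidates)))
```

Sampling with a fixed seed keeps the calibration set stable, so accuracy differences between runs come from the quantization settings and not from a different set of images.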

2. Export the YOLO model to ONNX format

TensorRT can import a model in ONNX format and quantize it, so the first step is to export the YOLO model to ONNX.

Example code (using YOLOv5):

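import torch
from models.experimental import attempt_load  # run from the root of the YOLOv5 repository

# Load the YOLO model on the CPU.
# Note: some YOLOv5 releases name this argument `device` instead of `map_location`.
model = attempt_load('yolov5s.pt', map_location=torch.device('cpu'))

# Put the model in inference mode
model.eval()

# Define the input tensor (batch_size, channels, height, width)
dummy_input = torch.randn(1, 3, 640, 640)

# Export to ONNX format
torch.onnx.export(
    model,                      # model
    dummy_input,                # input tensor
    "yolov5s.onnx",             # output file name
    opset_version=11,           # ONNX opset version
    input_names=["images"],     # input node name
    output_names=["output"],    # output node name
    dynamic_axes={              # allow a variable batch size
        "images": {0: "batch_size"},
        "output": {0: "batch_size"},
    },
)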
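Before handing the exported file to TensorRT, it may be worth checking that it is structurally valid and produces an output of the expected shape. The sketch below is one possible check, not part of the original workflow. It assumes the file name `yolov5s.onnx` and the input name `images` from the export call, and a stock COCO-trained YOLOv5 head (three scales with strides 8, 16, and 32, three anchors per scale, 80 classes). Both helper names are hypothetical.

```python
def expected_yolov5_output_shape(batch_size, img_size=640, num_classes=80,
                                 strides=(8, 16, 32), anchors_per_scale=3):
    """Shape of the concatenated prediction tensor for a stock YOLOv5 head."""
    # Each detection scale contributes one grid cell per stride-sized patch.
    cells = sum((img_size // s) ** 2 for s in strides)
    # Each prediction holds 4 box values, 1 objectness score, and the class scores.
    return (batch_size, cells * anchors_per_scale, 5 + num_classes)


def check_onnx_export(onnx_path="yolov5s.onnx", batch_size=1, img_size=640):
    """Validate the ONNX file and run one CPU inference on random data."""
    # Imported lazily so the shape helper above works without these packages.
    import numpy as np
    import onnx
    import onnxruntime as ort

    # Structural validation of the exported graph.
    onnx.checker.check_model(onnx.load(onnx_path))

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    dummy = np.random.rand(batch_size, 3, img_size, img_size).astype(np.float32)
    outputs = session.run(None, {"images": dummy})

    # The first output is the concatenated prediction tensor.
    expected = expected_yolov5_output_shape(batch_size, img_size)
    assert outputs[0].shape == expected, (outputs[0].shape, expected)
    return outputs[0].shape
```

Calling `check_onnx_export()` with a second batch size, for example `batch_size=2`, also exercises the dynamic batch axis declared during export.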
