【tensorrt】——Batch inference comparison

Keywords: tensorrt, int8, float16, batch inference

These test results are flawed; for the corrected benchmark, see: 【tensorrt】——trtexec dynamic batch support and batch inference latency benchmark.

According to this article on int8 quantization, NVIDIA TensorRT's int8 inference gets faster at larger batch sizes. This post puts that claim to a quick real-world test.

  1. A ddrnet23 model at float16 precision, run through the TensorRT Python API (a sketch of the timing loop follows the output below). As the numbers show, batched inference brings no obvious improvement.

with batch:1, inference time:0.0089 s
with batch:2, inference time:0.0078 s
with batch:3, inference time:0.0076 s
with batch:4, inference time:0.0074 s
with batch:5, inference time:0.0075 s
with batch:6, inference time:0.0072 s
with batch:7, inference time:0.0075 s
with batch:8, inference time:0.0073 s
with batch:9, inference time:0.0077 s
with batch:10, inference time:0.0080 s
with batch:11, inference time:0.0089 s
with batch:12, inference time:0.0090 s
with batch:13, inference time:0.0089 s
with batch:14, inference time:0.0105 s
with batch:15, inference time:0.0087 s
with batch:16, inference time:0.0083 s
with batch:17, inference time:0.0079 s
with batch:18, inference time:0.0080 s
with batch:19, inference time:0.0080 s
with batch:20, inference time:0.0079 s
with batch:21, inference time:0.0079 s
with batch:22, inference time:0.0079 s
with batch:23, inference time:0.0078 s
with batch:24, inference time:0.0078 s
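
For reference, numbers like the above come from a loop along the following lines. This is a minimal sketch, not the exact benchmark code: the engine path, the 3x1024x1024 input size, and the batch range are assumptions. Note the stream.synchronize() before reading the clock; forgetting it is a common way to time only the kernel enqueue and get misleadingly flat curves.

```python
import time
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda

ENGINE_PATH = "ddrnet23_fp16.engine"  # hypothetical engine file
C, H, W = 3, 1024, 1024               # assumed input geometry

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
stream = cuda.Stream()

for batch in range(1, 25):
    # Dynamic-batch engine: pin the batch dimension for this run.
    context.set_binding_shape(0, (batch, C, H, W))
    h_in = np.random.rand(batch, C, H, W).astype(np.float32)
    h_out = np.empty(tuple(context.get_binding_shape(1)), dtype=np.float32)
    d_in = cuda.mem_alloc(h_in.nbytes)
    d_out = cuda.mem_alloc(h_out.nbytes)
    cuda.memcpy_htod_async(d_in, h_in, stream)

    # Warm-up run so lazy initialization does not pollute the measurement.
    context.execute_async_v2([int(d_in), int(d_out)], stream.handle)
    stream.synchronize()

    t0 = time.time()
    context.execute_async_v2([int(d_in), int(d_out)], stream.handle)
    stream.synchronize()  # without this, only the enqueue is timed
    print("with batch:%d, inference time:%.4f s" % (batch, time.time() - t0))
```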

  2. An hrnet_ocrw18 model at int8 precision (a sketch of the int8 engine build follows the output below).

with batch:1, inference time:0.0109 s
with batch:2, inference time:0.0088 s
with batch:3, inference time:0.0081 s
with batch:4, inference time:0.0078 s
with batch:5, inference time:0.0076 s
with batch:6, inference time:0.0074 s
with batch:7, inference time:0.0077 s
with batch:8, inference time:0.0075 s
with batch:9, inference time:0.0075 s
with batch:10, inference time:0.0083 s
with batch:11, inference time:0.0081 s
with batch:12, inference time:0.0080 s
with batch:13, inference time:0.0080 s
with batch:14, inference time:0.0082 s
with batch:15, inference time:0.0085 s
with batch:16, inference time:0.0080 s
with batch:17, inference time:0.0083 s
with batch:18, inference time:0.0082 s
with batch:19, inference time:0.0083 s
with batch:20, inference time:0.0082 s
with batch:21, inference time:0.0084 s
with batch:22, inference time:0.0089 s
with batch:23, inference time:0.0091 s
with batch:24, inference time:0.0089 s
with batch:25, inference time:0.0084 s
with batch:26, inference time:0.0079 s
with batch:27, inference time:0.0079 s
with batch:28, inference time:0.0081 s
with batch:29, inference time:0.0086 s
with batch:30, inference time:0.0086 s
with batch:31, inference time:0.0084 s
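
For completeness, an int8 engine like the one timed above can be built with the TensorRT Python API roughly as follows. This is a sketch under assumptions: the ONNX file name and the 1x3x512x512 input shape are placeholders, the ONNX is assumed to have a static input shape, and the random calibrator yields an engine whose timing is meaningful but whose accuracy is not; a real deployment needs a calibrator fed with representative preprocessed images.

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda

logger = trt.Logger(trt.Logger.WARNING)

class RandomCalibrator(trt.IInt8EntropyCalibrator2):
    """Toy calibrator feeding random batches; replace with real images.
    The shape must match the network's (static) input shape."""
    def __init__(self, shape, n_batches=8):
        super().__init__()
        self.shape, self.remaining = shape, n_batches
        self.device_buf = cuda.mem_alloc(int(np.prod(shape)) * 4)

    def get_batch_size(self):
        return self.shape[0]

    def get_batch(self, names):
        if self.remaining == 0:
            return None  # signals that calibration is done
        self.remaining -= 1
        cuda.memcpy_htod(self.device_buf,
                         np.random.rand(*self.shape).astype(np.float32))
        return [int(self.device_buf)]

    def read_calibration_cache(self):
        return None

    def write_calibration_cache(self, cache):
        pass

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("hrnet_ocrw18.onnx", "rb") as f:  # hypothetical ONNX export
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30
config.set_flag(trt.BuilderFlag.INT8)
config.int8_calibrator = RandomCalibrator((1, 3, 512, 512))

engine = builder.build_engine(network, config)
with open("hrnet_ocrw18_int8.engine", "wb") as f:
    f.write(engine.serialize())
```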

Summary:
In these measurements, neither int8 nor float16 shows any real speedup from batching.

1. From https://blog.csdn.net/zhou_438/article/details/112823818, per-image inference only starts to improve once the batch size goes above 32.
2. The same source also shows no change between batch_size 1 and 2.
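
For a cleaner measurement, the follow-up post mentioned at the top relies on trtexec, which can build a dynamic-batch engine and report latency by itself; a hedged example, where the input tensor name `input` and the shapes are assumptions:

```bash
# Build a dynamic-batch fp16 engine.
trtexec --onnx=ddrnet23.onnx --fp16 \
        --minShapes=input:1x3x1024x1024 \
        --optShapes=input:8x3x1024x1024 \
        --maxShapes=input:24x3x1024x1024 \
        --saveEngine=ddrnet23_fp16.engine

# Time an existing engine at a specific batch size.
trtexec --loadEngine=ddrnet23_fp16.engine --shapes=input:8x3x1024x1024
```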
