TensorRT Tutorial 12: Deploying Inference with the Python API (Key Topic)

Latest recommended article published on 2024-02-23 09:36:02

米斯特龙_ZXL

Category: TensorRT Tutorials. Tags: python, computer vision, artificial intelligence, deep learning, neural networks

Article link: https://blog.csdn.net/weixin_41562691/article/details/119084205

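Deploying Inference with the Python API (Key Topic)

step1: Create the runtime

step2: Deserialize the serialized model to create the engine

step3: Create the execution context

step4: Get the input and output binding indices

step5: Create the buffers

step6: Allocate GPU memory for the input and output

step7: Create the CUDA stream

step8: Copy the input data from CPU to GPU

step9: Run asynchronous inference

step10: Copy the output data from GPU to CPU

step11: Synchronize the CUDA stream

step12: Release resources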
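
Steps 5 and 6 both depend on how many bytes each binding occupies: the element count is the product of the binding's dimensions (the value the tutorial code obtains from `trt.volume`), multiplied by the size of one element. The following is a minimal, GPU-free sketch of that arithmetic. The helper names `volume` and `buffer_nbytes` and the example shapes (a 1x3x224x224 float32 input and a 1x1000 float32 output) are assumptions for illustration, not values taken from the engine used in this tutorial.

```python
import math

FLOAT32_ITEMSIZE = 4  # bytes per float32 element


def volume(shape):
    """Number of elements in a tensor of the given shape (product of its dimensions)."""
    return math.prod(shape)


def buffer_nbytes(shape, itemsize=FLOAT32_ITEMSIZE):
    """Bytes needed for one host or device buffer holding a tensor of `shape`."""
    return volume(shape) * itemsize


# Hypothetical binding shapes; a real engine reports its own.
input_shape = (1, 3, 224, 224)
output_shape = (1, 1000)

input_nbytes = buffer_nbytes(input_shape)
output_nbytes = buffer_nbytes(output_shape)
print(input_nbytes, output_nbytes)  # prints: 602112 4000
```

The same byte count is what gets passed to the device allocator in step 6, which is why the host buffer is usually created first and its `nbytes` reused.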

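# Import modules
import tensorrt as trt
import pycuda.autoinit  # initializes CUDA, creates a context, and cleans it up at exit
import pycuda.driver as cuda  # memory allocation and data transfer between CPU and GPU
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import os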

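# Note: this listing uses the binding-index API of TensorRT 7/8
# (get_binding_shape, execute_async_v2), which newer releases deprecate.

# step1: create the logger
logger = trt.Logger(trt.Logger.WARNING)

# step2: create the runtime and deserialize the engine
with open("sample.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# step3: allocate page-locked host memory and GPU device memory
# Binding 0 is assumed to be the input and binding 1 the output. Shapes are
# read from the engine because the context has not been created yet.
h_input = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(0)), dtype=np.float32)
h_output = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)

# step4: create the CUDA stream
stream = cuda.Stream()

# step5: create the context and run inference
with engine.create_execution_context() as context:
    # h_input is uninitialized here; fill it with the preprocessed input first.
    # Transfer input data to the GPU.
    cuda.memcpy_htod_async(d_input, h_input, stream)
    # Run inference.
    context.execute_async_v2(bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    # Synchronize the stream so the asynchronous copies and inference have finished.
    stream.synchronize()
    # h_output now holds the predictions on the host.
    print(h_output)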
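
The copy-in, execute, copy-out, synchronize sequence is often wrapped in a function so it can be called once per input. A minimal sketch follows, assuming the TensorRT 7/8 binding API used above. The function name `infer` and the choice to pass the `pycuda.driver` module in as the `cuda` argument are assumptions of this sketch, made so that the function itself imports nothing and can be exercised without a GPU.

```python
def infer(context, cuda, stream, h_input, d_input, h_output, d_output):
    """Run one asynchronous inference pass and return the host output buffer.

    `context` is expected to be a TensorRT execution context, `cuda` the
    `pycuda.driver` module, and `stream` a `cuda.Stream()`. The h_* arguments
    are page-locked host buffers and the d_* arguments device allocations.
    """
    # Queue the host-to-device copy of the input on the stream.
    cuda.memcpy_htod_async(d_input, h_input, stream)
    # Queue inference; bindings are raw device addresses, input first.
    context.execute_async_v2(
        bindings=[int(d_input), int(d_output)],
        stream_handle=stream.handle,
    )
    # Queue the device-to-host copy of the predictions.
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    # Block until everything queued on the stream has completed.
    stream.synchronize()
    return h_output
```

With the objects from the listing, a call would look like `h_output = infer(context, cuda, stream, h_input, d_input, h_output, d_output)`, placed inside the `with engine.create_execution_context() as context:` block.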
