Once inference works in a single process, wrapping it directly in multiple processes produces the following error:
Tensorrt ERROR: CUDA initialization failure with error 3
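Steps to fix it:
- Initialize the CUDA driver explicitly, and do so in every process: use import pycuda.driver as cuda followed by cuda.init(), rather than import pycuda.autoinit.
- Create the pycuda context before initializing TensorRT: add self.cfx = cuda.Device(0).make_context() at the very beginning of the TensorRT initialization.
- In the inference method, add self.cfx.push() at the very beginning and self.cfx.pop() at the very end.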
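One weakness of a bare push()/pop() pair is that pop() is skipped if inference raises in between, which leaves the context on the stack. A minimal sketch of one way to guard against that is a small context manager. The names active_context and FakeContext are hypothetical and are not part of pycuda or of the example in this post; FakeContext only stands in for the object returned by cuda.Device(0).make_context() so the sketch runs without a GPU.

```python
from contextlib import contextmanager


@contextmanager
def active_context(ctx):
    """Push ctx on entry and always pop it on exit, even if the body raises.

    ctx is assumed to be any object exposing push() and pop(), such as the
    self.cfx context created in the steps above.
    """
    ctx.push()
    try:
        yield ctx
    finally:
        ctx.pop()


class FakeContext:
    """Hypothetical stand-in for a pycuda context; it only records the calls."""

    def __init__(self):
        self.calls = []

    def push(self):
        self.calls.append("push")

    def pop(self):
        self.calls.append("pop")


def run_inference(ctx):
    # The body of the with-block is where the TensorRT execution would go.
    with active_context(ctx):
        ctx.calls.append("infer")
    return ctx.calls
```

Inside the inference method this would read: with active_context(self.cfx): followed by the existing inference body, replacing the explicit push() and pop() calls.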
My own example: focus on the numbered comments and note where each call is placed.
import pycuda.driver as cuda
import tensorrt as trt
from multiprocessing import Process


def init():  # 1. Initialize the CUDA driver at the start of each child process
    cuda.init()


class HostDeviceMem(object):
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()


TRT_LOGGER = trt.Logger(trt.Logger.ERROR)


class INFER_FAST(object):
    def __init__(self):
        self.cfx=