When quantizing a model with TensorRT on an A10 GPU, I got the following error.
[W] [TRT] Calibration Profile is not defined. Running calibration with Profile 0
[I] calib data processed : 0/4680batch
[E] [TRT] 1: [calibrator.cpp::add::779] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[E] [TRT] [executionContext.cpp::commonEmitDebugTensor::1258] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[E] [TRT] [executionContext.cpp::executeInternal::610] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[F] [TRT] [defaultAllocator.cpp::free::85] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[F] [TRT] [defaultAllocator.cpp::free::85] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[F] [TRT] [defaultAllocator.cpp::free::85] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::CudaRuntimeError'
what(): an illegal memory access was encountered
Aborted
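This is caused by an architecture mismatch: the A10 has compute capability 8.6, so 86 has to be added to the list of SM architectures the CUDA code is compiled for.
So I modified my Makefile. Below is the relevant fragment. The line was originally SMS ?= 60 61 62 70 72 75, and I appended 86 to it.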
# Gencode arguments
SMS ?= 60 61 62 70 72 75 86
ifeq ($(GENCODE_FLAGS),)
# Generate SASS code for each SM architecture listed in $(SMS)
$(foreach sm,$(SMS),$(eval GENCODE_FLAGS += -gencode arch=compute_$(sm),code=sm_$(sm)))
endif
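To make it easier to see what the foreach line expands to, here is a small Python sketch that mirrors it and checks whether a given compute capability is in the list. The helper names gencode_flags and covers are hypothetical and exist only for illustration; they are not part of TensorRT, CUDA, or the Makefile. The check is deliberately simplified to an exact match on the SM number, so it ignores the fact that a binary built for a lower minor version of the same major architecture, or embedded PTX, may also run on the device.

```python
def gencode_flags(sms):
    """Mirror the Makefile foreach: one -gencode entry per SM architecture."""
    return [f"-gencode arch=compute_{sm},code=sm_{sm}" for sm in sms]


def covers(sms, capability):
    """Return True if the SM list has an exact entry for the capability.

    `capability` is a string such as "8.6", which maps to the SM number "86".
    Simplification: exact match only (hypothetical helper for illustration).
    """
    return capability.replace(".", "") in sms


# The original SMS value from the Makefile, and the value after appending 86.
old_sms = "60 61 62 70 72 75".split()
new_sms = old_sms + ["86"]

print(covers(old_sms, "8.6"))      # False: no SASS for the A10 in the old list
print(covers(new_sms, "8.6"))      # True
print(gencode_flags(new_sms)[-1])  # -gencode arch=compute_86,code=sm_86
```

After changing SMS, the CUDA sources have to be rebuilt (for example with a clean build) so that the new -gencode flag actually takes effect.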