Integrating the darknet dynamic library into company software throws an error: CUDNN_STATUS_BAD_PARAM

Related article: Implementing a darknet prediction/classification dynamic library yourself

The error:

[5956]  MPTLOG 12256 cuDNN status Error in: file: convolutional_layer.c : cudnn_convolutional_setup() : line: 237 : build time: Dec 13 2019 - 11:54:32 status:3

[5956]  MPTLOG 12256 cuDNN status Error in: file: convolutional_layer.c : cudnn_convolutional_setup() : line: 237 : build time: Dec 13 2019 - 11:54:32 status:3

status=3:CUDNN_STATUS_BAD_PARAM
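
The raw status in the log is a cudnnStatus_t value; a tiny standalone check (a sketch using nothing beyond the standard cudnnGetErrorString() call) maps it back to its symbolic name:

#include <stdio.h>
#include <cudnn.h>

// Translate the raw status value from the log (3) into its symbolic cuDNN name.
int main(void) {
    cudnnStatus_t status = (cudnnStatus_t)3;
    printf("status %d = %s\n", (int)status, cudnnGetErrorString(status));   // prints CUDNN_STATUS_BAD_PARAM
    return 0;
}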

The offending code:

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif

Understanding the code:

For the supported GPUs, the Tensor Core operations will be triggered for convolution functions only when cudnnSetConvolutionMathType() is called on the appropriate convolution descriptor by setting the mathType to CUDNN_TENSOR_OP_MATH or CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION.


3.180. cudnnSetConvolutionMathType()
cudnnStatus_t cudnnSetConvolutionMathType(
    cudnnConvolutionDescriptor_t    convDesc,
    cudnnMathType_t                 mathType)
This function allows the user to specify whether or not the use of tensor op is permitted in the library routines associated with a given convolution descriptor.

Returns
CUDNN_STATUS_SUCCESS
The math type was set successfully.

CUDNN_STATUS_BAD_PARAM
Either an invalid convolution descriptor was provided or an invalid math type was specified.

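To make the parameter requirements concrete, here is a minimal self-contained sketch of the usual call sequence (the 3x3/stride-1 descriptor settings are illustrative values, not taken from darknet, and a plain check() helper stands in for darknet's CHECK_CUDNN macro):

#include <stdio.h>
#include <cudnn.h>

static void check(cudnnStatus_t s, const char *what) {
    if (s != CUDNN_STATUS_SUCCESS)
        fprintf(stderr, "%s failed: %s\n", what, cudnnGetErrorString(s));
}

int main(void) {
    cudnnConvolutionDescriptor_t convDesc;
    check(cudnnCreateConvolutionDescriptor(&convDesc), "create descriptor");

    // Illustrative 3x3 convolution: pad 1, stride 1, dilation 1, FP32 compute type.
    check(cudnnSetConvolution2dDescriptor(convDesc, 1, 1, 1, 1, 1, 1,
                                          CUDNN_CROSS_CORRELATION, CUDNN_DATA_FLOAT),
          "set 2d descriptor");

#if ((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    // CUDNN_STATUS_BAD_PARAM here means convDesc or the math type argument is invalid.
    check(cudnnSetConvolutionMathType(convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION),
          "set math type");
#endif

    cudnnDestroyConvolutionDescriptor(convDesc);
    return 0;
}

Per the documentation above, a BAD_PARAM from cudnnSetConvolutionMathType() can only come from the two arguments shown.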

From the cuDNN release notes:

  • A new mode CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is added to cudnnMathType_t. The computation time for FP32 tensors can be reduced by selecting this mode.
  • The functions cudnnRNNForwardInference(), cudnnRNNForwardTraining(), cudnnRNNBackwardData(), and cudnnRNNBackwardWeights() will now perform down conversion of FP32 input/output only when CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is set.
  • Improved the heuristics for cudnnGet*Algorithm() functions.

The following issues and limitations exist in this release:

  • When tensor cores are enabled in cuDNN 7.3.0, the wgrad calculations will perform an illegal memory access when K and C values are both non-integral multiples of 8. This will not likely produce incorrect results, but may corrupt other memory depending on the user buffer locations. This issue is present on Volta & Turing architectures.
  • Using cudnnGetConvolution*_v7 routines with cudnnConvolutionDescriptor_t set to CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION leads to incorrect outputs. These incorrect outputs will consist only of the CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases, instead of also returning the performance results for both DEFAULT_MATH and CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases.

 

If CUDNN is removed from the Visual Studio project's C/C++ preprocessor settings, the program runs without this error;

however, with CUDNN removed the predictions come out as NaN.

Since the failing code only enables Tensor Core acceleration for the convolution, commenting it out resolves the problem:

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    //CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
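
An alternative to deleting the call outright (just a sketch, reusing darknet's l->convDesc as in the code above, so it assumes the descriptor has already been created) is to keep trying to enable Tensor Core math but fall back to CUDNN_DEFAULT_MATH instead of aborting when the call fails:

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    {
        cudnnStatus_t s = cudnnSetConvolutionMathType(l->convDesc,
                              CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION);
        if (s != CUDNN_STATUS_SUCCESS) {
            // Tensor Core math could not be enabled (e.g. CUDNN_STATUS_BAD_PARAM);
            // log it and keep the default math mode instead of aborting.
            fprintf(stderr, "cudnnSetConvolutionMathType: %s, falling back to CUDNN_DEFAULT_MATH\n",
                    cudnnGetErrorString(s));
            cudnnSetConvolutionMathType(l->convDesc, CUDNN_DEFAULT_MATH);
        }
    }
#endif

Like commenting the call out, this only affects the optional Tensor Core acceleration; the convolution itself still runs through cuDNN, so the NaN problem seen when CUDNN is removed entirely does not come back.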

Possible causes of the error:

1. The error occurs when another device/process is using the GPU.

2. A wrong parameter, i.e. invalid data was accessed.

Official documentation: https://docs.nvidia.com/deeplearning/sdk/cudnn-archived/cudnn_701/cudnn-user-guide/index.html

CUDNN_STATUS_BAD_PARAM

An incorrect value or parameter was passed to the function.

To correct: ensure that all the parameters being passed have valid values.
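
One parameter-level check worth doing (a suggestion, assuming a possible header/library mismatch rather than anything confirmed here) is to compare the cuDNN version the code was compiled against with the version actually loaded at run time, since CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION only exists from cuDNN 7.2 onward:

#include <stdio.h>
#include <cudnn.h>

// Compare compile-time cuDNN headers against the cuDNN library loaded at run time.
int main(void) {
    size_t runtime = cudnnGetVersion();   // version of the loaded cudnn64_*.dll / libcudnn
    printf("compiled against cuDNN %d.%d.%d (CUDNN_VERSION=%d), runtime library reports %lu\n",
           CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL, CUDNN_VERSION,
           (unsigned long)runtime);
    if ((size_t)CUDNN_VERSION != runtime)
        printf("warning: header/library mismatch\n");
    return 0;
}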

Since the error does not occur when the library is tested on its own, and also does not occur when it is tested side by side with the company software, but only appears when it is called from inside the company software, the parameters themselves should be fine; most likely something else was using the GPU when the error was raised.

References:

https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/

Caffe compatibility issue with cuDNN 6.0: CUDNN_STATUS_BAD_PARAM

Cause of the Caffe error: cudnn.hpp:86] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM

Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
