Error: Assertion failed (localThreads[0] * localThreads[1] * localThreads[2] <= kernelWorkGroupSize) in void cv::ocl::openCLVerifyKernel(const cv::ocl::Context*, cl_kernel, std::size_t*)
Error location: in cl_operations.cpp, the function void openCLExecuteKernel(Context *ctx, const cv::ocl::ProgramEntry* source, string kernelName, size_t globalThreads[3], size_t localThreads[3], vector< pair<size_t, const void *> > &args, int channels, int depth, const char *build_options)
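Fix: print kernelName inside this function, rebuild OpenCV as described in the previous post, and run the program again to see which kernel triggers the assertion. Then open the corresponding .cpp file in the ocl module source and adjust that kernel's localThreads array so that it stays within the limits.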
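As a rough sketch of the print step, the helper below formats the kernel name together with its local sizes. describeKernelLaunch is a hypothetical name, not part of OpenCV; it assumes you paste it into cl_operations.cpp yourself and call it near the top of openCLExecuteKernel, and it treats a null localThreads pointer defensively.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical debug helper (not an OpenCV function): builds one log line
// with the kernel name and the requested local work-group dimensions.
std::string describeKernelLaunch(const std::string& kernelName,
                                 const std::size_t* localThreads) {
    std::ostringstream out;
    out << "[ocl] kernel=" << kernelName << " localThreads=";
    if (localThreads) {
        out << localThreads[0] << "x" << localThreads[1] << "x" << localThreads[2];
    } else {
        out << "(null)"; // no explicit local size was passed
    }
    return out.str();
}
```

Inside openCLExecuteKernel the call would look something like std::cerr << describeKernelLaunch(kernelName, localThreads) << std::endl; the last line printed before the assertion fires identifies the offending kernel.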
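Two additional notes:
Each kernel's localThreads array specifies the work-group size, i.e. the number of work-items per work-group in each dimension. Its values must satisfy the following requirements: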
CV_Assert( localThreads[0] <= ctx->getDeviceInfo().maxWorkItemSizes[0] );
CV_Assert( localThreads[1] <= ctx->getDeviceInfo().maxWorkItemSizes[1] );
CV_Assert( localThreads[2] <= ctx->getDeviceInfo().maxWorkItemSizes[2] );
CV_Assert( localThreads[0] * localThreads[1] * localThreads[2] <= kernelWorkGroupSize );
CV_Assert( localThreads[0] * localThreads[1] * localThreads[2] <= ctx->getDeviceInfo().maxWorkGroupSize );
These thresholds vary with the GPU model, and kernelWorkGroupSize additionally depends on the kernel itself.
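To make the five checks above concrete, here is a minimal, self-contained sketch that tests a localThreads array against a set of limits and shrinks it until it fits. WorkGroupLimits, fitsLimits and shrinkToFit are hypothetical names introduced for illustration, not OpenCV APIs, and the halving strategy is only one possible choice; in a real fix you would plug in the values reported for your device and kernel.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the limits OpenCV reads from the device and kernel.
struct WorkGroupLimits {
    std::size_t maxWorkItemSizes[3];  // per-dimension device limit
    std::size_t maxWorkGroupSize;     // device limit on the total work-group size
    std::size_t kernelWorkGroupSize;  // limit for this particular compiled kernel
};

// Returns true when localThreads passes all five checks listed above.
bool fitsLimits(const std::size_t localThreads[3], const WorkGroupLimits& lim) {
    for (int i = 0; i < 3; ++i) {
        if (localThreads[i] > lim.maxWorkItemSizes[i]) return false;
    }
    const std::size_t total = localThreads[0] * localThreads[1] * localThreads[2];
    return total <= lim.kernelWorkGroupSize && total <= lim.maxWorkGroupSize;
}

// Clamps each dimension to its per-dimension limit, then repeatedly halves
// the largest dimension (rounding up) until the total size fits.
void shrinkToFit(std::size_t localThreads[3], const WorkGroupLimits& lim) {
    assert(lim.kernelWorkGroupSize >= 1 && lim.maxWorkGroupSize >= 1);
    for (int i = 0; i < 3; ++i) {
        assert(lim.maxWorkItemSizes[i] >= 1);
        if (localThreads[i] > lim.maxWorkItemSizes[i]) localThreads[i] = lim.maxWorkItemSizes[i];
        if (localThreads[i] == 0) localThreads[i] = 1;
    }
    while (!fitsLimits(localThreads, lim)) {
        int largest = 0;
        for (int i = 1; i < 3; ++i) {
            if (localThreads[i] > localThreads[largest]) largest = i;
        }
        localThreads[largest] = (localThreads[largest] + 1) / 2;
    }
}
```

For example, with kernelWorkGroupSize = 128 a requested size of {16, 16, 1} (256 work-items) is reduced to {8, 16, 1}. Note that some kernels are written for a specific local size (for instance, they size local memory from it), so check the kernel source before changing the numbers.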