OpenCL: Reading a Variable-Size Result Buffer from the GPU

当年流水

Article link: https://blog.csdn.net/weixin_32334209/article/details/111959599

I have an OpenCL 1.1 search algorithm that works well with small amounts of data:

1.) Build the inputData array and pass it to the GPU.

2.) Create a very big resultData container (e.g. 200000 * sizeof(cl_uint)) and pass it as well.

3.) Create the resultSize container (initialized to zero), which can be accessed via atomic operations (at least I suppose so).

When one of my workers has a result, it copies it into the resultData buffer and increments resultSize with an atomic increment (until the buffer is full).

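Here is a code example (OpenCL code):

    // Reserve 5 slots; atomic_add returns the value before the addition.
    lastPosition = atomic_add(resultBufferSize, 5);
    // Buffer full: keep retrying until the host has reset the counter.
    while (lastPosition > RESULT_BUFFER_SIZE)
    {
        lastPosition = atomic_add(resultBufferSize, 5);
    }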
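One way to avoid waiting inside the kernel is to let a work-item give up when its reservation falls outside the buffer and to record that fact for the host. The following is a minimal sketch of that reservation logic, modelled in plain C11 with `<stdatomic.h>` so that it runs on the host; it is not the asker's code. `result_buffer`, `push_result`, `stored_elements`, the `dropped` counter and the capacity of 20 elements are hypothetical names and values chosen for illustration. In a kernel, `atomic_fetch_add` would correspond to OpenCL's `atomic_add`, which also returns the old value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RESULT_BUFFER_SIZE 20u /* capacity in elements (hypothetical value) */
#define RECORD_SIZE 5u         /* elements per result, as in the question */

typedef struct {
    atomic_uint reserved;              /* plays the role of resultBufferSize */
    atomic_uint dropped;               /* results that did not fit (hypothetical) */
    unsigned data[RESULT_BUFFER_SIZE]; /* plays the role of resultData */
} result_buffer;

/* Try to append one record. Returns false when the buffer is full.
 * It never spins: an overflowing caller only bumps the dropped counter. */
static bool push_result(result_buffer *rb, const unsigned record[RECORD_SIZE])
{
    assert(rb != NULL && record != NULL);
    /* atomic_fetch_add returns the value before the addition */
    unsigned start = atomic_fetch_add(&rb->reserved, RECORD_SIZE);
    if (start + RECORD_SIZE > RESULT_BUFFER_SIZE) {
        atomic_fetch_add(&rb->dropped, 1u);
        return false;
    }
    for (unsigned i = 0; i < RECORD_SIZE; ++i)
        rb->data[start + i] = record[i];
    return true;
}

/* Number of valid elements in data. The reserved counter may have run
 * past the capacity, so clamp it to the last whole record that fits. */
static unsigned stored_elements(result_buffer *rb)
{
    unsigned r = atomic_load(&rb->reserved);
    if (r > RESULT_BUFFER_SIZE)
        return (RESULT_BUFFER_SIZE / RECORD_SIZE) * RECORD_SIZE;
    return r;
}
```

With this shape the host can read `dropped` after the kernel finishes and decide whether to rerun part of the work, instead of the work-items waiting for a reset that cannot arrive while the kernel is still running.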

And on the host side I read the buffer and set resultBufferSize to zero:

    resultBufferSize = 0;
    oclErr |= clEnqueueWriteBuffer(gpuAcces.getCqCommandQueue(), cm_resultBufferSize, CL_TRUE, 0, sizeof(cl_uint), (void*)&resultBufferSize, 0, NULL, NULL);

Now my problem is:

I have many more results than resultData can store, and in any case I cannot know the size of the result in advance (e.g. how many paths will be found).

My idea:

From time to time I would empty (or process) the container on the host side and reset resultSize when the buffer is full, while the workers wait in a while loop.

I like this idea because I can also process the data in parallel on the host.
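A variation on this idea, sketched below under stated assumptions, moves the waiting out of the kernel: the host launches the work in chunks, drains the buffer after each launch, and retries a chunk with half the size if it overflowed. This is a plain C model, not OpenCL host code. `launch`, `results_of`, `drain_all` and `CAPACITY` are hypothetical stand-ins; `launch` only counts results instead of running a kernel, and the sketch assumes that the results of a single work-item always fit into the buffer.

```c
#include <assert.h>
#include <stddef.h>

#define CAPACITY 8u /* hypothetical result-buffer capacity, in results */

/* Hypothetical stand-in for the search: results produced by one work-item. */
static unsigned results_of(unsigned item) { return item % 3u; }

/* Stand-in for one kernel launch over items [first, first + count).
 * Stores at most CAPACITY results and returns how many did not fit. */
static unsigned launch(unsigned first, unsigned count, unsigned *stored)
{
    unsigned produced = 0;
    for (unsigned i = first; i < first + count; ++i)
        produced += results_of(i);
    *stored = produced < CAPACITY ? produced : CAPACITY;
    return produced - *stored;
}

/* Host loop: launch a chunk, drain it, and halve the chunk on overflow.
 * Returns the number of results consumed; *launches counts the launches. */
static unsigned drain_all(unsigned total_items, unsigned *launches)
{
    assert(launches != NULL);
    unsigned first = 0, chunk = total_items, consumed = 0;
    *launches = 0;
    while (first < total_items) {
        if (chunk > total_items - first)
            chunk = total_items - first;
        unsigned stored;
        unsigned dropped = launch(first, chunk, &stored);
        ++*launches;
        if (dropped > 0 && chunk > 1) {
            chunk /= 2; /* overflow: discard this attempt, retry smaller */
            continue;
        }
        consumed += stored; /* the host would process the results here
                               and reset the counter before the next launch */
        first += chunk;
    }
    return consumed;
}
```

In real host code, `first` and `chunk` would most likely map to the `global_work_offset` and `global_work_size` arguments of `clEnqueueNDRangeKernel`. The cost of this approach is that an overflowing chunk is computed twice.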

But I have not been able to implement a solution for this yet:

1.) NVIDIA cannot work with an endless while loop, or at least I cannot use one. When I tried an endless loop, the card crashed.

2.) barrier() and mem_fence() can handle synchronization issues, but not this one.

Do you have a robust idea for handling result sets whose size is not fixed (e.g. in search problems)? I am almost sure there must be good patterns, but I cannot find them.

Is there any kind of sleep in NVIDIA's OpenCL? I would put it into the endless loop; maybe that would help a bit.

I guess variable-size results are an old issue, so there must be established patterns.

I had a similar issue in my earlier post (but the context was different).

Solution

You did not state explicitly that you are using Windows, but I assume so because your question carries the VS2013 tag.

The NVIDIA card does not crash. On Windows, the WDDM driver has Timeout Detection & Recovery (TDR), which restarts the GPU driver if it becomes unresponsive. You can easily disable this "feature" with Nsight. Be aware, however, that doing so may cause problems with your desktop environment, so make sure your kernel finishes in a tolerable amount of time. With TDR disabled you can run very long kernels even on Windows with NVIDIA's OpenCL implementation.
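To keep each launch within a tolerable time without touching TDR, the host can size the next launch from the measured duration of the previous one. The function below is a minimal sketch of that arithmetic; it assumes that the runtime grows roughly in proportion to the number of work-items, which may not hold for an irregular search. `next_chunk` and its parameters are hypothetical names, and the time budget is up to the caller (the default TDR delay on Windows is commonly documented as 2 seconds, so a budget well below that seems prudent).

```c
#include <assert.h>
#include <stddef.h>

/* Scale the work-item count of the next launch so that its runtime
 * approaches budget_ms, given that `chunk` items took elapsed_ms.
 * The result is clamped to the range [1, max_chunk]. */
static size_t next_chunk(size_t chunk, double elapsed_ms,
                         double budget_ms, size_t max_chunk)
{
    assert(chunk > 0 && budget_ms > 0.0 && max_chunk > 0);
    if (elapsed_ms <= 0.0)
        elapsed_ms = 0.001; /* timer too coarse: treat as very fast */
    double scaled = (double)chunk * (budget_ms / elapsed_ms);
    if (scaled < 1.0)
        return 1;
    if (scaled > (double)max_chunk)
        return max_chunk;
    return (size_t)scaled;
}
```

For example, if 1000 work-items took 400 ms and the budget is 100 ms, the next launch would use 250 work-items.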