1. Why not just use pycuda.autoinit?
import pycuda.autoinit
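A: Automatic initialization often does not work well. It creates a context at import time and makes it current only in the importing thread, so it tends to break in cases such as multithreaded programs.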
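To illustrate the multithreading case, here is a minimal sketch in which each thread creates and pops its own context instead of relying on pycuda.autoinit. The class name GPUWorker and the injectable driver parameter are hypothetical, added only so the pattern can be tried without a GPU. With driver left as None the code assumes pycuda.driver is installed.

```python
import threading


class GPUWorker(threading.Thread):
    """Hypothetical worker thread that owns a CUDA context for its lifetime."""

    def __init__(self, device_id, work, driver=None):
        super().__init__()
        self.device_id = device_id
        self.work = work        # callable executed while the context is current
        self.driver = driver    # assumption: injectable stand-in for pycuda.driver
        self.result = None

    def run(self):
        driver = self.driver
        if driver is None:
            # Imported lazily so the class can be defined on a machine without a GPU.
            import pycuda.driver as driver
        driver.init()
        # A context is current only in the thread that created (or pushed) it,
        # so it is created here, inside the worker thread.
        ctx = driver.Device(self.device_id).make_context()
        try:
            self.result = self.work()
        finally:
            # Pop in the same thread so nothing is left on its context stack.
            ctx.pop()
```

A worker would be started with GPUWorker(0, some_function).start() and collected with join(). Here some_function stands for whatever GPU work the thread should do.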
2. The program runs correctly but reports an error on exit. What is the cause?
PyCUDA ERROR: The context stack was not empty upon module cleanup.
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
Aborted (core dumped)
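A: The program did not shut down cleanly. A context created with make_context() stays on the context stack, so it must be popped before the program exits: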
ctx = cuda.Device(0).make_context()
...
ctx.pop()
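If the code between make_context() and ctx.pop() raises an exception, the pop is skipped and the same error shows up at exit. One way to guard against that is a small context manager, sketched below. The name device_context and the injectable driver parameter are hypothetical and not part of PyCUDA. With driver left as None the sketch assumes pycuda.driver is installed.

```python
from contextlib import contextmanager


@contextmanager
def device_context(device_id=0, driver=None):
    """Create a CUDA context on entry and always pop it on exit."""
    if driver is None:
        # Imported lazily so the helper can be defined without a GPU present.
        import pycuda.driver as driver
    driver.init()
    ctx = driver.Device(device_id).make_context()
    try:
        yield ctx
    finally:
        # Runs even when the body raises, so the context stack is empty at exit.
        ctx.pop()
```

It would be used as "with device_context(0): ..." around the GPU work, which keeps the pop next to the creation and removes the need to remember it on every exit path.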
The complete code is as follows:
# import pycuda.autoinit
import pycuda.driver as cuda
import numpy
from pycuda.compiler import SourceModule

# export PATH=/usr/local/cuda/bin:$PATH   (SourceModule needs nvcc on PATH)
cuda.init()


def dott(array_a, array_b):
    # ctx.push()
    # Element-wise product kernel: one thread per element, in a single block.
    mod = SourceModule("""
    __global__ void dot(float *dest, float *a, float *b)
    {
        const int i = threadIdx.x;
        dest[i] = a[i] * b[i];
    }
    """)
    dot = mod.get_function("dot")
    dest = numpy.zeros_like(array_a)
    # Size the block from the argument, not from the global variable `a`.
    dot(cuda.Out(dest), cuda.In(array_a), cuda.In(array_b),
        block=(len(array_a), 1, 1), grid=(1, 1))
    return dest


if __name__ == "__main__":
    a = numpy.random.normal(size=40).astype(numpy.float32)
    b = numpy.random.normal(size=40).astype(numpy.float32)
    ctx = cuda.Device(0).make_context()
    try:
        print(dott(a, b))
    finally:
        # Pop the context even if the kernel call fails.
        ctx.pop()
    # del ctx