随机数在一般编程中都会涉及,特别是在仿真过程中。我之前有几篇文章介绍了如何通过CPU函数调用产生随机数,这里我介绍如何在CUDA中产生随机数。在CUDA中,主要使用cuRAND库中的函数来产生随机数,其链接文档为https://docs.nvidia.com/cuda/curand/index.html 。
通过文档可知,cuRAND库提供两种方式的随机数生成:host与device。关于具体如何生成随机数,我们可参考文档中的Host API Example 和Device API Examples。其中Host API Example的例子非常简单,这里不具体描述,唯一需要注意的点是,在VS中可能会出现error LNK2019的错误,此时需要在项目属性中添加依赖库,具体可以网上搜索答案。本文主要介绍如何参考Device API Examples的例子,来具体实现在核函数中产生随机数。为了便于介绍,我们首先给出了具体的代码实现:
#include"cuda_runtime.h"
#include"device_launch_parameters.h"
#include"curand_kernel.h"// this lib shoulb be included
#include<ctime>
#include<iostream>
#include<random>
using namespace std;
//-------------------generate random numbers-------//
__device__ float generate(curandState *globalState, int ind)
{
curandState localState = globalState[ind];
float RANDOM = curand_uniform(&localState);// uniform distribution
globalState[ind] = localState;
return RANDOM;
}
__global__ void setup_kernel(curandState *state, unsigned long seed)
{
int ix = threadIdx.x + blockIdx.x*blockDim.x;
int iy = threadIdx.y + blockIdx.y*blockDim.y;
int idx = iy * blockDim.x*gridDim.x + ix;
curand_init(seed, idx, 0, &state[idx]);// initialize the state
}
//-------------This is our kernel function where the random numbers generated------//
__global__ void our_kernel(curandState *globalState,int nx,int ny)
{
int ix = threadIdx.x + blockIdx.x*blockDim.x;
int iy = threadIdx.y + blockIdx.y*blockDim.y;
int idx = iy * blockDim.x*gridDim.x + ix;
if (ix < nx&&iy < ny)
{
int k = generate(globalState, idx) * 100000;
printf("%d\n", k);
}
}
int main()
{
int nx = 5;
int ny = 2;// generate nx*ny random numbers
int blockx = 32;
int blocky = 1;
dim3 block(blockx, blocky);//(32,1)
int gridx = (nx + block.x - 1) / block.x;
int gridy = (ny + block.y - 1) / block.y;
dim3 grid(gridx,gridy); //(1,10)
int N = gridx*gridy*blockx*blocky;// the number of states
//--------------------//
curandState* devStates;
cudaMalloc(&devStates, N * sizeof(curandState));
srand(time(0));
int seed = rand();
// Initialize the states
setup_kernel <<<grid, block>>> (devStates, seed);
our_kernel << <grid, block >> > (devStates,nx,ny);
cudaDeviceReset();
return 0;
}
从上面代码可以看出,主要的有两点:一是首先使用setup_kernel来初始化,二是使用generate来产生随机数。最需要注意的是,setup_kernel和our_kernel运行的维度一致<<<grid, block>>>。并且,devStates的维度要相同,这样确保devStates为每一个thread都生成对应的state.
最后给出上述代码结果图:
本文详细介绍了如何在CUDA编程环境中利用cuRAND库生成随机数。通过`setup_kernel`初始化随机数状态,并在核函数`our_kernel`中调用`generate`产生随机数。代码示例展示了如何配置CUDA的kernel,以及如何处理设备内存以确保每个线程拥有独立的随机数生成状态。
346

被折叠的 条评论
为什么被折叠?



