Host API
To use the cuRAND host API, include the header <curand.h>. The steps are as follows:
- Create a new generator of the desired type with curandCreateGenerator().
- Set the generator options; for example, use curandSetPseudoRandomGeneratorSeed() to set the seed.
- Allocate memory on the device with cudaMalloc().
- Generate random numbers with curandGenerate() or another generation function.
- Clean up with curandDestroyGenerator().
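To generate random numbers on the CPU instead, call curandCreateGeneratorHost() in the first step and allocate host memory in the third step to receive the results. The other steps are unchanged.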
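It is valid to create several generators at the same time. The sequence each generator produces is determined by its set-up parameters: generators of the same type configured with the same parameters produce the same sequence, whether they run on the GPU or on the CPU.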
A host memory pointer must not be passed to a generator that runs on the GPU, and a device memory pointer must not be passed to a generator that runs on the CPU.
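The five steps above can be sketched as a small program. This is a minimal illustration, not a reference implementation: the seed value, the buffer size, and the helper `check()` are arbitrary choices made for the example, and it assumes a CUDA-capable GPU and linking against cuRAND (for instance `nvcc example.cu -lcurand`).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <curand.h>

// Hypothetical helper for this sketch: abort with a message when a step fails.
static void check(bool ok, const char *what) {
    if (!ok) {
        fprintf(stderr, "failed: %s\n", what);
        exit(EXIT_FAILURE);
    }
}

int main() {
    const size_t n = 8;          // number of values to generate (arbitrary)
    float hostData[n];
    float *devData = nullptr;
    curandGenerator_t gen;

    // Step 1: create a generator that runs on the GPU.
    check(curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_XORWOW) == CURAND_STATUS_SUCCESS,
          "curandCreateGenerator");
    // Step 2: set generator options, here only the seed.
    check(curandSetPseudoRandomGeneratorSeed(gen, 1234ULL) == CURAND_STATUS_SUCCESS,
          "curandSetPseudoRandomGeneratorSeed");
    // Step 3: allocate device memory for the results.
    check(cudaMalloc((void **)&devData, n * sizeof(float)) == cudaSuccess, "cudaMalloc");
    // Step 4: generate n uniform floats into device memory.
    check(curandGenerateUniform(gen, devData, n) == CURAND_STATUS_SUCCESS,
          "curandGenerateUniform");
    // Copy the results back so the host can read them.
    check(cudaMemcpy(hostData, devData, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess,
          "cudaMemcpy");
    for (size_t i = 0; i < n; ++i) printf("%f\n", hostData[i]);

    // Step 5: clean up.
    check(curandDestroyGenerator(gen) == CURAND_STATUS_SUCCESS, "curandDestroyGenerator");
    cudaFree(devData);
    return 0;
}
```

For the CPU variant, the same program would use curandCreateGeneratorHost() and write directly into `hostData`, with no cudaMalloc() or cudaMemcpy().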
Generator Types
curandStatus_t curandCreateGenerator(curandGenerator_t *generator, curandRngType_t rng_type);
- generator: pointer to the generator
- rng_type: type of generator to create
A random number generator is created by passing the desired generator type.
Pseudorandom number generators:
- CURAND_RNG_PSEUDO_XORWOW
- CURAND_RNG_PSEUDO_MRG32K3A
- CURAND_RNG_PSEUDO_MTGP32
- CURAND_RNG_PSEUDO_PHILOX4_32_10
- CURAND_RNG_PSEUDO_MT19937 (host API only; requires architecture sm_35 or higher)
Quasirandom number generators:
- CURAND_RNG_QUASI_SOBOL32
- CURAND_RNG_QUASI_SCRAMBLED_SOBOL32
- CURAND_RNG_QUASI_SOBOL64
- CURAND_RNG_QUASI_SCRAMBLED_SOBOL64
Generator Options
curandStatus_t curandSetPseudoRandomGeneratorSeed(curandGenerator_t generator, unsigned long long seed);
curandStatus_t curandSetQuasiRandomGeneratorDimensions(curandGenerator_t generator, unsigned int num_dimensions);
curandStatus_t curandSetGeneratorOffset(curandGenerator_t generator, unsigned long long offset);
curandStatus_t curandSetGeneratorOrdering(curandGenerator_t generator, curandOrdering_t order);
curandStatus_t curandSetStream(curandGenerator_t generator, cudaStream_t stream);
Once a random number generator has been created, it can be configured with the general options seed, offset, and order.
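- Seed: a 64-bit integer that initializes the starting state of a pseudorandom number generator. The same seed always produces the same sequence of results.
- Offset: skips ahead in the sequence, so that the first random number returned is the one at position offset. This lets repeated runs of the same program draw from the same sequence without overlapping results. The offset option is not available for the CURAND_RNG_PSEUDO_MTGP32 and CURAND_RNG_PSEUDO_MT19937 generators.
- Order: selects how the results are ordered in global memory.
  - CURAND_ORDERING_PSEUDO_DEFAULT
  - CURAND_ORDERING_PSEUDO_LEGACY
  - CURAND_ORDERING_PSEUDO_BEST
  - CURAND_ORDERING_PSEUDO_SEEDED
  - CURAND_ORDERING_QUASI_DEFAULT (the option for quasirandom generators)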
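The effect of the seed option can be checked with a short experiment. The sketch below uses host generators so that no device memory is needed; the helper `fill()`, the seed values, and the array length are assumptions made only for this illustration.

```cuda
#include <cstdio>
#include <cstring>
#include <curand.h>

// Hypothetical helper: create a fresh host generator, apply the given seed
// and the default ordering, and fill out[0..n) with 32-bit values.
static bool fill(unsigned long long seed, unsigned int *out, size_t n) {
    curandGenerator_t gen;
    if (curandCreateGeneratorHost(&gen, CURAND_RNG_PSEUDO_XORWOW) != CURAND_STATUS_SUCCESS)
        return false;
    bool ok =
        curandSetPseudoRandomGeneratorSeed(gen, seed) == CURAND_STATUS_SUCCESS &&
        curandSetGeneratorOrdering(gen, CURAND_ORDERING_PSEUDO_DEFAULT) == CURAND_STATUS_SUCCESS &&
        curandGenerate(gen, out, n) == CURAND_STATUS_SUCCESS;
    curandDestroyGenerator(gen);
    return ok;
}

int main() {
    const size_t n = 16;
    unsigned int a[n], b[n], c[n];
    if (!fill(1234ULL, a, n) || !fill(1234ULL, b, n) || !fill(5678ULL, c, n)) {
        fprintf(stderr, "cuRAND call failed\n");
        return 1;
    }
    // Compare the three buffers byte by byte and report the outcome.
    printf("seed 1234 vs seed 1234 identical: %s\n", memcmp(a, b, sizeof a) == 0 ? "yes" : "no");
    printf("seed 1234 vs seed 5678 identical: %s\n", memcmp(a, c, sizeof a) == 0 ? "yes" : "no");
    return 0;
}
```

According to the description of the seed option, the first comparison should report identical sequences.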
Return Values
All cuRAND host library calls have a return value of curandStatus_t. Calls that succeed without errors return CURAND_STATUS_SUCCESS.
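Because every host call reports its status this way, a common pattern is to wrap each call in a checking macro. The macro name `CURAND_CHECK` below is a convention chosen for this sketch, not something cuRAND provides.

```cuda
#include <cstdio>
#include <cstdlib>
#include <curand.h>

// Hypothetical convenience macro: evaluate a cuRAND call once and abort
// with the numeric status and source location if it did not succeed.
#define CURAND_CHECK(call)                                              \
    do {                                                                \
        curandStatus_t status_ = (call);                                \
        if (status_ != CURAND_STATUS_SUCCESS) {                         \
            fprintf(stderr, "cuRAND error %d at %s:%d\n",               \
                    (int)status_, __FILE__, __LINE__);                  \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

int main() {
    curandGenerator_t gen;
    unsigned int values[4];

    CURAND_CHECK(curandCreateGeneratorHost(&gen, CURAND_RNG_PSEUDO_MRG32K3A));
    CURAND_CHECK(curandSetPseudoRandomGeneratorSeed(gen, 42ULL));
    CURAND_CHECK(curandGenerate(gen, values, 4));
    CURAND_CHECK(curandDestroyGenerator(gen));

    puts("all calls returned CURAND_STATUS_SUCCESS");
    return 0;
}
```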
Generation Functions
curandStatus_t curandGenerate(curandGenerator_t generator, unsigned int *outputPtr, size_t num);
curandStatus_t curandGenerateLongLong(curandGenerator_t generator, unsigned long long *outputPtr, size_t num);
curandGenerate() generates 32-bit unsigned int pseudorandom or quasirandom values with the XORWOW, MRG32k3a, MTGP32, MT19937, Philox_4x32_10, and SOBOL32 generators; every bit of each value is random.
curandGenerateLongLong() generates 64-bit unsigned long long random values with the SOBOL64 generators.
curandStatus_t curandGenerateUniform(curandGenerator_t generator, float *outputPtr, size_t num);
Generates uniformly distributed float values in the interval (0.0, 1.0]: 0.0 is excluded and 1.0 is included.
curandStatus_t curandGenerateNormal(curandGenerator_t generator, float *outputPtr, size_t n, float mean, float stddev);
Generates normally distributed float values with mean mean and standard deviation stddev.
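curandStatus_t curandGenerateLogNormal(curandGenerator_t generator, float *outputPtr, size_t n, float mean, float stddev);
Generates log-normally distributed float values, where mean and stddev are the mean and standard deviation of the underlying normal distribution.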
curandStatus_t curandGeneratePoisson(curandGenerator_t generator, unsigned int *outputPtr, size_t n, double lambda);
Generates Poisson-distributed unsigned int values with parameter lambda.
curandStatus_t curandGenerateUniformDouble(curandGenerator_t generator, double *outputPtr, size_t num);
Generates uniformly distributed double-precision values.
curandStatus_t curandGenerateNormalDouble(curandGenerator_t generator, double *outputPtr, size_t n, double mean, double stddev);
Generates normally distributed double-precision values.
curandStatus_t curandGenerateLogNormalDouble(curandGenerator_t generator, double *outputPtr, size_t n, double mean, double stddev);
Generates log-normally distributed double-precision values.
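As a rough illustration of the distribution functions, the sketch below draws normal and Poisson samples with a host generator and prints their sample means. The parameter values (mean 5, standard deviation 2, lambda 3), the seed, and the sample count are arbitrary choices for the example. The count is kept even because cuRAND requires an even n when generating normal values with a pseudorandom generator.

```cuda
#include <cstdio>
#include <vector>
#include <curand.h>

int main() {
    const size_t n = 100000;  // even, as required for normal generation
    curandGenerator_t gen;
    if (curandCreateGeneratorHost(&gen, CURAND_RNG_PSEUDO_PHILOX4_32_10) != CURAND_STATUS_SUCCESS ||
        curandSetPseudoRandomGeneratorSeed(gen, 2024ULL) != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "generator setup failed\n");
        return 1;
    }

    std::vector<float> normal(n);
    std::vector<unsigned int> poisson(n);

    // Host generator, so the output pointers refer to host memory.
    if (curandGenerateNormal(gen, normal.data(), n, 5.0f, 2.0f) != CURAND_STATUS_SUCCESS ||
        curandGeneratePoisson(gen, poisson.data(), n, 3.0) != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "generation failed\n");
        curandDestroyGenerator(gen);
        return 1;
    }

    // Accumulate in double to limit rounding error in the sums.
    double sumNormal = 0.0, sumPoisson = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sumNormal += normal[i];
        sumPoisson += poisson[i];
    }
    printf("sample mean (normal, expected near 5):  %f\n", sumNormal / n);
    printf("sample mean (Poisson, expected near 3): %f\n", sumPoisson / n);

    curandDestroyGenerator(gen);
    return 0;
}
```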
Device API
To use the cuRAND device API, include the header <curand_kernel.h>. The device API covers both pseudorandom generation and quasirandom generation.
init
__device__ void curand_init(unsigned long long seed, unsigned long long sequence,unsigned long long offset, curandState_t *state);
__device__ unsigned int curand (curandState_t *state);
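After curand_init() has been called, curand() returns a sequence of pseudorandom numbers. If the state is the same, curand() generates the same sequence. curand_init() sets up a caller-allocated initial state using the given seed, sequence number, and offset within the sequence. Different seeds are guaranteed to produce different starting states and different sequences. The same seed always produces the same state and the same sequence.
For the highest-quality parallel pseudorandom number generation, each experiment should be assigned a unique seed. Within an experiment, each thread of computation should be assigned a unique sequence number.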
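A common way to follow this advice is to give every thread the same seed and use the global thread index as the sequence number. The sketch below assumes that layout; the kernel names `setupStates` and `drawOne`, the launch configuration, and the seed are illustrative choices only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// One state per thread: same seed for the whole experiment,
// unique sequence number (the global thread index) per thread.
__global__ void setupStates(curandState_t *states, unsigned long long seed) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    curand_init(seed, id, 0, &states[id]);
}

// Each thread draws one 32-bit value from its own state.
__global__ void drawOne(curandState_t *states, unsigned int *out) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    curandState_t local = states[id];  // work on a register copy
    out[id] = curand(&local);
    states[id] = local;                // save the advanced state for later calls
}

int main() {
    const int threads = 64;
    curandState_t *states = nullptr;
    unsigned int *devOut = nullptr;
    unsigned int hostOut[threads];

    if (cudaMalloc((void **)&states, threads * sizeof(curandState_t)) != cudaSuccess ||
        cudaMalloc((void **)&devOut, threads * sizeof(unsigned int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    setupStates<<<1, threads>>>(states, 1234ULL);
    drawOne<<<1, threads>>>(states, devOut);

    // cudaMemcpy waits for the kernels above and reports any error they raised.
    if (cudaMemcpy(hostOut, devOut, sizeof(hostOut), cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed\n");
        return 1;
    }
    for (int i = 0; i < 4; ++i) printf("thread %d: %u\n", i, hostOut[i]);

    cudaFree(states);
    cudaFree(devOut);
    return 0;
}
```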
Distributions
__device__ float curand_uniform(curandState_t *state);
__device__ float curand_normal(curandState_t *state);
__device__ float curand_log_normal(curandState_t *state, float mean, float stddev);
__device__ unsigned int curand_poisson(curandState_t *state, double lambda);
__device__ double curand_uniform_double(curandState_t *state);
__device__ double curand_log_normal_double(curandState_t *state, double mean, double stddev);
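The distribution functions are called the same way as curand(), with a per-thread state. A minimal sketch, assuming one sample of each kind per thread, is shown below; the kernel name `sampleKernel`, the seed, and lambda = 4 are arbitrary. Note that curand_normal() takes no mean or standard deviation, so the result is a standard normal value that the caller scales and shifts if needed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

__global__ void sampleKernel(float *uniform, float *normal, unsigned int *poisson,
                             unsigned long long seed) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    curandState_t state;
    curand_init(seed, id, 0, &state);        // unique sequence per thread

    uniform[id] = curand_uniform(&state);    // uniform float
    normal[id]  = curand_normal(&state);     // standard normal: mean 0, stddev 1
    poisson[id] = curand_poisson(&state, 4.0);
}

int main() {
    const int n = 256;
    float *dUniform = nullptr, *dNormal = nullptr;
    unsigned int *dPoisson = nullptr;
    float hUniform[n], hNormal[n];
    unsigned int hPoisson[n];

    if (cudaMalloc((void **)&dUniform, n * sizeof(float)) != cudaSuccess ||
        cudaMalloc((void **)&dNormal, n * sizeof(float)) != cudaSuccess ||
        cudaMalloc((void **)&dPoisson, n * sizeof(unsigned int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    sampleKernel<<<1, n>>>(dUniform, dNormal, dPoisson, 99ULL);

    if (cudaMemcpy(hUniform, dUniform, sizeof(hUniform), cudaMemcpyDeviceToHost) != cudaSuccess ||
        cudaMemcpy(hNormal, dNormal, sizeof(hNormal), cudaMemcpyDeviceToHost) != cudaSuccess ||
        cudaMemcpy(hPoisson, dPoisson, sizeof(hPoisson), cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed\n");
        return 1;
    }

    double su = 0.0, sn = 0.0, sp = 0.0;
    for (int i = 0; i < n; ++i) { su += hUniform[i]; sn += hNormal[i]; sp += hPoisson[i]; }
    printf("mean uniform: %f\nmean normal: %f\nmean poisson: %f\n", su / n, sn / n, sp / n);

    cudaFree(dUniform);
    cudaFree(dNormal);
    cudaFree(dPoisson);
    return 0;
}
```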
Quasirandom Sequences
State initialization
__device__ void curand_init(unsigned int *direction_vectors, unsigned int offset,curandStateSobol32_t *state);
__device__ void curand_init(unsigned int *direction_vectors,unsigned int scramble_c, unsigned int offset,curandStateScrambledSobol32_t *state);
__device__ void curand_init(unsigned long long *direction_vectors, unsigned long long offset,curandStateSobol64_t *state);
__device__ void curand_init(unsigned long long *direction_vectors,unsigned long long scramble_c, unsigned long long offset,curandStateScrambledSobol64_t *state);
curand_init() initializes the state of a quasirandom number generator. It takes no seed parameter. The scrambled variants take an additional parameter, scramble_c, which is the initial value of the scrambled sequence.
For the curandStateSobol32_t and curandStateScrambledSobol32_t types, the direction vectors are an array of 32 unsigned integer values.
For the curandStateSobol64_t and curandStateScrambledSobol64_t types, the direction vectors are an array of 64 unsigned long long values.
Generating random numbers
__device__ unsigned int curand(curandStateSobol32_t *state);
__device__ float curand_uniform(curandStateSobol32_t *state);
__device__ float curand_normal(curandStateSobol32_t *state);
__device__ float curand_log_normal(curandStateSobol32_t *state,float mean, float stddev);
__device__ unsigned int curand_poisson(curandStateSobol32_t *state, double lambda);
__device__ double curand_uniform_double (curandStateSobol32_t *state);
__device__ double curand_normal_double (curandStateSobol32_t *state);
__device__ double curand_log_normal_double (curandStateSobol32_t *state,double mean,double stddev);
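The text does not say where the direction vectors come from. One option, assumed here, is to fetch a precomputed set on the host with curandGetDirectionVectors32() from <curand.h> and copy the 32 values for one dimension to the device. The sketch below uses a single thread and the first dimension only; the kernel name `sobolKernel` and the choice of the CURAND_DIRECTION_VECTORS_32_JOEKUO6 set are illustrative.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand.h>
#include <curand_kernel.h>

// A single thread walks along one Sobol dimension and stores n uniform values.
__global__ void sobolKernel(unsigned int *directionVectors, float *out, int n) {
    curandStateSobol32_t state;
    curand_init(directionVectors, 0, &state);  // no seed; offset 0
    for (int i = 0; i < n; ++i) out[i] = curand_uniform(&state);
}

int main() {
    const int n = 8;
    const int vectorSize = 32;  // 32 unsigned int values per dimension

    // Host-side table of precomputed direction vectors, one array per dimension.
    curandDirectionVectors32_t *hostVectors = nullptr;
    if (curandGetDirectionVectors32(&hostVectors, CURAND_DIRECTION_VECTORS_32_JOEKUO6)
            != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "curandGetDirectionVectors32 failed\n");
        return 1;
    }

    unsigned int *devVectors = nullptr;
    float *devOut = nullptr;
    float hostOut[n];
    if (cudaMalloc((void **)&devVectors, vectorSize * sizeof(unsigned int)) != cudaSuccess ||
        cudaMalloc((void **)&devOut, n * sizeof(float)) != cudaSuccess ||
        // Copy the vectors of dimension 0 to the device.
        cudaMemcpy(devVectors, hostVectors[0], vectorSize * sizeof(unsigned int),
                   cudaMemcpyHostToDevice) != cudaSuccess) {
        fprintf(stderr, "device setup failed\n");
        return 1;
    }

    sobolKernel<<<1, 1>>>(devVectors, devOut, n);

    if (cudaMemcpy(hostOut, devOut, sizeof(hostOut), cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) printf("%f\n", hostOut[i]);

    cudaFree(devVectors);
    cudaFree(devOut);
    return 0;
}
```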
Skip-Ahead
__device__ void skipahead(unsigned long long n, curandState_t *state);
__device__ void skipahead(unsigned int n, curandStateSobol32_t *state);
Using this function is equivalent to calling curand() n times without using the return value, but it is much faster.
__device__ void skipahead_sequence(unsigned long long n, curandState_t *state);
This function is the equivalent of calling curand() $n \cdot 2^{67}$ times without using the return value and is much faster.
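The equivalence stated for skipahead() can be checked directly on the device. In the sketch below, two states are initialized identically; one is advanced by calling curand() 1000 times and the other with a single skipahead() call, and the next value from each is compared. The kernel name `compareKernel`, the seed, and the step count are arbitrary choices for the illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

__global__ void compareKernel(unsigned int *out) {
    curandState_t stepped, skipped;
    curand_init(2024ULL, 0, 0, &stepped);
    curand_init(2024ULL, 0, 0, &skipped);

    // Advance the first state one draw at a time, discarding the results.
    for (int i = 0; i < 1000; ++i) curand(&stepped);
    // Advance the second state by the same distance in one call.
    skipahead(1000ULL, &skipped);

    out[0] = curand(&stepped);
    out[1] = curand(&skipped);
}

int main() {
    unsigned int *devOut = nullptr;
    unsigned int hostOut[2];

    if (cudaMalloc((void **)&devOut, sizeof(hostOut)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    compareKernel<<<1, 1>>>(devOut);

    if (cudaMemcpy(hostOut, devOut, sizeof(hostOut), cudaMemcpyDeviceToHost) != cudaSuccess) {
        fprintf(stderr, "kernel or copy failed\n");
        return 1;
    }
    printf("after 1000 curand() calls: %u\n", hostOut[0]);
    printf("after skipahead(1000):     %u\n", hostOut[1]);
    printf("values match: %s\n", hostOut[0] == hostOut[1] ? "yes" : "no");

    cudaFree(devOut);
    return 0;
}
```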