float, small and random

Latest recommended article published on 2024-06-23 22:18:44

laschweinski

Tags: random, float, generator, function, integer, performance

This post is about random number generation and some ways to optimize it. Reposted from:

http://www.iquilezles.org/www/articles/sfrand/sfrand.htm

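Random numbers are used very frequently in games. When they are used in large quantities, their cost can become a bottleneck for the game's performance. Based on how floating-point numbers are represented in the computer, this article uses a shift and an OR to squeeze an integer into the desired range, which makes random number generation faster.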

While creating the 195/95/256 64-kilobyte demo, my colleagues and I learned something very important: never underestimate how slow an int-to-float or float-to-int conversion can be. The problem came up while debugging the software synthesizer made by Marc (Gortu). After several tests and profiling runs, we found something really astonishing: one of the bottlenecks of the synth was the noise generator. It basically created an integer pseudo-random number (using the same generator as VC), cast it to float, and scaled it down to the ±1.0 range. Here is the original code:

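float sfrand( int *seed )
{
    seed[0] = 0x00269ec3 + seed[0]*0x000343fd;      // congruential step, same generator as VC
    int a = (seed[0] >> 16) & 32767;                // 15 random bits: 0..32767
    return( -1.0f + (2.0f/32767.0f)*(float)a );     // int-to-float cast, scaled to [-1,1]
}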

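This passage briefly describes the traditional (that is, the system's) way of producing random numbers. These are always pseudo-random: once the seed is set, every subsequent random number is in fact already determined. The function scales the random number into the range [-1, 1].

For details, see http://www.cs.utsa.edu/~wagner/laws/rng.html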

First of all, note that the function takes a seed as a parameter. As you probably know, the regular ANSI C rand() function doesn't take any argument, which really sucks, because you cannot use it in a multithreaded environment, such as a raytracer or a demo where both the sound synthesizer and the renderer need random numbers. So I usually pass the seed as a parameter, which you can see as a "context" for the function: you should keep one for each thread, probably on the stack.

Anyway, the code already looks quite simple. However, as I said, the cast from int to float was simply killing performance. It is well known that the fild instruction is quite slow, but we never expected it to be a bottleneck! That is why I started to experiment with a random number generator able to directly produce random floats in the range -1 to 1. The idea was really simple and worked quite well. The first step was to create an integer with 32 random bits (because a float is also made of 32 bits). Since the original VC algorithm only gives 15 random bits, I did a quick search on the net and found that

    seed[0] *= 16807;

was not only doing the job a lot better than

    seed[0] = 0x00269ec3 + seed[0]*0x000343fd;

but was also faster. I just had to take care not to initialize the seed ("mirand") to 0, since zero multiplied by anything stays zero. In fact, the (16807,0) pair of values for the congruential random generator creates 32 random bits. So the only thing left to do was to interpret those 32 bits as a float value, and I would get a random float with no conversion at all!
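The seed-as-context idea can be sketched as follows. This is only an illustration, not the demo's code: the names `lcg_next` and `count_matches` are hypothetical, and `uint32_t` is assumed for the state so that the multiplication wraps modulo 2^32 with well-defined behavior in C.

```c
#include <stdint.h>

/* Hypothetical helper: one step of the multiplicative generator from the
   text, state = state * 16807 (mod 2^32). An unsigned type is assumed so
   that the wrap-around is well defined. */
static uint32_t lcg_next(uint32_t *state)
{
    *state *= 16807u;
    return *state;
}

/* Each caller (thread, synth, renderer) owns its own state, so nothing is
   shared. Returns how many of the first n outputs of two independent
   contexts coincide. */
static int count_matches(uint32_t seed_a, uint32_t seed_b, int n)
{
    uint32_t a = seed_a, b = seed_b;
    int matches = 0;
    for (int i = 0; i < n; i++)
        if (lcg_next(&a) == lcg_next(&b))
            matches++;
    return matches;
}
```

Two contexts started from the same seed replay the same sequence, while a state of 0 never leaves 0, which is why the seed must be nonzero.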

But a small issue remained: how to make that float fall in the correct range? I tried tweaking the appropriate bits in the exponent of my float (according to the IEEE 754 standard, which is used in PCs and Macintosh machines; have a look at http://en.wikipedia.org/wiki/IEEE_floating-point_standard), so I carefully selected the bits to be modified. Below is the layout of the bits in the floating-point format:

   33        22                          0

   10        32                          0

   seee eeee efff ffff ffff ffff ffff ffff

where

s = sign bit
e = exponent
f = fractional part of the mantissa

value = (-1)^s * 2^(e-127) * m, where m = 1.f, and thus 1<=m<2
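To make the layout concrete, here is a small sketch that splits a float into the three fields described above. The struct `float_fields` and the function `split_float` are hypothetical names for this example, and the code assumes that `float` is a 32-bit IEEE 754 single-precision value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical struct holding the three fields of a 32-bit float. */
typedef struct {
    uint32_t sign;      /* 1 bit:   bit 31 */
    uint32_t exponent;  /* 8 bits:  bits 30..23, biased by 127 */
    uint32_t fraction;  /* 23 bits: bits 22..0 */
} float_fields;

/* Assumes float is a 32-bit IEEE 754 value. */
static float_fields split_float(float x)
{
    uint32_t bits;
    float_fields f;
    memcpy(&bits, &x, sizeof bits);     /* copy the raw bits of the float */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xffu;
    f.fraction = bits & 0x007fffffu;
    return f;
}
```

For instance, 1.0f has exponent field 127 and a zero fraction, while -3.0f (that is, -1.5 * 2^1) has the sign bit set, exponent field 128, and only the top fraction bit set.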

The main idea is to realize that we already have a random fractional part, which means we have a random mantissa between 1 and 2. We could just fix the exponent to 127 and the sign bit to zero, and that way we would get a random floating-point number between 1 and 2. We could afterwards scale it (by 2.0) and offset it (by -3.0) to make it fit in the interval [-1,1). But I realized it can be done a bit better, avoiding the scaling, by directly generating a random float between 2 and 4. For that, the exponent must be forced to 128, so that the output value is

value = (-1)^0 * 2^(128-127) * m = 2 * m

which belongs to the range [2,4). So the first operation to apply to the 32 random bits is to mask out the sign and exponent bits with

   seee eeee efff ffff ffff ffff ffff ffff

   0000 0000 0111 1111 1111 1111 1111 1111

or 0x007fffff in hexadecimal. Then the exponent is set to 128 (10000000 in binary) by OR-ing in the following bit pattern


   seee eeee efff ffff ffff ffff ffff ffff

   0100 0000 0000 0000 0000 0000 0000 0000

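or 0x40000000 in hexadecimal. The code below clears the top nine bits with a right shift by 9 instead of an explicit mask, so the 23 fraction bits come from the upper bits of the seed. So, finally, the complete signed floating-point random generator looks like: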

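float sfrand( int *seed )
{
    float res;
    seed[0] *= 16807;       // congruential step; the seed must never be 0
    // upper 23 bits of the seed become the fraction, exponent forced to 128: [2,4)
    *((unsigned int *) &res) = ( ((unsigned int)seed[0])>>9 ) | 0x40000000;
    return( res-3.0f );     // move [2,4) down to [-1,1)
}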

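This way the integer is turned directly into a floating-point number using only a shift and an OR, so it is certainly much faster than before. The real key, though, is understanding how floating-point numbers are represented in the computer.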
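The effect of the shift and the OR can be checked on the extreme inputs. The sketch below is an illustration under stated assumptions: the helper names `bits_to_float` and `random_bits_to_2_4` are hypothetical, `memcpy` is used here in place of the pointer cast, and `float` is assumed to be a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret 32 raw bits as a float
   (assumes a 32-bit IEEE 754 float). */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Keep the upper 23 bits of the input as the fraction and force the
   exponent field to 128, as the text describes; the result is in [2,4). */
static float random_bits_to_2_4(uint32_t random_bits)
{
    return bits_to_float((random_bits >> 9) | 0x40000000u);
}
```

With all input bits clear the result is exactly 2.0, with only the top bit set it is 3.0, and with all bits set it is the largest float below 4.0, so subtracting 3.0 lands in [-1,1).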

Some compilers might complain about creating an unsigned int pointer that points to a location reserved for a float. If that's the case, you can work around it by using a union (thanks to Reinder Nijhoff for the suggestion). The code for a function that returns a random number between 0 and 1 would then look like this:

float frand( int *seed )
{
    union
    {
        float fres;
        unsigned int ires;
    } u;

    seed[0] *= 16807;
    // exponent forced to 127: value in [1,2)
    u.ires = ( ((unsigned int)seed[0])>>9 ) | 0x3f800000;
    return u.fres - 1.0f;
}

These versions could hardly be simpler or faster, which makes them perfect for a 4k or 64k intro! (Well, some more tricks can be applied when translating this to assembly, of course.) I took some measurements to test performance, and I got 4 times the performance of the old generator. Regarding quality, this generator beats taking a normal integer random value from rand() and then scaling it, as the original function shown here does. Remember that this one generates 23 random bits instead of 15. I also checked the density distribution, and in fact I found it to be perfectly uniform, and quite a bit better than the old version. So, what else can we ask of our improved random number generator?
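A range and distribution check like the one mentioned above could be sketched as follows. This is not the author's test: the names `frand_u32`, `histogram` and `BUCKETS` are hypothetical, the state is assumed to be a `uint32_t`, and `memcpy` replaces the union. It also assumes a 32-bit IEEE 754 `float`.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the [0,1) generator with an unsigned state and memcpy in place
   of the pointer cast or union; both choices are assumptions of this
   example. */
static float frand_u32(uint32_t *state)
{
    uint32_t bits;
    float f;
    *state *= 16807u;
    bits = (*state >> 9) | 0x3f800000u;   /* exponent 127: value in [1,2) */
    memcpy(&f, &bits, sizeof f);
    return f - 1.0f;
}

#define BUCKETS 10

/* Fills counts[] with a histogram of n samples and returns the number of
   samples that fell outside [0,1). */
static int histogram(uint32_t seed, int n, int counts[BUCKETS])
{
    uint32_t state = seed;
    int out_of_range = 0;
    for (int i = 0; i < BUCKETS; i++)
        counts[i] = 0;
    for (int i = 0; i < n; i++) {
        float x = frand_u32(&state);
        if (x < 0.0f || x >= 1.0f) {
            out_of_range++;
            continue;
        }
        counts[(int)(x * BUCKETS)]++;
    }
    return out_of_range;
}
```

Calling `histogram(1u, 100000, counts)` and printing `counts` gives a coarse picture of the density; how flat it looks depends on the seed and the sample size.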