The Mersenne Twister Algorithm

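The Mersenne Twister is a pseudorandom number generation algorithm widely used in software systems. Its advantages include a very long period and good statistical properties. It also has drawbacks, such as relatively complex initialization and a poor fit for some real-time uses. Several alternatives exist, including more modern random number generators. This article covers the details of the algorithm and a Python implementation, and illustrates how the generator works.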

Getting to know the Mersenne Twister

The Mersenne Twister is a pseudorandom number generator (PRNG). It is by far the most widely used general-purpose PRNG.[1] Its name derives from the fact that its period length is chosen to be a Mersenne prime.

The Mersenne Twister was developed in 1997 by Makoto Matsumoto (松本 眞) and Takuji Nishimura (西村 拓士).[2] It is based on a matrix linear recurrence over a finite binary field and was designed specifically to rectify most of the flaws found in older PRNGs. It was the first PRNG to provide fast generation of high-quality pseudorandom integers.

The most commonly used version of the Mersenne Twister algorithm is based on the Mersenne prime $2^{19937} - 1$. The standard implementation of that, MT19937, uses a 32-bit word length. There is another implementation that uses a 64-bit word length, MT19937-64; it generates a different sequence.

Adoption in software systems

The Mersenne Twister is the default PRNG for the following software systems:

    Microsoft Visual C++,[3] Microsoft Excel,[4] GAUSS,[5] GLib,[6] GNU Multiple Precision Arithmetic Library,[7] GNU Octave,[8] GNU Scientific Library,[9] gretl,[10] IDL,[11] Julia,[12] CMU Common Lisp,[13] Embeddable Common Lisp,[14] Steel Bank Common Lisp,[15] Maple,[16] MATLAB,[17] Free Pascal,[18] PHP,[19] Python,[20][21] R,[22] Ruby,[23] SageMath,[24] Scilab,[25] Stata.[26] It is also available in Apache Commons,[27] in standard C++ (since C++11),[28][29] and in Mathematica.[30] Add-on implementations are provided in many program libraries, including the Boost C++ Libraries,[31] the CUDA Library,[32] and the NAG Numerical Library.[33]

The Mersenne Twister is one of two PRNGs in SPSS: the other generator is kept only for compatibility with older programs, and the Mersenne Twister is stated to be "more reliable".[34] The Mersenne Twister is similarly one of the PRNGs in SAS: the other generators are older and deprecated.[35]

Advantages

The commonly used version of Mersenne Twister, MT19937, which produces a sequence of 32-bit integers, has the following desirable properties:

It has a very long period of $2^{19937} - 1$. While a long period is not a guarantee of quality in a random number generator, short periods (such as the $2^{32}$ common in many older software packages) can be problematic.[36]
It is k-distributed to 32-bit accuracy for every 1 ≤ k ≤ 623 (see the definition below).
It passes numerous tests for statistical randomness, including the Diehard tests.

Disadvantages

The large state space comes with a performance cost: the 2.5 KiB state buffer will place a load on the memory caches. In 2011, Saito & Matsumoto proposed a version of the Mersenne Twister to address this issue. The tiny version, TinyMT, uses just 127 bits of state space.[37]

By today's standards, the Mersenne Twister is somewhat slow,[38] unless the SFMT (SIMD-oriented Fast Mersenne Twister) implementation is used.

It passes most, but not all, of the stringent TestU01 randomness tests.[39] Because it is based on simple linear (xor) operations, it fails tests based on linear complexity after relatively few bits of output, despite its extremely large state. Passing the output through a simple hash function can remedy this weakness.

Multiple Mersenne Twister instances that differ only in seed value (but not other parameters) are not generally appropriate for Monte-Carlo simulations that require independent random number generators, though there exists a method for choosing multiple sets of parameter values.[40][41]

It can take a long time to start generating output that passes randomness tests, if the initial state is highly non-random—particularly if the initial state has many zeros. A consequence of this is that two instances of the generator, started with initial states that are almost the same, will usually output nearly the same sequence for many iterations, before eventually diverging. The 2002 update to the MT algorithm has improved initialization, so that beginning with such a state is very unlikely.[42]

k-distribution

A pseudorandom sequence $x_i$ of w-bit integers of period P is said to be k-distributed to v-bit accuracy if the following holds.

Let $\text{trunc}_v(x)$ denote the number formed by the leading v bits of x, and consider P of the kv-bit vectors

$$(\text{trunc}_v(x_i),\, \text{trunc}_v(x_{i+1}),\, \ldots,\, \text{trunc}_v(x_{i+k-1})) \quad (0 \leq i < P).$$

Then each of the $2^{kv}$ possible combinations of bits occurs the same number of times in a period, except for the all-zero combination that occurs once less often.
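
The definition can be checked by brute force on a toy generator. The sketch below is illustrative only: `lfsr4_states` and `is_k_distributed` are hypothetical helper names, and the 4-bit shift register (period 15) merely stands in for a full-period linear generator, since MT19937 itself is far too large to enumerate.

```python
from collections import Counter


def lfsr4_states(seed=1):
    """One full period of a 4-bit Fibonacci LFSR (toy stand-in, not MT)."""
    states, s = [], seed
    while True:
        states.append(s)
        new_bit = (s ^ (s >> 1)) & 1      # a[n+4] = a[n+1] xor a[n]
        s = (s >> 1) | (new_bit << 3)
        if s == seed:
            return states


def is_k_distributed(seq, w, k, v):
    """Check k-distribution to v-bit accuracy over one period of w-bit values."""
    period = len(seq)
    counts = Counter(
        tuple(seq[(i + j) % period] >> (w - v) for j in range(k))
        for i in range(period)
    )
    zero_count = counts.pop((0,) * k, 0)
    nonzero_counts = set(counts.values())
    return (
        len(counts) == 2 ** (k * v) - 1      # every non-zero pattern appears
        and len(nonzero_counts) == 1         # ... and equally often
        and zero_count == next(iter(nonzero_counts)) - 1
    )


states = lfsr4_states()
print(len(states))
print(is_k_distributed(states, w=4, k=1, v=4))
print(is_k_distributed(states, w=4, k=4, v=1))
print(is_k_distributed(states, w=4, k=2, v=4))
```

For this toy generator the check succeeds for (k=1, v=4) and (k=4, v=1) but fails for (k=2, v=4), because 15 vectors cannot cover all 255 non-zero 8-bit patterns.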

Alternatives

The algorithm in its native form is not cryptographically secure. The reason is that observing a sufficient number of iterations (624 in the case of MT19937, since this is the size of the state vector from which future iterations are produced) allows one to predict all future iterations.
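
The prediction argument can be made concrete with a short sketch. It assumes CPython, whose `random` module is backed by MT19937, that `getrandbits(32)` returns one tempered output word per call, and that the state tuple accepted by `setstate` has the layout `(3, 624 words + index, gauss_next)`. These are implementation details of CPython rather than part of the algorithm, and the helper names `undo_right`, `undo_left` and `untemper` are hypothetical.

```python
import random

MASK32 = 0xFFFFFFFF


def undo_right(y, shift):
    """Invert y = x ^ (x >> shift) on 32-bit words."""
    x = y
    for _ in range(32 // shift + 1):   # each pass fixes `shift` more high bits
        x = y ^ (x >> shift)
    return x


def undo_left(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) on 32-bit words."""
    x = y
    for _ in range(32 // shift + 1):   # each pass fixes `shift` more low bits
        x = y ^ ((x << shift) & mask)
    return x & MASK32


def untemper(z):
    """Recover the raw state word from one MT19937 output."""
    y = undo_right(z, 18)
    y = undo_left(y, 15, 0xEFC60000)
    y = undo_left(y, 7, 0x9D2C5680)
    return undo_right(y, 11)


victim = random.Random(2024)
observed = [victim.getrandbits(32) for _ in range(624)]

# Assumption: CPython's state tuple is (3, 624 words + index, gauss_next).
clone = random.Random()
clone.setstate((3, tuple(untemper(z) for z in observed) + (624,), None))

predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predicted == actual)
```

Because every tempering step is invertible, 624 consecutive outputs are enough to rebuild the whole state, after which the clone and the original generator stay in lockstep.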

A pair of cryptographic stream ciphers based on output from the Mersenne Twister has been proposed by Matsumoto, Nishimura, and co-authors. The authors claim speeds 1.5 to 2 times faster than Advanced Encryption Standard in counter mode.[43]

An alternative generator, WELL ("Well Equidistributed Long-period Linear"), offers quicker recovery, equal randomness, and nearly equal speed.[44] Marsaglia's xorshift generators and variants are the fastest in this class.[45]

Algorithmic detail

Figure: Visualisation of generation of pseudo-random 32-bit integers using a Mersenne Twister. The ‘Extract number’ section shows an example where integer 0 has already been output and the index is at integer 1. ‘Generate numbers’ is run when all integers have been output.

For a w-bit word length, the Mersenne Twister generates integers in the range $[0, 2^w - 1]$.

The Mersenne Twister algorithm is based on a matrix linear recurrence over a finite binary field $F_2$. The algorithm is a twisted generalised feedback shift register[46] (twisted GFSR, or TGFSR) of rational normal form (TGFSR(R)), with state bit reflection and tempering. The basic idea is to define a series $x_i$ through a simple recurrence relation, and then output numbers of the form $x_i T$, where $T$ is an invertible $F_2$ matrix called a tempering matrix.

The general algorithm is characterized by the following quantities (some of these explanations make sense only after reading the rest of the algorithm):

w: word size (in number of bits)
n: degree of recurrence
m: middle word, an offset used in the recurrence relation defining the series x, 1 ≤ m < n
r: separation point of one word, or the number of bits of the lower bitmask, 0 ≤ r ≤ w - 1
a: coefficients of the rational normal form twist matrix
b, c: TGFSR(R) tempering bitmasks
s, t: TGFSR(R) tempering bit shifts
u, d, l: additional Mersenne Twister tempering bit shifts/masks

with the restriction that $2^{nw-r} - 1$ is a Mersenne prime. This choice simplifies the primitivity test and k-distribution test that are needed in the parameter search.

The series x is defined as a series of w-bit quantities with the recurrence relation:

$$x_{k+n} := x_{k+m} \oplus \left( ({x_k}^u \mid\mid {x_{k+1}}^l) A \right) \qquad k = 0, 1, \ldots$$

where $\mid\mid$ denotes concatenation of bit vectors (with upper bits on the left), $\oplus$ the bitwise exclusive or (XOR), ${x_k}^u$ means the upper $w-r$ bits of $x_k$, and ${x_{k+1}}^l$ means the lower $r$ bits of $x_{k+1}$. The twist transformation A is defined in rational normal form as:

$$A = \begin{pmatrix} 0 & I_{w-1} \\ a_{w-1} & (a_{w-2}, \ldots, a_0) \end{pmatrix}$$

with $I_{w-1}$ as the $(w-1) \times (w-1)$ identity matrix. The rational normal form has the benefit that multiplication by A can be efficiently expressed as follows (matrix multiplication here is done in $F_2$, so bitwise XOR takes the place of addition):

$$\boldsymbol{x}A = \begin{cases} \boldsymbol{x} \gg 1 & x_0 = 0 \\ (\boldsymbol{x} \gg 1) \oplus \boldsymbol{a} & x_0 = 1 \end{cases}$$

where $x_0$ is the lowest order bit of $x$.
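
As a sanity check on this shortcut, the following sketch compares it against an explicit row-vector-times-matrix product over $F_2$. The function names are hypothetical, and the bit layout (bit w−1 as the leftmost vector entry) is an assumption chosen to match the "upper bits on the left" convention used above.

```python
def times_a(x, a):
    """Shift-and-xor shortcut for the product xA over F2."""
    shifted = x >> 1
    return shifted ^ a if x & 1 else shifted


def times_a_explicit(x, a, w):
    """Row vector times A, built row by row; bit w-1 is the leftmost column."""
    # Rows 0..w-2 form (0 | I_{w-1}): row p has its single 1 in column p+1.
    rows = [1 << (w - 2 - p) for p in range(w - 1)]
    rows.append(a)                       # bottom row: (a_{w-1}, ..., a_0)
    result = 0
    for p, row in enumerate(rows):
        if (x >> (w - 1 - p)) & 1:       # entry p of x, counted from the left
            result ^= row                # addition in F2 is xor
    return result


A_MT19937 = 0x9908B0DF
samples = [0, 1, 2, 0x80000000, 0x12345678, 0xDEADBEEF, 0xFFFFFFFF]
print(all(times_a(x, A_MT19937) == times_a_explicit(x, A_MT19937, 32)
          for x in samples))
```

The explicit version costs w conditional xors per word, while the shortcut needs one shift and at most one xor, which is why implementations never build A.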

As with TGFSR(R), the Mersenne Twister is cascaded with a tempering transform to compensate for the reduced dimensionality of equidistribution (because of the choice of A being in the rational normal form). Note that this is equivalent to using the matrix $T^{-1}AT$ in place of $A$, for $T$ an invertible matrix, and therefore the analysis of the characteristic polynomial mentioned below still holds.

As with A, we choose a tempering transform to be easily computable, and so do not actually construct T itself. The tempering is defined in the case of Mersenne Twister as

y := x ⊕ ((x >> u) & d)
y := y ⊕ ((y << s) & b)
y := y ⊕ ((y << t) & c)
z := y ⊕ (y >> l)

where x is the next value from the series, y a temporary intermediate value, z the value returned from the algorithm, with <<, >> as the bitwise left and right shifts, and & as the bitwise and. The first and last transforms are added in order to improve lower-bit equidistribution. From the property of TGFSR, $s + t \geq \lfloor w/2 \rfloor - 1$ is required to reach the upper bound of equidistribution for the upper bits.
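
The four tempering steps translate almost directly into code. This is a minimal sketch: `temper` is a hypothetical function name, and the default arguments are the MT19937 constants, so other parameter sets would need to be passed explicitly.

```python
def temper(x, u=11, d=0xFFFFFFFF, s=7, b=0x9D2C5680,
           t=15, c=0xEFC60000, l=18, w=32):
    """Apply the four tempering steps; defaults are the MT19937 constants."""
    word_mask = (1 << w) - 1
    y = x ^ ((x >> u) & d)
    y ^= (y << s) & b
    y ^= (y << t) & c
    z = y ^ (y >> l)
    return z & word_mask


print(hex(temper(1)))

# Every step is a shift, mask, or xor, so tempering is linear over F2:
p, q = 0x12345678, 0x9ABCDEF0
print(temper(p ^ q) == temper(p) ^ temper(q))
```

The linearity check illustrates why tempering improves equidistribution without adding cryptographic strength: the output is still a linear function of the state.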

The coefficients for MT19937 are:

(w, n, m, r) = (32, 624, 397, 31)
a = 0x9908B0DF
(u, d) = (11, 0xFFFFFFFF)
(s, b) = (7, 0x9D2C5680)
(t, c) = (15, 0xEFC60000)
l = 18

Note that 32-bit implementations of the Mersenne Twister generally have d = 0xFFFFFFFF. As a result, the d is occasionally omitted from the algorithm description, since the bitwise and with d in that case has no effect.

The coefficients for MT19937-64 are:[47]

(w, n, m, r) = (64, 312, 156, 31)
a = 0xB5026F5AA96619E9
(u, d) = (29, 0x5555555555555555)
(s, b) = (17, 0x71D67FFFEDA60000)
(t, c) = (37, 0xFFF7EEE000000000)
l = 43
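
One way to keep the two parameter sets straight is to hold them in a small record and derive the quantities discussed above from them. The `MTParams` class and its property names are hypothetical, introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MTParams:
    w: int
    n: int
    m: int
    r: int
    a: int
    u: int
    d: int
    s: int
    b: int
    t: int
    c: int
    l: int

    @property
    def period_exponent(self):
        """Exponent p of the period 2**p - 1, i.e. n*w - r."""
        return self.n * self.w - self.r

    @property
    def state_bytes(self):
        """Size of the n-word state array in bytes."""
        return self.n * self.w // 8


MT19937 = MTParams(32, 624, 397, 31, 0x9908B0DF,
                   11, 0xFFFFFFFF, 7, 0x9D2C5680, 15, 0xEFC60000, 18)
MT19937_64 = MTParams(64, 312, 156, 31, 0xB5026F5AA96619E9,
                      29, 0x5555555555555555, 17, 0x71D67FFFEDA60000,
                      37, 0xFFF7EEE000000000, 43)

for name, p in (("MT19937", MT19937), ("MT19937-64", MT19937_64)):
    # name, period exponent, state size, and the s + t >= floor(w/2) - 1 check
    print(name, p.period_exponent, p.state_bytes, p.s + p.t >= p.w // 2 - 1)
```

Both variants share the exponent 19937 and the same state size of 2496 bytes, which is where the roughly 2.5 KiB buffer comes from.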

Initialization

As should be apparent from the above description, the state needed for a Mersenne Twister implementation is an array of n values of w bits each. To initialize the array, a w-bit seed value is used to supply $x_0$ through $x_{n-1}$ by setting $x_0$ to the seed value and thereafter setting

$$x_i = f \times (x_{i-1} \oplus (x_{i-1} \gg (w-2))) + i$$

for i from 1 to n−1, keeping only the lowest w bits of each result. The first value the algorithm then generates is based on $x_n$, not on $x_0$. The constant f forms another parameter to the generator, though not part of the algorithm proper. The value for f for MT19937 is 1812433253 and for MT19937-64 is 6364136223846793005.[48]
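
The seeding recurrence can be sketched in a few lines. `seed_state` is a hypothetical helper; its defaults are the MT19937 values, and the 64-bit call passes the MT19937-64 values explicitly. The seed 5489 is used only because it is the default in the reference C code.

```python
def seed_state(seed, w=32, n=624, f=1812433253):
    """Fill the n-word state array from a single w-bit seed."""
    word_mask = (1 << w) - 1
    state = [seed & word_mask]
    for i in range(1, n):
        prev = state[i - 1]
        # Keep only the lowest w bits of f * (prev xor (prev >> (w-2))) + i
        state.append((f * (prev ^ (prev >> (w - 2))) + i) & word_mask)
    return state


state32 = seed_state(5489)
state64 = seed_state(5489, w=64, n=312, f=6364136223846793005)
print(len(state32), state32[:2])
print(len(state64), all(x < 2 ** 64 for x in state64))
```

Python integers are unbounded, so the explicit mask is what emulates w-bit overflow; in C the same effect comes from using a fixed-width unsigned type.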

Comparison with classical GFSR

In order to achieve the $2^{nw-r} - 1$ theoretical upper limit of the period in a TGFSR, $\varphi_B(t)$ must be a primitive polynomial, $\varphi_B(t)$ being the characteristic polynomial of

$$B = \begin{pmatrix} 0 & I_w & \cdots & 0 & 0 \\ \vdots & & & & \\ I_w & \vdots & \ddots & \vdots & \vdots \\ \vdots & & & & \\ 0 & 0 & \cdots & I_w & 0 \\ 0 & 0 & \cdots & 0 & I_{w-r} \\ S & 0 & \cdots & 0 & 0 \end{pmatrix} \begin{matrix} \\ \\ \leftarrow m\text{-th row} \\ \\ \\ \\ \end{matrix}$$

$$S = \begin{pmatrix} 0 & I_r \\ I_{w-r} & 0 \end{pmatrix} A$$

The twist transformation improves the classical GFSR with the following key properties:

The period reaches the theoretical upper limit $2^{nw-r} - 1$ (except if initialized with 0)
Equidistribution in n dimensions (e.g. linear congruential generators can at best manage reasonable distribution in five dimensions)

Pseudocode

The following piece of pseudocode implements the general Mersenne Twister algorithm. The constants w, n, m, r, a, u, d, s, b, t, c, l, and f are as in the algorithm description above. It is assumed that int represents a type sufficient to hold values with w bits:
 // Create a length n array to store the state of the generator
 int[0..n-1] MT
 int index := n+1
 const int lower_mask = (1 << r) - 1 // That is, the binary number of r 1's
 const int upper_mask = lowest w bits of (not lower_mask)

 // Initialize the generator from a seed
 function seed_mt(int seed) {
     index := n
     MT[0] := seed
     for i from 1 to (n - 1) { // loop over each element
         MT[i] := lowest w bits of (f * (MT[i-1] xor (MT[i-1] >> (w-2))) + i)
     }
 }

 // Extract a tempered value based on MT[index]
 // calling twist() every n numbers
 function extract_number() {
     if index >= n {
         if index > n {
           error "Generator was never seeded"
           // Alternatively, seed with constant value; 5489 is used in reference C code[49]
         }
         twist()
     }

     int y := MT[index]
     y := y xor ((y >> u) and d)
     y := y xor ((y << s) and b)
     y := y xor ((y << t) and c)
     y := y xor (y >> l)

     index := index + 1
     return lowest w bits of (y)
 }

 // Generate the next n values from the series x_i 
 function twist() {
     for i from 0 to (n-1) {
         int x := (MT[i] and upper_mask)
                   + (MT[(i+1) mod n] and lower_mask)
         int xA := x >> 1
         if (x mod 2) != 0 { // lowest bit of x is 1
             xA := xA xor a
         }
         MT[i] := MT[(i + m) mod n] xor xA
     }
     index := 0
 }

Python implementation

This Python implementation hard-codes the constants for MT19937:
def _int32(x):
    # Get the 32 least significant bits.
    return int(0xFFFFFFFF & x)

class MT19937:

    def __init__(self, seed):
        # Start with index == 624 so the first extraction triggers a twist
        self.index = 624
        self.mt = [0] * 624
        self.mt[0] = _int32(seed)  # Initialize the initial state to the seed
        for i in range(1, 624):
            self.mt[i] = _int32(
                1812433253 * (self.mt[i - 1] ^ (self.mt[i - 1] >> 30)) + i)

    def extract_number(self):
        if self.index >= 624:
            self.twist()

        y = self.mt[self.index]

        # XOR y with itself shifted right by 11 bits
        y = y ^ (y >> 11)
        # XOR y with (y shifted left by 7 bits) masked by 2636928640 (0x9D2C5680)
        y = y ^ ((y << 7) & 2636928640)
        # XOR y with (y shifted left by 15 bits) masked by 4022730752 (0xEFC60000)
        y = y ^ ((y << 15) & 4022730752)
        # XOR y with itself shifted right by 18 bits
        y = y ^ (y >> 18)

        self.index = self.index + 1

        return _int32(y)

    def twist(self):
        for i in range(624):
            # Combine the most significant bit of this word with the
            # 31 least significant bits of the next word
            y = _int32((self.mt[i] & 0x80000000) +
                       (self.mt[(i + 1) % 624] & 0x7fffffff))
            self.mt[i] = self.mt[(i + 397) % 624] ^ (y >> 1)

            if y % 2 != 0:
                self.mt[i] = self.mt[i] ^ 0x9908b0df
        self.index = 0
Then MT19937(seed).extract_number() returns the first random number of the sequence, where seed is the initial seed; repeated calls on the same instance return the subsequent numbers.
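
One way to gain confidence in an implementation of this kind is to compare it with published reference values. The sketch below is a compact, self-contained restatement of the same steps as a single hypothetical function, `mt19937_stream`, written only so the check can run on its own. It relies on two reference points: with the default seed 5489, the first output is 3499211612, and the C++11 standard requires the 10000th output of a default-constructed `std::mt19937` to be 4123659995.

```python
def mt19937_stream(seed, count):
    """Return the first `count` 32-bit outputs of MT19937 for an integer seed."""
    n, m = 624, 397
    mt = [seed & 0xFFFFFFFF]
    for i in range(1, n):
        prev = mt[-1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & 0xFFFFFFFF)

    out, index = [], n
    while len(out) < count:
        if index >= n:
            # Twist: regenerate all n state words in place
            for i in range(n):
                x = (mt[i] & 0x80000000) | (mt[(i + 1) % n] & 0x7FFFFFFF)
                mt[i] = mt[(i + m) % n] ^ (x >> 1) ^ (0x9908B0DF if x & 1 else 0)
            index = 0
        y = mt[index]
        index += 1
        # Tempering
        y ^= y >> 11
        y ^= (y << 7) & 0x9D2C5680
        y ^= (y << 15) & 0xEFC60000
        y ^= y >> 18
        out.append(y & 0xFFFFFFFF)
    return out


outputs = mt19937_stream(5489, 10000)
print(outputs[0], outputs[-1])
```

Matching both values exercises seeding, tempering, and sixteen full twists, so an off-by-one in the recurrence or a wrong mask would show up here.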
The Mersenne Twister is a widely used pseudorandom number generation algorithm with good statistical randomness and a very long period. In outline, it works as follows:

1. Seed: the algorithm needs a seed value to start the sequence. The seed can be any integer; the current timestamp is a common choice.
2. State initialization: the seed is expanded into an internal state array, which for MT19937 holds 624 32-bit integers.
3. Refilling the state array: the initial array is only a simple transformation of the seed. The "twist" operation combines elements of the current state array using bit operations and XOR to produce new state values.
4. Generating numbers: each time a random number is needed, the next element of the state array is taken and passed through a series of transformations (tempering) to produce the output.
5. Repeating: once all elements have been used, the state array is twisted again and extraction continues.

The key to the algorithm is the maintenance of the internal state array and the twist operation. In Python, the Mersenne Twister is the default generator of the random module, so the module's functions can be used directly. For example:

```python
import random

random_number = random.random()  # random float in the range [0.0, 1.0)
print(random_number)
```

Besides random(), the random module provides other functions such as randint() (an integer in a given range) and uniform() (a float in a given range).