The Mersenne Twister
The Mersenne Twister is a 623-dimensionally equidistributed uniform pseudorandom number generator. This is how it works.
The paper first defines a $k$-distribution as a very reasonable definition of randomness. Read the paper first.
The Linear Recurrence
Equation 2.1 in M&N 1998 is the linear recurrence $$ \mathbf{x}_{k + n} := \mathbf{x}_{k + m} \oplus (\mathbf{x}_k^u | \mathbf{x}^l_{k + 1}) A $$ ($k = 0, 1, \ldots$). The paper says the following.
We have several constants: an integer $n$, which is the degree of the recurrence, an integer $r$ (hidden in the definition of $\mathbf{x}_k^u$), $0 \leq r \leq w - 1$, an integer $m$, $1 \leq m \leq n$, and a constant $w \times w$ matrix $A$ with entries in $\mathbb{F}_2$.
From the paper, $w$ is the “word length”, so $\mathbf{x}_i \, \in \, \mathbb{F}_2^w$ are “word vectors” of length $w$.
On the right, we have this $\mathbf{x}_k^u | \mathbf{x}^l_{k + 1}$ term. This is the concatonation of $\mathbf{x}_k^u$ with $\mathbf{x}^l_{k + 1}$. $\mathbf{x}_k^u$ is the “upper $w - r$ bits” of $\mathbf{x}_k$ and $\mathbf{x}^l_{k + 1}$ is the “lower $r$ bits” of $\mathbf{x}_{k + 1}$, i.e. denoting $$ \mathbf{x}_k = (x_{w - 1}, \ldots, x_{r}, x_{r - 1}, \ldots, x_0), $$ then $$ \mathbf{x}_k^u = (x_{w - 1}, \ldots, x_{r}), \qquad \mathbf{x}_k^l = (x_{r - 1}, \ldots, x_0). $$
Given $n$ initial seeds $\mathbf{x}_0, \ldots, \mathbf{x}_{n - 1}$, we can generate $$ \mathbf{x}_{n} := \mathbf{x}_{m} \oplus (\mathbf{x}_0^u | \mathbf{x}^l_{1}) A $$ (where $k = 0$). The only constants here are $n$, $m$, $r$ and $A$.
$A$ is the sparse matrix $$ A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & & & 1 \\ a_{w - 1} & a_{w - 2} & a_{w - 3} & \cdots & a_0 \end{pmatrix} $$ where we denote the bottom row as $\mathbf{a} = (a_{w - 1}, \ldots, a_0)$.
For $\mathbf{x} = (x_{w - 1}, \ldots, x_0)$, $$ \mathbf{x}A = (x_0 a_{w - 1}, x_{w - 1} + x_0 a_{w - 2}, \ldots, x_1 + x_0 a_0). $$
Notice that if $x_0 = 0$, this simplifies to $$ \mathbf{x}A = (0, x_{w - 1}, \ldots, x_1) $$ which amounts to a right bitshift, i.e. $\mathbf{x}A = \mathbf{x} \gg 1$. If $x_0 = 1$ instead, then $$ \mathbf{x}A = (a_{w - 1}, x_{w - 1} + a_{w - 2}, \ldots, x_1 + a_0) $$ and this reduces to an addition of $\mathbf{a}$, i.e. $\mathbf{x}A = (\mathbf{x} \gg 1) \oplus \mathbf{a}$. (Recall that this is an xor, since we are working in $\mathbb{F}_2$.)
For MT19937, the parameters are $w = 32$, $n = 624$, $m = 397$, $r = 31$ and $\mathbf{a} = \mathtt{0x9908B0DF}$ (M&N 1998, Table II).
Initialisation
There is an undocumented parameter $f$ which is used to initialise the internal state as $$ \mathbf{x}_i = f \cdot (\mathbf{x}_{i - 1} \oplus (\mathbf{x}_{i - 1} \gg (w - 2))) + i. $$ This is given the value $f = 1812433253$.
Tempering
The linear operator $T$ is a $w \times w$ invertible matrix. We define “tempering” to be the operation $$ \mathbf{x} \mapsto \mathbf{z} = \mathbf{x} T. $$ $T$ consists of the following bitwise operations. $$\begin{aligned} \mathbf{y} &:= \mathbf{x} \oplus (\mathbf{x} \gg u), \\ \mathbf{y} &:= \mathbf{y} \oplus \bigl((\mathbf{y} \ll s) \otimes \mathbf{b}\bigr), \\ \mathbf{y} &:= \mathbf{y} \oplus \bigl((\mathbf{y} \ll t) \otimes \mathbf{c}\bigr), \\ \mathbf{z} &:= \mathbf{y} \oplus (\mathbf{y} \gg l) \end{aligned}$$ (equations 2.2 to 2.5). These parameters are $u = 11$, $s = 7$, $\mathbf{b} = \mathtt{0x9D2C5680}$, $t = 15$, $\mathbf{c} = \mathtt{0xEFC60000}$ and $l = 18$ (M&N 1998, Table II).
An Implementation
This is the Mersenne Twister in Python. The linear recurrence occurs in _twist
and the tempering in temper
.
#!/usr/bin/env python3
"""The Mersenne Twister"""
class MT19937:
u = 11
s, b = 7, 0x9d2c5680
t, c = 15, 0xefc60000
l = 18
def __init__(self, seed: int) -> None:
self.index = 624
self.state = [0] * 624
self.state[0] = seed & 0xffffffff
for i in range(1, 624):
self.state[i] = 0x6c078965 \
* (self.state[i - 1] ^ self.state[i - 1] >> 30) \
+ i & 0xffffffff
@staticmethod
def temper(y: int) -> int:
y ^= y >> MT19937.u
y ^= y << MT19937.s & MT19937.b
y ^= y << MT19937.t & MT19937.c
y ^= y >> MT19937.l
return y
def random(self) -> int:
if self.index >= 624:
self._twist()
y = self.temper(self.state[self.index])
self.index += 1
return y
def _twist(self) -> None:
for i in range(624):
y = (self.state[i] & 0x80000000) \
+ (self.state[(i + 1) % 624] & 0x7fffffff)
self.state[i] = self.state[(i + 397) % 624] ^ y >> 1
if y % 2 != 0:
self.state[i] ^= 0x9908b0df
self.index = 0
def main() -> None:
x = MT19937(5489) # Seed from mt19937ar.c
for _ in range(1000):
print(x.random())
if __name__ == "__main__":
main()
Untempering
The Mersenne Twister is not cryptographically secure since the transformation $T$ is bijective. Due to the existence of $T^{-1}$, we can create an “untempering” function. If we can record $n$ consecutive outputs of the Mersenne Twister and untemper them, this will give us the internal state of the random number generator and allow us to predict all future values.
#!/usr/bin/env python3
"""Clone an MT19937 RNG from its output"""
from mt import MT19937
def untemper(y: int) -> int:
y ^= y >> MT19937.l
y ^= y << MT19937.t & MT19937.c
for _ in range(7):
y ^= y << MT19937.s & MT19937.b
for _ in range(3):
y ^= y >> MT19937.u
return y
def clone_mt(mt: MT19937) -> MT19937:
mt_clone = MT19937(0)
for i in range(624):
mt_clone.state[i] = untemper(mt.random())
return mt_clone
def main() -> None:
mt = MT19937(5489)
mt_clone = clone_mt(mt)
for _ in range(2000):
assert mt_clone.random() == mt.random()
print(mt_clone)
if __name__ == "__main__":
main()
See m21.py
and m23.py
on my GitHub account.