# Too Big to Fail

For many PRNGs, the more state bits you give them, the deeper statistical tests need to go to discover their flaws. We'll explore what this phenomenon means, looking at one of the earliest PRNGs ever made, John von Neumann's “Middle Square” method.

### Basic Middle Square

John von Neumann first described his *Middle Square* method in 1949. The idea is this, take a number with *n*
digits, square it to get a number with *2n* digits, and then take the
middle *n* as your next value. Repeat as necessary.

Code for a 64-bit implementation is shown below:

```
uint64_t middle_square() {
static uint64_t state = 0x58c7ca4eb6f0e618;
uint128_t squared = state;
squared *= squared;
state = squared >> 32;
return state;
}
```

If you test this with TestU01's Crush test, we see good results:

```
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: Middle Square 64-32
Number of statistics: 144
Total CPU time: 00:50:47.42
All tests were passed
```

Yet this generator is universally recognized as terrible, and it does badly fail Big Crush. And sometimes, in fact much of the time, when we pick other seeds, it doesn't do nearly so well with Crush either. What is going on?

### Exploring Tiny Middle Square

To understand a PRNG more deeply, it's often useful to look at a
smaller version (few bits, fewer states). Thus, instead of 64-bit
Middle Square, let's look at *6-bit* Middle Square and draw out a graph
of its state space. We'll use a node for every state, and an edge for
transitions from one state to the next.

From this graph, you can see that it has three cycles, it will either
get stuck continuously repeating `0`

, `8`

, or `27`

. For some starting
points we'll head into a pathological cycle quickly, for others it'll
take a bit of time.

It's a similar story if we go up to eight bits:

Once again, we eventually end up in pathological short cycles.

The zero short cycle is, in hindsight, fairly obvious, but it is unfortunate that many other numbers map to themselves.

### Effecting a Small Repair — Meddle Square

It's tempting to see if we can come up with a simple tweak to Middle Square to avoid some of the problems with short cycles. We'll make a tiny tweak to the algorithm. After we square, we complement all the bits; clearly we cannot loop around at zero now. Here are the results for six bits,

and for eight bits,

In these examples it does seem to have avoided tiny cycles, so perhaps we should declare this tinkering effort an improvement, a modest success. But the problem with tinkering is that this new generator seems to have little in the way of principle governing its design. I just hacked at the code a little and somewhat arbitrarily threw an extra operation and hoped for the best. Empirically, this tweak does help; for an arbitrarily chosen seed, the 64-bit version is much less likely to fall into a very short cycle and thereby fail TestU01's Crush benchmark, but we need to step back a bit and think about what we have here.

When discussing terrible PRNGs, Donald Knuth wrote (in
*The Art of Computer Programming*, volume 2 (3rd ed.): Seminumerical Algorithms,
Chapter 3, Random Numbers, page 6):

The moral of this story is that random numbers should not be generated with a method chosen at random. Some theory should be used.

Even without a theory, just looking at these graphs should tell us two
important and worrisome things. Some states (those lower down the
graph) are clearly more likely to be reached than others, thus the
generator *cannot possibly be uniform, it must be biased*. We should
also see that we are at risk of immediate pathological behavior if we
happen to quickly end up in a small cycle.

### Answers from Theory — Random Mappings

There actually *is* a theory for this kind of generator. It is the
theory of
random mappings,
which is discussed in the book
*Analytic Combinatorics*
by Philippe Flajolet and Robert Sedgewick (section VII.3.3 pages 462-467). They say

Random number generators.Many (pseudo) random number generators operate by iterating a given function φ over a finite domaine E ; usually, E is a large integer interval [0 .. N − 1]. Such a scheme produces a pseudo-random sequence u

_{0}, u_{1}, u_{2}, . . ., where u_{0}is the “seed” and u_{n+1}= φ(u_{n}).Particular strategies are known for the choice of φ, which ensure that the “period” (the maximum of ρ = λ + μ, where λ is the distance to cycle and μ is the cycle’s length) is of the order of N: this is for instance granted by linear congruential generators and feedback register algorithms; see Knuth’s authoritative discussion in [379, Ch. 3]. By contrast, a randomly chosen function φ has expected O(√N) cycle time (Figure VII.7, p. 462), so that it is highly likely to give rise to a poor generator. As the popular adage says: “A

randomrandom number generator is bad!”. Accordingly, one can make use of the results of Figure VII.7 and Example VII.10 in order to compare statistical properties of a proposed random number generator to properties of a random function, and discard the former if there is a manifest closeness.

Based on the theory of random mappings, these are the expected values
for various parameters of the state graph, assuming *N* states:

Property | Expected Value |
---|---|

Number of components | (log N)/2 |

Component size |
N/3 |

Number cyclic nodes | √(π N/2) |

Path length to reach cycle (λ) | √(π N/8) |

Cycle length (μ) | √(π N/8) |

Rho length (ρ = λ + μ) | √(π N/2) |

Number of nodes of in-degree k |
N e^{−k}/k! |

One thing this theory tells us is that we can expect cycle size and period to improve as the number of states increases. Thus, for our Meddle Square algorithm, we can surmise that:

- When
*N*is 2^{64}, we can expect to produce 2^{32.3}numbers before we start to see repeats, and when repetition does occur it should be on a cycle of size 2^{31.3}. Thus a 64-bit random mapping should expect to pass TestU01's Crush (but not BigCrush). - When
*N*is 2^{72}(which can also be computed using only 128-bit math), we can expect to produce 2^{36.3}numbers before we start to see repeats, and when repetition does occur it should be on a cycle of size 2^{35.3}. So a 72-bit random mapping should expect to pass TestU01's BigCrush. - When
*N*is 2^{96}(which requires us to use 192-bit math to do the squaring, but not fully general 192-bit math because the input to the squaring process is only a 96-bit value), we can expect to produce 2^{48.3}numbers before we start to see repeats. That's enough for more than 3.75 petabytes of random data (2.5 petabytes if we return an 8-byte 64-bit value rather than a 12-byte 96-bit value). That's two orders of magnitude larger than the standard 32 TB PractRand test. - When
*N*is 2^{128}(which requires us to use 256-bit math to do the squaring, but once again the input to the squaring process is smaller, only a 128-bit value), we can expect to produce 2^{64.3}numbers before we start to see repeats. That's enough for 320 exabytes of random data, vastly more than any PractRand test will ever consume.

Actual tests do indeed support these expectations. In my experiments, I find that Meddle Square (and other similar variations) pass each of the statistical test suites exactly where we would expect them to from the theory of random mappings.

On the other hand, the same theory can tell us other things, too,
like “with non zero asymptotic probability, the tallest tree in a
functional graph is not rooted on the longest cycle”. In other words,
the risk that the state-space graph will be an ugly or have unexpectedly
pathological shape is always there. The untweaked Middle Square
algorithm, for example, gets more and more picky about good seeds as
it gets bigger. It's hard to be fully confident just *how* unlikely it
is that we might end up on a small cycle.

And we still have the issue that the generator is biased. It may be
hard to *detect* as the generator gets bigger, but we know from the
structure of the state space that the bias is there.

### Conclusion

A 128-bit variation of Jon von Neumnann's Middle Square passes TestU01's BigCrush at 72 bits and PractRand at 96 bits.

Here's how I interpret these results:

- If someone shows you a PRNG with a state space of 96 bits or higher and tells you that it passes statistical tests, you should not be impressed at its statistical quality. At that size, you should expect PRNGs to be “too big to fail” statistical tests, even if their core algorithm is decidedly mediocre.
- If someone shows you a large–state-space generator and tells you
that it passes
*most*statistical tests, but fails some, you should be even less impressed. - The basic mathematical properties of period and uniformity matter. If someone introduces a new PRNG and they don't show that the generator is uniform/unbiased/1-dimensionally equidistributed (all ways of saying essentially the same thing) or discuss its period, you should be worried.
- If a generator is a random mapping, there is going to be some risk
of bad states and short cycles. If someone offers you such a
such a generator, you should ask how well they have characterized
the probability of pathological behavior. Do they
*know*the chance is infinitesimal, or are they just crossing their fingers and hoping; “no trouble... yet”.

But you could take a different standpoint. You could say:

- If you take something crappy, and you make it so big you can't
*see*that it's crappy any more, maybe it's*not*crappy. Maybe it's just fine.

If you're a user of SplitMix, you ought to think about these issues.
The SplitMix paper doesn't adequately characterize the behavior of its
`split()`

operation, but `split()`

seeds the PRNG with its own output,
forming a random mapping. The SplitMix paper doesn't discuss the
consequences of `split()`

for period, uniformity, or the shape of
SplitMix's state space, but now you're aware that these topics absolutely
needed to be addressed. With `split()`

, you have a random mapping
with all of the issues that raises.