Exploring Xoshiro's Close-Repeats Flaw

In a recent post, I noted that the Vigna and Blackman's new Xoshiro family of PRNGs is prone to some implausible repeats. I hinted then that I'd write a longer article providing a more in-depth discussion of the issues. This is that article.

We'll begin with a slightly deeper exploration of zeroland (which is how I discovered the issue), see how that leads us to find Xoshiro's near-repeats problems, and, finally, determine just how significant the problem is.

Preliminaries

We'll begin in zeroland, improbable clusters of adjacent generator states where most of the bits are zero (a.k.a. low hamming-weight states). Zeroland issues affect most LFSR-based PRNGs to some degree or another and much of what I'll be showing is broadly known by people who have studied LFSRs. We'll examine zeroland behavior because it provides some context for our later discussions and also because although it zeroland issues well known, it seems they are rarely directly illustrated.

Zeroland issues are typically discounted by proponents of LFSRs. Even though the state space of a LFSR-based PRNG may be littered with deeply improbable glitches, often these state spaces are so vast (219937 states in the case of the Mersenne Twister) that the odds of actually accidentally stumbling on a bad state is vanishingly small. We can see this as an instance of being too big to fail; if the state space is large enough, it can dwarf an enormity of flaws to make encountering them vanishingly unlikely.

Exploring Zeroland

Back when I first wrote about Vigna's Xoshiro PRNGs, I mentioned that like many LFSR-based generators, their construction means that they will have a lot of self-similarity as they cross almost–all–zeros states.

Single-Bit Crossings

In my original post, I just gave a hex dump of Xoshiro crossing a low–hamming-weight state, but is clearer to visualize the issue in the form of a randogram, like those shown below:

250th bit of Xoshiro256 state set   244th bit of Xoshiro256 state set

Compared to Other PRNGs

Although zeroland issues affect many generators based on simple bitwise operations, as we saw previously, the effects are much more modest for generators like Jenkins's JSF PRNG, shown below:

1st bit of JSF state set   128th bit of JSF state set

It's a similar story for Blackman's gjrand PRNG, shown below (the fourth word in the state is a counter so we see it cross zero in this case; the part that should look random is the other three words):

128th bit of gjrand state set   no bits of gjrand state set

Xoshiro256 also looks a little worse than the zero crossings for Blackman & Vigna's previous generator, Xoroshiro, which also looks decidedly nonrandom, but seems to recover more quickly:

250th bit of Xoroshiro256 state set   244th bit of Xoroshiro256 state set

Plotting Hamming Weights

If we plot out the hamming weight (i.e., number of bits set) of Xoshro256's internal states for the two randograms shown earlier, we get

Hamming weights of internal states

Probability

Let's look more closely at the hamming weights in the middle section of the plot, where less than a quarter of the bits are set. For the first example the weights are

64, 45, 42, 33, 30, 20, 15, 
7, 3, 2, 1, 3, 2, 5, 7, 
9, 16, 14, 26, 29, 35, 26, 44, 52, 51, 45, 58

If we add up the total bits set for these 27 states, out of 27 × 256 = 6912 state bits, we have 684 set bits and 6228 zero bits. If state bits were random, the probability of seeing this number of ones to zeros is the same as the probability of tossing a coin 6912 times and only seeing 684 heads, thus we can use a binomial distribution to provide the necessary probabilities. The probability is about 2-3698.98.

It's a similar story for our other example. There are 26 hamming weights with a quarter or fewer the bits set:

60, 54, 41, 37, 19, 15, 
7, 3, 2, 1, 3, 2, 5, 7, 
9, 16, 13, 24, 26, 33, 27, 46, 53, 57, 47, 64

Of these 26 × 256 = 6656 bits, only 671 are set. The probability of this many or fewer bits being set is 2-3523.05.

In contrast, over its full period, Xoshiro256's state space forms a sequence of 256 × 2256 = 2256+8 = 2264 bits; thus it would be reasonable to see an event with a probability of 2-264 (e.g., the probability of seeing only 2672 of 6912 bits set). What we see in these examples is vastly more improbable than that.

It's worth noting, however, that we're observing the internal states of the generator. Often the internal states of PRNGs don't look random. Thus we need to look at the behavior of its output functions.

Output Functions

Xoshiro is provided with two output functions: a faster one whose low-order bits fail linearity tests, and a slower one that attempts to more comprehensively scramble the output. We'll see how well each masks the above issues.

Xoshiro256+

In the “plus” output function, the state is divided into four 64-bit words, and the output is s[0]+s[3]. Let's look at how that performs:

250th bit of Xoshiro256 state set, plus output function   244th bit of Xoshiro256 state set, plus output function

Hamming weights of internal states, plus output function

If we look at the hamming weights in the central section where about a quarter or fewer of the bits are set, the weights of the 23 outputs are

16, 15, 12, 10, 6, 5, 2, 1, 1, 1, 1, 1, 3, 3, 2, 6, 6, 7, 9, 13, 10, 11, 17

Thus we have 158 of 1472 output bits set. The odds of observing 158 or fewer bits set in 1472 random bits is about 2-752.74.

For the second example, there are 21 outputs with particularly low weights,

15, 12, 6, 5, 2, 1, 1, 1, 1, 1, 3, 3, 2, 6, 6, 7, 7, 14, 9, 14, 17

Thus we have 133 of 1344 bits set, which has a probability of about 2-722.73.

This behavior is too improbable for a 256-bit generator.

Xoshiro256**

Even the “starstar” output function does not sufficiently mask the issue:

250th bit of Xoshiro256 state set, starstar output function   244th bit of Xoshiro256 state set, starstar output function

Hamming weights of internal states, starstar output function

It has a smaller stretch of having its output with less than a quarter of the bits set. In the first example, it's only 13 outputs.

8, 14, 4, 4, 0, 0, 4, 4, 4, 4, 8, 11, 8

Out of 832 output bits, 73 are ones. The odds of of seeing 73 or fewer one bits out of 832 is about 2-479.38.

For the second example, the middle 13 outputs have hamming weights

12, 15, 4, 4, 0, 0, 4, 4, 4, 4, 8, 12, 8

of these 832 bits, 79 are set, with a probability of 2-459.53.

Again, this is wildly unlikely for a generator with Xoshiro256**'s period.

Other Low–Hamming-Weight States

There are only 256 states where a single bit is set, but what about other small numbers of bits? For example, there are 368,532,802,176 states with six bits set. Below are two of them:

Xoshiro256 crossing 6 bits set   Xoshiro256 crossing 6 bits set

Hamming weights of internal states, plus output function

The central part of the first example has hamming weights

46, 34, 13, 10, 11, 6, 7, 10, 9, 19, 20, 27, 26, 41

with a probability of about 2-2175.16. The second has hamming weights

52, 46, 34, 19, 10, 8, 6, 9, 7, 11, 15, 21, 30, 32, 51

with a probability of about 2-2151.33.

Xoshiro256+

Xoshiro256+ crossing 6 bits set in internal state   Xoshiro256+ crossing 6 bits set in internal state

Hamming weights of internal states, plus output function

The central part of the first example has hamming weights

14, 10, 5, 4, 3, 2, 3, 5, 4, 6, 6, 11, 9, 12

with a probability of about 2-466.35. The second has hamming weights

17, 11, 7, 3, 3, 2, 4, 3, 5, 8, 10, 15, 12

with a probability of about 2-395.46.

Xoshiro256**

Xoshiro256** crossing 6 bits set in internal state   Xoshiro256** crossing 6 bits set in internal state

Hamming weights of internal states, plus output function

The central part of the first example has hamming weights

17, 4, 4, 2, 2, 4, 4, 4, 13, 14

with a probability of about 2-331.47. The second has hamming weights

15, 4, 8, 4, 4, 4, 8, 4, 8, 16, 18

with a probability of about 2-311.78.

Conclusions about Zeroland

Bizarrely improbable behavior in zeroland is a known and expected issue for LFSR-based PRNGs, so Xoshiro's behavior isn't so strange. Personally, I don't much like this kind of behavior, but statistically, even if there are billions and billions of bad zeroland states, the odds of stumbling on such a state by chance in a state space of 115 quattuorvigintillion states is vanishingly small.

Repeats Flaws in Xoshiro256**

When I was poking around with Xoshiro256**'s low–hamming-weight states, I wondered what else was going on besides the low hamming weights, and so I also looked at the actual output values. Here are our two examples. The first set of outputs is

 Ham | Value
-----+---------------------
  31 | 0x93b2e2bf8bb10099
  23 | 0x00e10b416802fec2
  34 | 0x25cfdb5c2002ffb5
  23 | 0x800b40070802fd3f
  24 | 0x5fa00b4001682d2d
   8 | 0x16800b4000000000
  14 | 0x05a002d001680240
   4 | 0x000000000000002d <--
   4 | 0x000000000000002d <--
   0 | 0x0000000000000000 <
   0 | 0x0000000000000000 <
   4 | 0x000000000000002d <--
   4 | 0x000000000000002d <--
   4 | 0x000000000000002d <--
   4 | 0x0005a00000000000
   8 | 0x00000000b400002d
  11 | 0x4005a00000001689
   8 | 0x02d1680000000000
  21 | 0x4005a05ae10016b6
  18 | 0x10016800b40b4156
  23 | 0x6b85c25ae100002d
  16 | 0xd0002d1761000120
  33 | 0x5f89a05ae6ab42be
  22 | 0x0018205766ab4168

If we look at a histogram of all the bytes above, we see that many of the bytes are repeated

Reps | Value
-----+--------
  84 | 0x00
  10 | 0x2d
   6 | 0xa0
   6 | 0x68
   6 | 0x40
   6 | 0x02
   5 | 0x0b
   4 | 0x05
   4 | 0x01
   3 | 0xe1
   3 | 0x5a
   3 | 0x41
   3 | 0x20
   3 | 0x16

In these 192 bytes of output, we have 84 repeats of 0x00 (and the probability of seeing that many or more is about 2-486.88). On the other hand, the 10 repeats of 0x2d have a probability of about 2-32.36.

But what I noticed most was that over the course of seven outputs, the same value was repeated five times. That seemed really improbable. The chance of a 5-in-7 repeat is about 2-252.68, which is very unlikely, but we'd expect to see this kind of unlikely thing occur about 10 times over the period of Xoshiro256**. (But a combined 5-&-2-in-7 repeat has a probability of about 2-316.67, making it sufficiently improbable that it should not have been observed.)

There are 256 states where only one bit is set, so I checked to see which ones had 5-in-7 repeats. I found 17 of them, shown below:

Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set

Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set

Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set   Bad Xoshiro256** repeat crossing one bit set

If these were the only 5-in-7 repeats in Xoshiro256, it would not be wildly improbable behavior (because when we expect to see 10, there is a 2.7% chance of seeing 17 or more). But these were not the only 5-in-7 repeats. Further investigation revealed many many more. Here are the internal states for four such examples:

Bad Xoshiro256 repeat, example 1   Bad Xoshiro256 repeat, example 2

Bad Xoshiro256** repeat, example 1   Bad Xoshiro256** repeat, example 2

And the associated output from Xoshiro256** for the above states:

Bad Xoshiro256** repeat, example 1   Bad Xoshiro256** repeat, example 2   Bad Xoshiro256 repeat, example 1   Bad Xoshiro256 repeat, example 2

We can see that some examples are associated with somewhat low hamming weights, not all.

Counting the 5-in-7 Repeats in Xoshiro256**

Using the same techniques I use to predict output from Xoshiro256**, it's straightforward to find Xoshiro256 states that give rise to of 5-in-7 repeats. After generating a few thousand of them (and writing a small program to output them), I stopped, having reached an improbability of worse than 2-67496.

Empirical Testing at Small Sizes

When I discovered that there seemed to be vastly more 5-in-7 repeats than expected, I contacted David Blackman, one of the coauthors of Xoshiro256. My first step was to ask him if he could provide smaller versions of Xoshiro256 so that I could escape from too-big-to-fail territory and be able to test smaller variants to determine the full extent of the issue.

I wrote a short test program that looked for repeats in a small window and ran it on Xoshiro32** (which has 8-bit output). Here are the results

----------------------+-------+------------
Repeat kind           | Count | Expected
----------------------+-------+------------
5-repeat, width 5     | 0     | 1
5-repeat, width 6     | 0     | 3.98438
5-repeat, width 7     | 294   | 9.92203
5-repeat, width 8     | 18    | 19.7665
5-repeat, width 9     | 40    | 34.4563
5-repeat, width 10    | 62    | 54.9148
5-repeat, width 11    | 95    | 82.0504
5-repeat, width 12    | 137   | 116.757
5-repeat, width 13    | 193   | 159.914
5-repeat, width 14    | 236   | 212.385
5-repeat, width 15    | 315   | 275.022
5-repeat, width 16    | 375   | 348.661
----------------------+-------+------------
6-repeat, width 6     | 0     | 0.00390625
6-repeat, width 7     | 0     | 0.019455
6-repeat, width 8     | 0     | 0.0581369
6-repeat, width 9     | 3     | 0.135123
6-repeat, width 10    | 6     | 0.26919
6-repeat, width 11    | 1     | 0.482649
6-repeat, width 12    | 3     | 0.801273
6-repeat, width 13    | 5     | 1.25423
6-repeat, width 14    | 3     | 1.87399
6-repeat, width 15    | 4     | 2.6963
6-repeat, width 16    | 6     | 3.76007
----------------------+-------+------------
7-repeat, width 7     | 0     | 1.52588e-05
7-repeat, width 8     | 0     | 9.11951e-05
7-repeat, width 9     | 0     | 0.000317936
7-repeat, width 10    | 0     | 0.000844518
7-repeat, width 11    | 1     | 0.00189274
7-repeat, width 12    | 0     | 0.0037707
7-repeat, width 13    | 0     | 0.00688594
7-repeat, width 14    | 0     | 0.0117584
7-repeat, width 15    | 0     | 0.0190327
7-repeat, width 16    | 0     | 0.0294908
----------------------+-------+------------

Similarly, here are the results of running the same test program on Xoshiro36** which has 9-bit output:

----------------------+-------+------------
Repeat kind           | Count | Expected
----------------------+-------+------------
5-repeat, width 5     | 0     | 1
5-repeat, width 6     | 0     | 3.99219
5-repeat, width 7     | 644   | 9.96098
5-repeat, width 8     | 26    | 19.883
5-repeat, width 9     | 51    | 34.7274
5-repeat, width 10    | 88    | 55.4553
5-repeat, width 11    | 97    | 83.0204
5-repeat, width 12    | 119   | 118.369
5-repeat, width 13    | 304   | 162.439
5-repeat, width 14    | 223   | 216.163
5-repeat, width 15    | 325   | 280.463
5-repeat, width 16    | 407   | 356.256
----------------------+-------+------------
6-repeat, width 6     | 0     | 0.00195312
6-repeat, width 7     | 0     | 0.00974655
6-repeat, width 8     | 0     | 0.0291825
6-repeat, width 9     | 3     | 0.0679596
6-repeat, width 10    | 11    | 0.135654
6-repeat, width 11    | 0     | 0.2437
6-repeat, width 12    | 2     | 0.405373
6-repeat, width 13    | 5     | 0.635771
6-repeat, width 14    | 8     | 0.951794
6-repeat, width 15    | 11    | 1.37213
6-repeat, width 16    | 5     | 1.91723
----------------------+-------+------------
7-repeat, width 7     | 0     | 3.8147e-06
7-repeat, width 8     | 0     | 2.28435e-05
7-repeat, width 9     | 0     | 7.9796e-05
7-repeat, width 10    | 0     | 0.000212374
7-repeat, width 11    | 1     | 0.000476908
7-repeat, width 12    | 0     | 0.000951953
7-repeat, width 13    | 0     | 0.00174184
7-repeat, width 14    | 0     | 0.00298018
7-repeat, width 15    | 0     | 0.00483333
7-repeat, width 16    | 1     | 0.00750382
----------------------+-------+------------

There are no valid constants to make Xoshiro40**, but for Xoshiro44**, we get

----------------------+-------+------------
Repeat kind           | Count | Expected
----------------------+-------+------------
5-repeat, width 5     | 0     | 1
5-repeat, width 6     | 0     | 3.99805
5-repeat, width 7     | 2061  | 9.99024
5-repeat, width 8     | 24    | 19.9707
5-repeat, width 9     | 29    | 34.9317
5-repeat, width 10    | 62    | 55.8634
5-repeat, width 11    | 86    | 83.7542
5-repeat, width 12    | 144   | 119.59
5-repeat, width 13    | 219   | 164.357
5-repeat, width 14    | 239   | 219.035
5-repeat, width 15    | 322   | 284.607
5-repeat, width 16    | 398   | 362.05
----------------------+-------+------------
6-repeat, width 6     | 0     | 0.000488281
6-repeat, width 7     | 0     | 0.00244021
6-repeat, width 8     | 0     | 0.00731707
6-repeat, width 9     | 3     | 0.0170648
6-repeat, width 10    | 4     | 0.034113
6-repeat, width 11    | 1     | 0.0613734
6-repeat, width 12    | 2     | 0.102239
6-repeat, width 13    | 1     | 0.160583
6-repeat, width 14    | 2     | 0.240757
6-repeat, width 15    | 4     | 0.34759
6-repeat, width 16    | 2     | 0.486388
----------------------+-------+------------
7-repeat, width 7     | 0     | 2.38419e-07
7-repeat, width 8     | 0     | 1.42981e-06
7-repeat, width 9     | 0     | 5.0019e-06
7-repeat, width 10    | 0     | 1.33319e-05
7-repeat, width 11    | 1     | 2.99821e-05
7-repeat, width 12    | 0     | 5.99349e-05
7-repeat, width 13    | 0     | 0.000109827
7-repeat, width 14    | 0     | 0.000188183
7-repeat, width 15    | 0     | 0.000305648
7-repeat, width 16    | 0     | 0.000475221
----------------------+-------+------------

Each time we increase the output bit size by one bit, we increase the runtime of the program by a factor of sixteen, so it quickly becomes impractical to test larger sizes, although I could test fractions of the periods for larger sizes and saw similar results.

One thing was clear from these results: there is definitely a problem with 5-in-7 repeats across all sizes of Xoshiro**, and the number of repeats seemed to closely correlate with the output range of the generator. I could thus extrapolate that there would be more than 264 repeats for Xoshiro256**.

It also highlighted that (although there might only be a single instance), there would be a 7-in-11 repeat. Knowing to look for it, I checked the full-sized version and found that if we set Xoshiro256** up with the internal state

{ 0x3f9fffcc01f9ff3f, 0xc1e7e0ffff87ffc0,
  0xc187ff33fe7fff00, 0x3e67e0c0018600c0 }

it will output

0xe1467ff573fa6384
0xfd3000002cffea97 <--
0xb13059ffffffece0
0xfd3000002cffea97 <--
0x02cfffffd3000360 <
0xfd3000002cffea97 <--
0x02cfffffd3000360 <
0xfd3000002cffea97 <--
0xfd3000002cffea97 <--
0xfd3000002cffea97 <--
0xfd30000000000117
0xfd3000002cffea97 <--
0xfd3059ffd3000117
0x0000000000000000
0xb34c0059ffffe980
0xfde3ffbc5e34c117
0x9a1c00438b377a97
0xf2002d0021c2bdf7

For 64-bit output, the probability of seeing a 7-in-11 repeat is about 2-377, so it ought to be vanishingly unlikely that we would ever observe such a repeat over Xoshiro256**'s period. And here we actually saw a 7-in-11 repeat combined with a 2 repeat, which has a combined probability of about 2-438.44.

Theoretical Analysis of the Repeats Issue

While I was conducting my tests, Blackman conducted his own investigation. In correspondence with me, he confirmed that the behavior was “quite clearly a departure from random behaviour” but took things a step further than I had with my extrapolation-based approach.

Using linear-algebra techniques beyond the scope of this post, Blackman determined that over the full period of Xoshiro256**, there are more than 264 + 219 of these 5-in-7 repeats, confirming my extrapolation.

How unlikely is it that we'd see more than 264 of something when we're expecting ten? The probability of this behavior occurring by chance is worse than 2-1022.5, which is a hard-to-fathom improbability. It is more likely that if, on each of the 100 billion planets in the Milky Way, we were to arrange to have 250,000 monkeys at 250,000 typewriters, typing randomly, all 25 quadrillion monkeys produce Shakespeare's Hamlet on their first try.

Conclusions

On the one hand, from a practical perspective, having vastly, vastly more close repeats than it should isn't likely to be an issue that users will ever detect in practice. Xoshiro256's large state space means it is too big to fail any statistical tests that look for close repeats. Although we can write a program that shines a spotlight on these flaws, such programs do not represent typical use cases.

Although I'm not aware of other generation schemes with the kind of close-repeats issues that the Xoshiro family exhibits, when it comes to zeroland behavior there are other popular generation schemes that have far larger zeroland expanses, the Mersenne Twister being a notable example.

Despite its potential to behave in wildly improbable ways, I still use the Mersenne Twister quite regularly—the convenience of using a PRNG built-into C++'s standard library outweighs the infinitesimal chance of any of its flaws being a problem in practice. So we should not imagine that these flaws render Xoshiro256 unusable.

But on the other hand, Vigna and Blackman's paper contains significant quantity of mathematical theory that attempts to show that that their new Xoshiro generation scheme is of high quality. The issues shown in this post reveal that (as has happened numerous times in the past!) although mathematical theory can often show specific flaws, there are no known mathematical techniques to show that a PRNG scheme is actually good. As things currently stand, a new analysis can always reveal a flaw not accounted for by prior theory. In this case, a mix of practice and theory has shown that the structure of Xoshiro's state space is poorer than that of many competing generation schemes, including Blackman's gjrand and perhaps even Vigna and Blackman's earlier Xoroshiro scheme (which has smaller zeroland expanses and does not appear to have similar close-repeat problems), and its output functions are unable to completely conceal these issues.