A Quick Look at Xoshiro256**

On May 4, David Blackman and Sebastiano Vigna announced new members of the Xoroshiro family and a new test for random number generators (based on the z9 test from gjrand) that their previous work fails, all described in a new paper. They claim to have now developed an “all-purpose, rock-solid generator”. In this post, having had less than a day to review their work, I'll present a few preliminary thoughts on this news, mostly looking at their best new generator, xoshiro256**.

Flaws in Prior Work

One good thing about Blackman & Vigna's new paper is that they finally seem to be conceding that Xoroshiro128+ is a flawed generator. They have also updated the constants used in Xoroshiro128+ to create a new version of this PRNG (somewhat confusingly labeled “Version 1.0”), with different output, that gives some improvements to its performance in their new statistical test (which it still fails). Vigna appears to have also edited the Wikipedia page for Xoroshiro128+ to reflect the updated comments and caveats in the updated source and delete some of the other content from the page.

One thing Vigna doesn't seem to have conceded is that his interleaving-only approach to testing with TestU01 is insufficient, and that when a more comprehensive testing approach is used, Xoroshiro128+ fails TestU01.

But enough about the old, what about the new…? Have they really now come up with an all-purpose, rock-solid generator?

Some Good Things about xoshiro256**

Blackman & Vigna claim that xoshiro256** is fast. I haven't tested its speed, but I have every reason to believe that it's quick. The four-dimensional equidistribution seems like a neat property, too.

Some Bad Things about xoshiro256**

So what about the bad things? There are a few things that should be no surprise, but there is one thing that surprised me a good deal.

Typical LFSR Issues

Being based on a (G)LFSR, this PRNG will always have the “bad initializations” problem. In particular, you must avoid initializing it with an all-zeros state, but LFSRs always have “zeroland issues” to some degree.

Here's an example of the underlying xoshiro256 generator struggling as it crosses zeroland; the 32 lines below show how the generator moves from one state to the next. Each line shows the 256-bit internal state of the generator.

9a426512ea327872d420923480050102714d2564b2102470c6c3471ce8947d36
88a1b03a82a304463f2fd242d8275d00cf66407c5a265c022f86825c7aa50d12
9808e0242021545478e8220400a20544e3424008628558444a0242152a03d450
aae280350a80854003a28228420609543f42a168482c0c103a22865d4c022554
936284400484a9409602a37500aa800491f0a5515004895085800730008ea1c0
80e0200504a0888494908264542aa0144478204454882010843882705488a004
90482011040288945008822504020880c050a8141000a8944002021500028014
8042a0210002000000100a201000288054528001050220001112820150060080
9140280040042800d4002a0015000880c050002054002000c500022051042800
8040002004000880851002200104008005100220050408008410022005040880
8140002000000080004000200000000081100008000400800100002000000080
8000002000000000001000080004000000100028000400000010002800000000
8000000000040000800000000000000080000000000400008000000000040000
8000000000000000800000000000000000000000000000008000000000000000
8000000000000000000000000000000080000000000000000000000000000000
8000000000000000000000000000000000000000000000000000000000000000
8000000000000000800000000000000080000000000000000000000000000000
0000000000000000800000000000000000000000000000000000100000000000
8000100000000000800000000000000000000000000000000000100002000000
0000000002000000000010000000000080001000000000000000100002000040
00000000000000408000000002000000a0001000020000000008000000000040
80080000020000002000100000000040a0001400020000400008100100000040
a000000102000000000804000000000000081400008000400000040100002000
a008000002002000a000100102800040a8081401028000400400000100002000
0408100000800040a80004000200200028021101000020400408148002000050
a800008000802010840a050102800000240a050140802000040a158102100000
2800100000102010080000804280001086080081400000100000100002100012
2000008040802012a608100102902010af089581403020000000410002100812
86085181400008102900850002202002af0a902140900012050054c10a202010
aa08804048000802000244a002b028002302c5e040940802000245801a382100
aa08816050880102890801000a242800004a40c0589400000120000000240311
222080605a882a13234ac0a002382902a842d5e8581c01020562310500200140

If state transitions were properly random, it would appear that each line was completely independent of the previous one, but you can see that we have a significant cluster of states with lots of zeros. There are many similar state sequences. Here is another:

f5559ac24095929823d413a863411a88e1c297d28c74c68852408c0277918298
84c1056854450a8837431eb8afa04e9833c7cb92f9f1541013020e3293f5429a
a08015e26810068a8045d042021410008a7791ba30845e98a18044882211478a
8145812848155100aab2541a5a8048128a73807078945812aaf14438b2994400
8106910aa00c5d12a18455426a014100a302b458a0a509122182400862045d03
01008440a8094111838070106aa815008880f15082a95400a3807000c2a94100
21008450000815110a0005004008001169a0a04000a015112a80040000021500
018085504002000042a0211040a0001142a0a400008a000042a2241000200801
018280500082081001800040002800110100a01040aa00000102000040a00810
01008010400a0011010220004000080100022010400a08100100201040080811
0102800040020001000080004000000041022000100208010002000040020001
0100000040000000400020001000080040002000500008004000200050000000
0100000000000800010000000000000001000000000008000100000000000800
0100000000000000010000000000000000000000000000000100000000000000
0100000000000000000000000000000001000000000000000000000000000000
0100000000000000000000000000000000000000000000000000000000000000
0100000000000000010000000000000001000000000000000000000000000000
0000000000000000010000000000000000000000000000000000002000000000
0100002000000000010000000000000000000000000000000000002000040000
0000000000040000000000200000000001000020000000008000002000040000
8000000000000000010000000004000001400020000400008000100000000000
0100100000040000804000200000000081400028000400008000102002000000
0140000002040000000010080000000080001028000000000000000802000040
01401000000400408140002002040000a1501028020400000008000002000040
80081020000000002150000800040040a0500420020000408008102900040000
2150000100000040010814080204000020481408028000400008142b02042000
2050002200002040001000010084000029081001028000000400002000046000
24400003008040400948102202042040095a112b028020400c00008200042010
210810a3028040102452010a000440400d5e152042806000000a00a902140040
055011000090001008040489400460502e42058bc08020100800048b00346042
0554110240a0000223161002801440502200948300b020000002400080004806
2640410040b4085404429583c0046052075185a9c0b02002810ac4628a004002

There is also some clustering around the all-ones state (although the clustering effect is less pronounced):

43b5525de187238318091c1295d360e231a72dc399f84bcb6926abdb1334acdd
329ae5946760efbc6a1b638cedac08aa4a375438b9bb6848f987ee25f6f930dc
a106683d7c35d7ca12b6d22033778f5ebfb46af4cf8f87f4a70ed27391b5236a
14be686edef77bfe0c04d0e980cddf60baf26426ad06503e558696b7000a7458
4d3c2e305e30d0c6a248dca1f33cf4a00f9f0dd3cd312bc0f5670b3048cbd018
1a13f9a1e5c7f47ee0ebff42603d0fa6fbe0c59a7a41fb06e4970ae5faf2377e
1e6f0c067f08cca60118c379ffbb00de1f77fc4180ca0f78e71b008f9eb4f359
f86ccff01e073f210000333e0079c30087eb0f31fe7ec3defe70fcc0787ecc21
061c000e660030007f87f3ffe0003fff19fbc0326679fcffe1e43fce19ffcf00
987fcc3f9fffc0ff606033c3e079f300f818003c7f87ccfffe1ff3cc79863f3f
06000c3006000cc00007ffc00001ff0007e00cf006780c00f987f3cff801f33f
ff80003ffe0000ff01e7ff000079ffc0fe6000c3fe7800c00187ff300181ff00
ffe0000ffff8003f0007fffc0001ffffffe0000ffff8003f0018000c0006003f
ffffffffffffffff0007fffc0001fffffff80003fffe0000fff80003fffe0000
00000000000000000000000000000000ffffffffffffffffffffffffffffffff
ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
fffffffffffffffffffffffffffffffffffffffffffe00000000000000000000
0000000000000000fffffffffffe0000ffffffffffffffffffffffffffffffff
000000000001ffff000000000001ffff00000003ffffffff3fffe00000000000
3fffe0000000000000000003ffffffff00000000000000003fffe7fffc000000
000007fc03ffffff3fffe003ffffffff3ff81ffffffe0000ffffe7fffcff807f
c000000000ff807f0007f80003fe0000ffffe7fc03ffffff0ff0180000ff8060
cff7e00003fe001f3ff81ffc00fe7f80cfffe00003007f80300c01fefc000060
c003fe02ff007fff3ff01ffc0000001f3ff001fcfffe7f9fcffc01fe83c05f9f
300fe0007cc0207fc003e00200fe007fc00bfffe00c000600bf01e0183c05078
fbfc1e03fffe70783007fffc7cfe206030001e027cfe201fca00f97e7fc07067
01fb1881fcc0207ffbfbfffdfffe70073404f9fdc3c05067ca00ff40e0d04067
3000183ce3ee101fce041e81c0fe001fca041e80df0e7018c60c063f6017a3e5
380800824307b3e53400183dfc1e6018c70787403cde6007347f41010317d41d
387759bebc0e07e0cb0f9fff83c7b3facf747ffebfe9d3e23680a00feb279fe1
c5f8664ed4ee2bfb3c0cb9bf802067f8c8fc21cf6413d40205837fb1e7fe0d1c
fc77a040b330411f3108fe3e30dd98017e7b47c17f0dfff9cd5c8731f8c1ccfb

Thus, in behavior that is typical for an LFSR, the state-space of the underlying xoshiro256 generator is littered with statistically very unlikely clusterings of states. It's what I mean when I sometimes say that LFSRs are “too lumpy”—that all-zero states tend to be clustered together.

If all-zero states tend to be clustered together, that means they won't be showing up elsewhere, with subtle effects on the entire generator. Of course, with 256 bits of state, we would expect this generator to be well into “too big to fail” territory where the flaws are there in theory but hard for a typical PRNG test suite to discover in practice.

A good output function can do a lot to mask these issues, so we should move on to looking not just at the underlying (G)LFSR but the at xoshiro256** as a whole.

Trivial Predictability

Like their previous work, xoshiro256** is trivially predictable. That means that given only four outputs from this PRNG, we can know its prior outputs and predict all future outputs.

I know that Vigna has argued that trivial predictability is a nonissue; that it doesn't matter to anyone. But that's not what I've found. In my conversations with most programmers (and nonprogrammers) the idea of a PRNG that can be easily predicted seems like an oxymoron. But on the other hand, it is true that for many situations (e.g., running a physics simulation), predictability probably doesn't make the slightest difference—after all, the physics simulation code isn't going to turn adversarial and try to outwit the PRNG it uses. But Vigna claims to have made an “all-purpose, rock-solid generator”. An all-purpose PRNG may have to work in a world where there are things like algorithmic complexity attacks and thus not being trivially predictable matters.

On the positive side, we can use this predictability to have some fun and make it do party tricks. If we set the initial state array s[] to { 0x01d353e5f3993bb0, 0x7b9c0df6cb193b20, 0xfdfcaa91110765b6, 0xd2db341f10bb232e }, we get a nice authorship notice letting us know who wrote the generator:

0000  dd 51 b2 b7 d9 30 3a 37  eb d9 63 66 a6 70 fd 50  |.Q...0:7..cf.p.P|
0010  26 e7 29 1f 21 21 c0 35  36 c1 2d 03 77 b1 41 d3  |&.).!!.56.-.w.A.|
0020  43 33 2f 77 f7 fe 97 01  1e 93 c3 ce e4 df fc c4  |C3/w............|
0030  db 6c 06 54 08 25 6f 5a  0e 86 82 4d 1c 72 c9 50  |.l.T.%oZ...M.r.P|
0040  20 ae ca 84 d9 24 87 b9  51 96 93 ae ae d2 8f ce  | ....$..Q.......|
0050  57 37 c1 5c f4 cc 5c d6  2a 29 72 cb f0 c5 f8 f8  |W7.\..\.*)r.....|
0060  46 1e 33 a2 5d b1 66 b4  15 6f 3b ed 93 e4 70 ba  |F.3.].f..o;...p.|
0070  11 be 24 b0 20 64 13 86  71 72 92 31 d8 be 03 a9  |..$. d..qr.1....|
0080  78 6f 73 68 69 72 6f 32  35 36 2a 2a 20 62 79 20  |xoshiro256** by |
0090  42 6c 61 63 6b 6d 61 6e  20 26 20 56 69 67 6e 61  |Blackman & Vigna|
00a0  bd 9a f9 bd 3a 79 52 d3  76 50 5e 1e 55 6a 36 48  |....:yR.vP^.Uj6H|
00b0  9f c0 39 c2 5c db 99 a3  5c d5 4b a2 15 35 53 9c  |..9.\...\.K..5S.|
00c0  da dd c6 0b bf 33 ef a7  82 eb 06 52 6d 6d 31 2b  |.....3.....Rmm1+|
00d0  24 7a 0c 3f 70 43 d1 6f  aa c6 88 7e f9 30 ee ff  |$z.?pC.o...~.0..|
00e0  22 31 af c6 1f e5 68 22  e9 6e 30 06 f6 7f 9a 6e  |"1....h".n0....n|
00f0  be 19 0c f7 ae e2 fa ec  8e c6 22 e1 78 b6 39 d1  |..........".x.9.|

Interestingly, if we run it with an initial state of { 0x4685307f2e2c9b43, 0xa999f3abba9e66fe, 0xf74ff5fab0e603da, 0x6dc878990b1ea4db }, we get:

0000  68 54 0d ec e6 97 ea 07  9c 0f 5c 78 31 0e 43 50  |hT........\x1.CP|
0010  6e e0 96 20 76 bd 6f 4e  a8 12 59 5b 35 32 a1 f4  |n.. v.oN..Y[52..|
0020  27 c7 28 c3 e4 75 64 e6  32 5f 75 f2 bb ec fb d1  |'.(..ud.2_u.....|
0030  59 47 b1 87 dd 70 fc 8d  7f 94 f6 f4 62 d2 9a ae  |YG...p......b...|
0040  e5 7b 44 63 a2 63 e3 f4  67 1d 22 2b 8b 7b ad 55  |.{Dc.c..g."+.{.U|
0050  0b 08 cb 1e ac 02 d7 e6  80 47 38 8b 0d 7e 3d 3f  |.........G8..~=?|
0060  f8 24 f5 c2 83 f8 74 36  f6 64 a0 58 a8 e1 a9 c0  |.$....t6.d.X....|
0070  59 2f ca 6d 2c e7 51 94  9d cd d6 be c3 68 08 62  |Y/.m,.Q......h.b|
0080  20 20 50 4c 45 41 53 45  20 64 6f 6e 27 74 20 20  |  PLEASE don't  |
0090  20 6d 75 6c 74 69 70 6c  79 20 62 79 20 35 37 20  | multiply by 57 |
00a0  f5 e4 2f cc 41 8f 49 8c  2a 10 7c 47 7c af 50 1f  |../.A.I.*.|G|.P.|
00b0  ee 88 32 34 ea 58 e7 b6  5b 2a 55 c2 20 f7 b7 c2  |..24.X..[*U. ...|
00c0  1e f6 9c 25 81 fa 2e 30  29 c6 a8 4e e2 56 f8 f3  |...%...0)..N.V..|
00d0  1f 58 09 46 b3 b3 74 95  2c bf e1 c3 de 3b cc 98  |.X.F..t.,....;..|
00e0  1a be 7b 72 d0 21 45 ad  69 e4 60 e3 15 ad 68 59  |..{r.!E.i.`...hY|
00f0  89 c6 88 d3 39 40 fb ec  76 e6 34 d8 fc d2 5d 02  |....9@..v.4...].|

Which provides a great segue to our next, and worst issue.

Invertible Output Function — Don't Multiply by 57

The ** output function for xoshiro256** is coded as follows:

result_starstar = rotl(s[1] * 5, 7) * 9

What may be less obvious is that this output function is trivially invertible. We can rearrange the above as:

s[1] == rotl(result_starstar * 0x8e38e38e38e38e39, 57) * 0xcccccccccccccccd

This code works because 0x8e38e38e38e38e39 is the multiplicative inverse of 9, 57 is the rotational inverse of 7, and 0xcccccccccccccccd is the multiplicative inverse of 5.

In other words, all the work that the output function does to improve the output can be easily undone, returning us to the original (G)LFSR with all of its flaws.

But in fact we don't have to fully invert the output function to see problems. I tried just multiplying by 0x8e38e38e38e38e39 and found that xoshiro256** failed statistical tests. Then I tried just multiplying by 0x8e39 and that failed too. Finally, I tied multiplying by 0x39 (a.k.a. 57) and that failed as well—in only 9.5 seconds.

Here's the run:

unix% ./xoshiro256ss57  | ./RNG_test stdin64
RNG_test using PractRand version 0.93
RNG = RNG_stdin64, seed = 0x3a44e83d
test set = normal, folding = standard (64 bit)

rng=RNG_stdin64, seed=0x3a44e83d
length= 128 megabytes (2^27 bytes), time= 2.0 seconds
 no anomalies in 148 test result(s)

rng=RNG_stdin64, seed=0x3a44e83d
length= 256 megabytes (2^28 bytes), time= 4.7 seconds
 no anomalies in 159 test result(s)

rng=RNG_stdin64, seed=0x3a44e83d
length= 512 megabytes (2^29 bytes), time= 9.5 seconds
 Test Name                         Raw       Processed     Evaluation
 [Low16/64]BRank(12):1536(1)       R=+583.3  p~=  1.2e-176   FAIL !!!!!!    
 ...and 168 test result(s) without anomalies

It's not just 57, it's any number where the low-order byte is 57. As just one example, suppose the prime 32569 feels like a nice number to multiply by and you choose to multiply by that:

unix% ./xoshiro256ss32569 | ./RNG_test stdin64
RNG_test using PractRand version 0.93
RNG = RNG_stdin64, seed = 0x9d9da8ee
test set = normal, folding = standard (64 bit)

rng=RNG_stdin64, seed=0x9d9da8ee
length= 128 megabytes (2^27 bytes), time= 2.1 seconds
  no anomalies in 148 test result(s)

rng=RNG_stdin64, seed=0x9d9da8ee
length= 256 megabytes (2^28 bytes), time= 4.7 seconds
  no anomalies in 159 test result(s)

rng=RNG_stdin64, seed=0x9d9da8ee
length= 512 megabytes (2^29 bytes), time= 9.4 seconds
  no anomalies in 169 test result(s)

rng=RNG_stdin64, seed=0x9d9da8ee
length= 1 gigabyte (2^30 bytes), time= 18.3 seconds
  no anomalies in 180 test result(s)

rng=RNG_stdin64, seed=0x9d9da8ee
length= 2 gigabytes (2^31 bytes), time= 35.1 seconds
  no anomalies in 191 test result(s)

rng=RNG_stdin64, seed=0x9d9da8ee
length= 4 gigabytes (2^32 bytes), time= 67.2 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low16/64]BRank(12):3K(1)         R= +2650  p~=  9.8e-799   FAIL !!!!!!!   
  ...and 200 test result(s) without anomalies

There are over 72,000 trillion 64-bit integers that it's bad to multiply by.

Does this seem like a rock-solid PRNG to you?

Conclusion

I was excited to learn that after two years of effort, Blackman and Vigna had a new PRNG. I really did want it to be good—the more good ones out there the better (and I happily celebrate good alternatives to PCG), but having looked at this work very briefly, I'm a bit disappointed because nothing I'm saying here in this critique is really new. Trivially invertible output functions aren't a good thing.

That said, most people who choose xoshiro256** are unlikely to be aware of any serious problems arising from using it. But as to the claim that xoshiro256** is an “all-purpose, rock-solid generator”, no, it's not all-purpose and it's not rock-solid.

Addendum

Lest I be accused of hypocrisy, I should point out that the issue with having an invertible permutation function also affects the PCG generators officially labelled pcgXX_once_insecure. The PCG paper (from 2014) notes this, saying:

In a cruel twist of fate, we will give it the rottenest job, generating 32 bits of randomness from 32 bits of state. It’s cruel because the generator has no choice but to perform its permutation in an invertible way (because the number of output bits and the number of state bits are the same), and so all of its statistical goodness can be undone by anyone who knows its permutation function, unmasking the lowly LCG it has inside (but given the intricateness of the permutation, that is unlikely to happen by accident). […] Notwithstanding its speed and compactness, all but the most space-constrained applications should prefer a more secure generator, unless the “every number appears only once” property is especially useful.

In fact, the every-number-appears-only once property should also cause the _once_insecure generators to fail statistical tests that look for missing repeats. In a future blog post, I'll discuss how to write an efficient test for missing repeats in a 64-bit PRNG (even though you can usually tell that the issue would occur from just looking at the code).

Thus the concerns I expressed here are entirely consistent with concerns I expressed about specialized PCG variants in 2014.