Critiquing PCG's streams (and SplitMix's too)

PCG allows its generators to optionally select a stream. I discuss the overall concept of streams in the general section on of the site on random number generation here; in essence, streams allow you to choose a generator from a collection of distinct-but-related generators.

Until now I haven't said much about the design decisions behind PCG's streams, but that changes here. At the end I'll also take a brief look at SplitMix's streams.

The Seed for This Post

Talking about PCG's streams was on my to-do list, but it wasn't that high on the list (so many posts to write, so little time!). Then I discovered a comment from a year ago written by Sebastiano Vigna, co-author of Xoroshiro128+. In a reddit comment, he wrote:

When you provide multiple streams by reparameterization (as in PCG), you are using each time a different generator. A generator nobody has ever tested. When you use jump-ahead functions, you get a different part of the orbit of a well-tested generator.

What is preferable is mostly a question of taste.

More recently, John D. Cook decided to test PCG with the DIEHARDER test suite. It did great (yay!), but in the comments up popped David Blackman, the other co-author of Xoroshiro128+ with his own critique:

PCG claims to provide multiple independent streams, but last time i looked no advice was given on how to do this. I made a few guesses for how to do this, but for 4 streams or more my attempts had detectable correlations between streams. Multiple independent streams are important for high-end non-cryptographic use, but perhaps not for more casual applications.

David is a little vague about how he detected the “correlations” he speaks of, and alas he has never contacted me about this issue and I don't have an email address for him (I'd love to hear from you, David!), so I'll assume he used the same approach I'll use in this post.

My favorite way to try to respond to a critique is to avoid being defensive and find a way to say, “You're right” to the person making the criticism. And in fact I can say “You're right” to both of these people, while hopefully convincing you that the design of PCG's streams reflects a conservative choice in a space of engineering trade-offs. David Blackman's desire for independence pushes one way; Sebastiano Vigna's worries about the perils of reparameterization pushes back in another.

What Are Streams For?

A program that uses a sequence of random numbers from a random number generator is probably going to need to do so more than once. For example, we might run the program a second time. Or we might have different subtasks within the program that each need their own sequence of numbers. Either way, reusing the same sequence between the same two tasks or between purportedly different runs seems like a bad thing.

That's the problem that seeding is meant to solve. Different seeds take you to vastly different places in the generator's sequence: problem solved. In our generators-as-books analogy, we're reading the book from two very different places.

If a generator has 64 bits of state, the odds that two randomly chosen seeds will happen to be the same is 1 in 264, which is overwhelmingly unlikely. Via the birthday problem, we know that if we create 232 different seeds (i.e., run our program that many times or create that many subtasks that each need their own generator), we'll have a 39.35% chance of having a seed we've seen before, and if we create 234 seeds, the probability rises to 99.97%.

But realistically, you're probably not going to run your program that many times or create that many tasks, so should we just not worry? No.

First, people doing large scientific computations might run into these larger numbers, but more importantly, you aren't going to just get one random number from that seed; you're going to get many. And ideally, we'd like to avoid overlap between the parts of the sequence used.

If our program grabs 232 numbers, we'd like to avoid seeds that land us anywhere within that run. That's not one seed to avoid, that's 232 seeds. Now we only need to run our program about 216 times to get a significant chance of seeing a collision. That's a small enough value to worry some people.

Solution 1: Larger State

We diminish the problem if we just have a large state. Given a generator with 128 bits of state and a period of 2128, we could be consuming a whopping 256 (72 quintillion) outputs and we'd need to create 2(128-56)/2 (60 billion) distinct sequences before we had to worry about any overlap.

The downside is that we now have more state to update each time, which might make things slower. And maybe even 128 bits doesn't seem like enough for some people. They want to be even more sure that overlap from random seeds is unlikely, so we might want to go up to something even bigger.

Solution 2: Partitioned Larger State

This is really just the previous solution with more rules. Rather than choose random seeds, we pick one seed and then subsequent generators are made by taking the initial state of our first generator and advancing it by such a huge jump that we're in a far away piece of the sequence. This approach really works best for subtasks within one run of a program rather than repeated runs unless we undertake careful seed management to make sure seeds in later runs are suitably far away from previous runs.

Vigna & Blackman's Xoroshiro128+ implementation has a jump-ahead-by-264 operation specifically to support this way of working (they didn't provide general jump-ahead, although it would have been possible to write, if a little expensive to run). PCG has (arbitrary) jump-ahead, too, and large-state-space generators, so if you like these kinds of “streams”, you can totally use them.

It's a very minor quibble, but I don't consider partitioning to create “true” streams, myself, because each partition is a random sample, not a uniform generator. In other words, if you used all the numbers you were allowed to from a given stream, you'd find bias because some numbers occur more than others. If you make sure you use a different stream every time (why wouldn't you!?!), you're okay; if you keep using numbers from the same partition (don't!), you'd be in trouble.

Solution 3: Reparameterization

Random number generators often have magic constants of one kind or another in them. Good generators have an underlying mathematical structure, and that structure helps define what the good constants are. They may look arbitrary, but the ones that the designer selected were mathematically and/or empirically verified to be good.

There may be some art or arcane knowledge required to select these constants, but almost always there are actually multiple values that would work well. If you change the values—reparameterize—to other ones that are good, you'll have a similar generator that outputs a different random sequence. Different parameters = different streams.

The cool thing about reparameterization is that you'd be using the constants anyway, so you're just putting a constant into a variable. Also, the constants will only change if you switch to a new stream, so there is less state to update each time.

Moreover, because the parameters define a self-contained generator, it really is a brand-new generator in its own right, not a just piece of an existing one.

But then we come to Sebastiano Vigna's concern: if each reparameterization is really a brand-new never-tested-before generator, how can we trust it? Sometimes the answer is “Because math!”

Streams for Linear Congruential Generators

Under the covers, PCG is using a linear congruential generator for its state transition function. The only new thing in PCG is (a) giving LCGs another chance, and (b) adding a powerful output function to improve the final result and make them harder to predict.

PCG's streams come directly from the properties of LCGs. The state advance function looks like this:

state = state * MULTIPLIER + INCREMENT;

If we change either of these parameters, we get a different sequence out of our generator. In this section, we'll look at what it means to vary these parameters for an LCG. (After we've done that, we'll see how much difference PCG's output function makes.)

Changing the Increment

Increments have just one rule: they must be odd. That's it. It doesn't matter at all what the increment is.

Finding a good increment parameter is thus trivial, but there's a catch. Changing the increment is actually equivalent to just adding a fixed offset to the output of either the increment-by-1 LCG stream or the increment-by-3 one. In other words, there is no difference between these two functions:

unsigned int lcg() {
    state = state * MULTIPLIER + INCREMENT;
    return seed;
}

and

unsigned int lcg() {
    state = state * MULTIPLIER + (INCREMENT & 3);
    return state + magic_hash_value(MULTIPLIER, INCREMENT);
}

More practically, if our multiplier was 0xdb429a1d, and our increment was 1, and our seed was 0, we'd get as our first ten outputs

0, 1, 3678575134, 1291682151, 2988285612, 187068797, 2497318186,
1534247363, 701841688, 1451282361, ...

whereas if we set the seed to 1 and the increment to the magic value 0x24bd65e5, we would get

1, 2, 3678575135, 1291682152, 2988285613, 187068798, 2497318187,
1534247364, 701841689, 1451282362, ...

which is our original sequence offset by one. And, similarly, if we set the seed to 5 and the increment to the magic value 0xb7b2fd71, we would get:

5, 6, 3678575139, 1291682156, 2988285617, 187068802, 2497318191,
1534247368, 701841693, 1451282366, ...

which is the original numbers offset by five.

Likewise, the increment-by-3 stream is

0, 3, 2445790810, 3875046453, 374922244, 561206391, 3196987262,
307774793, 2105525064, 58879787

and if we pick the increment 0x497acbcb, and seed with 2, we get

2, 5, 2445790812, 3875046455, 374922246, 561206393, 3196987264,
307774795, 2105525066, 58879789

Here are the increments to get offsets 1 to 32 on the two fundamental streams:

0x24bd65e5      0x497acbcb      0x6e3831ad      0x92f59793
0xb7b2fd75      0xdc70635b      0x012dc93d      0x25eb2f23
0x4aa89505      0x6f65faeb      0x942360cd      0xb8e0c6b3
0xdd9e2c95      0x025b927b      0x2718f85d      0x4bd65e43
0x7093c425      0x95512a0b      0xba0e8fed      0xdecbf5d3
0x03895bb5      0x2846c19b      0x4d04277d      0x71c18d63
0x967ef345      0xbb3c592b      0xdff9bf0d      0x04b724f3
0x29748ad5      0x4e31f0bb      0x72ef569d      0x97acbc83

Do you feel cheated? That all the different streams of the LCG are really just offsets on two basic ones? Nothing more?

But at least we can now relax about Vigna's concern that changing parameters would result in fundamentally different generators; we can see that they're clearly not fundamentally different at all. They're blandly similar.

Well, there is one cool thing. Let's look at the offsets on the unit-increment stream for increments of 1, 5, 9, …, 125,

0x00000000      0xc678c0c9      0x8cf18192      0x536a425b
0x19e30324      0xe05bc3ed      0xa6d484b6      0x6d4d457f
0x33c60648      0xfa3ec711      0xc0b787da      0x873048a3
0x4da9096c      0x1421ca35      0xda9a8afe      0xa1134bc7
0x678c0c90      0x2e04cd59      0xf47d8e22      0xbaf64eeb
0x816f0fb4      0x47e7d07d      0x0e609146      0xd4d9520f
0x9b5212d8      0x61cad3a1      0x2843946a      0xeebc5533
0xb53515fc      0x7badd6c5      0x4226978e      0x089f5857

Both these tables of numbers should seem pretty random. They show that you can contrive an increment to make similar streams (i.e., streams where the hamming weight of the difference between the the two streams is small), but you have to work at it and there won't be very many streams that are really similar to each other. Likewise, we've learned that if you do something simple, like pick streams 1, 3, 5, 7, …, you'll end up with quite arbitrary additive constants, and you'll need a carefully contrived seed value for the streams to line up.

If you're worrying about things accidentally lining up, on the one hand, it is possible, but remember the problem we began with. We could always have generated two seeds that were nearby, creating two generators with closely correlated output. Adding streams hasn't made the problem worse, it's drastically reduced the chance of this problem happening. Empirical results show that they behave as you'd hope: we get a reduced chance of sequence overlap similar to doubling the size of the state, all through simply varying a parameter.

Changing the increment parameter is just barely enough for streams that are actually useful. They aren't statistically independent, far from it, but they are distinct and they do help.

Changing the Multiplier

The other parameter of an LCG is the multiplier, and we could change that, too. Pick a random number, multiply by eight, add five, and you're good to go.

But here, Vigna's concern is more valid. We potentially have a completely untested generator that's never been seen before. Some LCG multipliers are terrible; for example the famed RANDU constant, which is 0x00010003. Other worse examples for a 32-bit are generator are 0x00000005 and 0xdddddddd. But let me ask, do these numbers look very random to you?

Actually, despite how it might appear from these examples, you can't tell a good multiplier from a bad one just by looking. Thankfully, because we're working with a LCG, there is a considerable body of research into what makes a good multiplier, and the most highly regarded mathematical test on a multiplier is the spectral test.

Here are ten 64-bit multipliers from the top 1% of multipliers according the spectral test (using the 8-dimension test), showing their score and their binary representation,

0.678689  1110000001100001010010111001011100111111010110010011111101010101
0.672519  1001100011111110000001111011101001001110110010110110111001010101
0.644256  0111100111110000100111001000100100100011110110000001100110001101
0.625962  1111000101110010010110011100000001000100100111110101111110100101
0.618408  1111011000111100101001010000110001011100110000101101101011110101
0.612807  1110010101011111010001010110011101100101001000011111111010010101
0.609951  1001111010111011111011100101100100101110110111010010010001010101
0.604624  0010100011101111010000000010011001111111110011110110011110101101
0.603861  0011001011100000100101000101010100001011010010110001010000000101
0.600827  1111010000110011101001100011110000100101011100000111100011010101

and here are ten from the bottom 1%,

0.082543  1001010111100111010100110111101010000100101111011101101001101101
0.081919  1001111110000100010110011110000010001000100010111011110010000101
0.080182  0101110010101011111100111000010111100000000000011010101111000101
0.078860  0001110011110101110001000110100100001100110010011111100111110101
0.074267  0101000100011111010010100000001011110010101110111110000000011101
0.071667  1011011110100010111100100000001100000101001011001001101001000101
0.054618  0101100100111101000110100101000111100011001010101010001111111101
0.052786  0011100111101110011000010000001100011010100001110010110100101101
0.036645  0111010011111001110111110110101111000011001011110101110011101101
0.032604  0001000001011100000111010111100100010111100100101110010101111101

Looking at these, I think it's fair to say that there is no quick way to eyeball a constant and tell how good it is. There is no short-cut algorithm based on counting runs of bits, or how many ones and zeros there are.

In fact, your intuition is almost certainly going to be wrong. Look at these multipliers and their scores. Despite looking bad, the spectral test says they're pretty great!

0.710248  0000000000000000000000000000000011110001001100101000001110101101
0.660283  0000000000000000000000000000000011001011011110001011010111101101
0.651093  0000000000000000000000000000000011111101101001010110110000001101
0.617484  0000000000000000000000000000000010111100000111100110110110110101
0.614549  0000000000000000000000000000000011010111001101000011100110001101
0.602821  0000000000000000000000000000000011011001010111101101001000101101
0.600774  0000000000000000000000000000000011101001000111111010001111100101
0.599844  0000000000000000000000000000000011110011001011101010100011001101
0.598651  0000000000000000000000000000000011000010011001011011111110100101
0.593278  0000000000000000000000000000000011100011110011000000001010011101

Our intuition isn't totally off, because the bottom 1% looks worse:

0.009322  0000000000000000000000000000000000000010100100000111010001110101
0.008766  0000000000000000000000000000000000000010011010010100111110111101
0.008141  0000000000000000000000000000000000000010001111010100110111100101
0.007266  0000000000000000000000000000000000000001111111111010101011101101
0.005313  0000000000000000000000000000000000000001011101100010111001100101
0.004731  0000000000000000000000000000000000000001010011010011001010101101
0.004153  0000000000000000000000000000000000000001001001000111010001010101
0.003125  0000000000000000000000000000000000000000110111000000110110000101
0.001782  0000000000000000000000000000000000000000011111011000010110011101
0.001372  0000000000000000000000000000000000000000011000001001100101000101

The spectral test has traditionally been the best way to be sure that you've got a genuinely good constant. And even it isn't quite as good a check as you might hope. These four constants are actually pretty poor in my book, but their spectral scores don't reveal the problem that I have carefully engineered into them:

0.551393  1001010101000001100100010110011110101111001011111011110110001101
0.546823  0000000110110110010010000001010010000010000000000101110110000101
0.541209  1101011001010100100100111100101100000100001101111011010100011101
0.505168  1010010010001100001000000111000111011111110010001001001011101101

I won't say what the flaw is, but leave it as a puzzle for the interested.

But despite the above concerns, it is the case that the process for checking these constants is quite well understood. It is possible to build a table of rigorously tested constants offline and then use them, and it is similarly possible to apply a runtime test to check that a multiplicative constant is okay.

So, if you want true streams that are closer to the level of stream independence that David Blackman would like, you can have them with LCGs, but you need to do some work to make sure the parameters are okay, otherwise you end up with precisely the worries that Sebastiano Vigna called out.

LCG Streams Conclusion

With LCGs we get streams essentially for free. The streams we get from adjusting the additive constant aren't very different from each other, but that makes them very safe; so long as the increment is odd, there are no bad increments. Streams from changing the multiplier are more different from each other, but here there are bad multipliers that must be avoided. These two kinds of stream are orthogonal, we can use one, or both, or none.

Streams in PCG

PCG uses a linear congruential generator as its state-transition function, but adds powerful output functions to further randomize the output. As such, it can use the streams that LCGs so cheaply provide while improving them.

Via LCG increment

Let's look back at the almost identical streams we saw earlier, corresponding to increments of 1 and 0x24bd65e5 with a multiplier of 0xdb429a1d, this time in hexadecimal:

0x00000000 0x00000001 0xdb429a1e 0x4cfd8167 0xb21d9eac 0x0b26717d
0x94da0d2a 0x5b72c1c3 0x29d54118 0x5680cfb9 0x1a3ed1f6 0x56d7c4df

and

0x00000001 0x00000002 0xdb429a1f 0x4cfd8168 0xb21d9ead 0x0b26717e
0x94da0d2b 0x5b72c1c4 0x29d54119 0x5680cfba 0x1a3ed1f7 0x56d7c4e0

And now let's see what the look like as they come out of the RXS M XS output function:

0x00000000 0x108ef29b 0x85e88cf0 0x72184b3e 0xcf3524c3 0xdc8330a8
0xb3fe1825 0xce615b12 0xe03d803c 0x5290ce64 0x5d26d6e5 0x9dae36c3

0x108ef29b 0x211de536 0x75599ddb 0xa3c5210e 0xbea637dc 0xcbf4422e
0xc48d0ed1 0xdef0487f 0xf0cc7556 0x4201dd5d 0x4c97e58a 0xccbc3f80

They certainly aren't as similar as they were. But we can't pat ourselves on the back too much, because the RXS part of the permutation is actually doing little to help in this case because the high bits it relies on aren't changing between the two streams. If it had been the high-order bit that was the only bit being flipped, as shown below, the permutation would have been doing a little more.

0x80000000 0x80000001 0x5b429a1e 0xccfd8167 0x321d9eac 0x8b26717d
0x14da0d2a 0xdb72c1c3 0xa9d54118 0xd680cfb9 0x9a3ed1f6 0xd6d7c4df

0x16c8005b 0x2756f244 0xbf67bda6 0xe17e580f 0xae0e6b51 0x077df517
0x522ae283 0xb6eb85b1 0x9e6d033c 0x811d3b15 0x8f2a023c 0xb6c4154f

But at the most fundamental level though, PCG's output functions aren't designed as maximal avalanche hash functions. They don't need to be. They're designed to permute the output as much as necessary for good statistical performance with a reasonable margin of safety.

When you don't contrive special increment choices to maximize similarity, things are better. If our increment choices had been 1 and 5, we'd have this output from the underlying output function for 5:

0xc678c0c9 0xc678c0ca 0xa1bb5ae7 0x13764230 0x78965f75 0xd19f3246
0x5b52cdf3 0x21eb828c 0xf04e01e1 0x1cf99082 0xe0b792bf 0x1d5085a8

and we'd get this output from RXS M XS:

0x783effe9 0x88cdf2c1 0x7618f9a2 0x13fdbdb6 0x24d4829d 0x1ba7404f
0x16b71c17 0xa91c6b32 0xbf125954 0xf40daec6 0xd7e99a9d 0x2fe10302

In practice, with noncontrived increments, my empirical tests show that PCG's streams work as well as stronger hash functions designed for maximal avalanche.

The advice for picking a stream is similar to the advice for picking seeds. Either use random values, or, if you're going for reproducible results, it's okay to pick 1,2,3,4 as seed and or stream. For seed or stream, an adversary could contrive a “nearby” seed, but it's unlikely to happen by chance. (STOP PRESS: at least that's what I thought when I wrote this post; correspondence with David Blackman shows that it may be easier than I had thought to make “nearby” streams with correlated initializations like a constant seed and streams of 1,2,3,4. If you're going to be using multiple streams at the same time, for now I'll recommend that the stream id and the seed should be distinct and not have obvious correlations with each other, which means don't make them both 1,2,3,4.)

In view of the strong underlying structural similarity, however, I would say that you shouldn't assume that you can take substantially more output from a generator if you're using streams than you can if you don't. In other words, if you want petabytes of random numbers, don't try to do so by making zillions of pcg32 generators; that won't be any better than using just one pcg32 generator. Instead use pcg64 for which a petabyte is like a tiny speck of nothing; if you want to use zillions of those, too, go ahead.

Via LCG Multiplier

None of the code I provide with PCG demonstrates how to change the multiplier, but it's very simple to modify the C implementation to change the constant into a variable like the increment is, and in the C++ version, the multiplier is defined by a mixin class, which has a default but can be changed by a parameter.

So, you can change the multiplier, but I didn't signpost that you can or should do so. Instead I picked a known-good multiplier.

As we've seen, a poor choice of multiplier for an LCG can make a terrible random number generator, such as RANDU. But we also know that PCG permutes the output of the LCG, so perhaps that'll help?

Let's try RANDU, one of the worst LCGs ever:

linux$ ./pcg32_multseq 65539 | ./RNG_test stdin32
Using 0x0000000000010003 as the multiplier...
RNG_test using PractRand version 0.93
RNG = RNG_stdin32, seed = 0xcbfdd420
test set = normal, folding = standard (32 bit)

rng=RNG_stdin32, seed=0xcbfdd420
length= 128 megabytes (2^27 bytes), time= 2.3 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low8/32]FPF-14+6/16:all          R=  +4.7  p =  7.2e-4   unusual
  ...and 116 test result(s) without anomalies

rng=RNG_stdin32, seed=0xcbfdd420
length= 256 megabytes (2^28 bytes), time= 5.1 seconds
  Test Name                         Raw       Processed     Evaluation
  DC6-9x1Bytes-1                    R=  +6.6  p =  1.7e-3   unusual
  FPF-14+6/16:all                   R=  +5.3  p =  1.9e-4   unusual
  [Low8/32]FPF-14+6/16:(0,14-0)     R=  +9.5  p =  2.1e-8   very suspicious
  [Low8/32]FPF-14+6/16:all          R=  +9.5  p =  2.1e-8   very suspicious
  [Low8/32]FPF-14+6/16:all2         R= +19.9  p =  6.7e-9   very suspicious
  ...and 119 test result(s) without anomalies

rng=RNG_stdin32, seed=0xcbfdd420
length= 512 megabytes (2^29 bytes), time= 11.4 seconds
  Test Name                         Raw       Processed     Evaluation
  DC6-9x1Bytes-1                    R=  +6.5  p =  1.3e-3   unusual
  FPF-14+6/16:(3,14-0)              R=  +6.7  p =  8.2e-6   unusual
  FPF-14+6/16:all                   R=  +8.1  p =  4.5e-7   very suspicious
  FPF-14+6/16:all2                  R=  +9.0  p =  4.0e-5   mildly suspicious
  [Low8/32]FPF-14+6/16:(0,14-0)     R= +17.5  p =  7.7e-16    FAIL
  [Low8/32]FPF-14+6/16:(1,14-0)     R= +12.8  p =  1.9e-11   VERY SUSPICIOUS
  [Low8/32]FPF-14+6/16:all          R= +17.2  p =  1.3e-15    FAIL !
  [Low8/32]FPF-14+6/16:all2         R= +92.4  p =  3.9e-36    FAIL !!!
  ...and 124 test result(s) without anomalies

It just about makes it to the start of the area where PractRand is reporting results, but then fails. Which isn't that surprising, because it is using an absolutely terrible multiplicative constant. One of my implementations of the spectral test rejects the RANDU constant entirely, saying

The parameters do not satisfy the conditions (i) (ii) (iii) given in H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, 1992, p. 169.

but we can test its bizarro-world counterpart, -65539, and I can also feed the RANDU constant into another spectral-test evaluator that is less strict, and both agree that the value for the spectral test is 0.0000142006, which is terrible.

The worst constant ever is 5, with a spectral-test value of 0.00000000110482, which gives us these results when used as the multiplier in pcg32:

linux$ ./pcg32_multseq 5 | ./RNG_test stdin32 -tlmin 19
Using 0x0000000000000005 as the multiplier...
RNG_test using PractRand version 0.93
RNG = RNG_stdin32, seed = 0xe5c13b51
test set = normal, folding = standard (32 bit)

rng=RNG_stdin32, seed=0xe5c13b51
length= 512 kilobytes (2^19 bytes), time= 0.1 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low8/32]FPF-14+6/16:all          R=  +5.4  p =  5.7e-4   unusual
  ...and 54 test result(s) without anomalies

rng=RNG_stdin32, seed=0xe5c13b51
length= 1 megabyte (2^20 bytes), time= 0.7 seconds
  Test Name                         Raw       Processed     Evaluation
  DC6-9x1Bytes-1                    R=  +6.4  p =  2.0e-3   unusual
  [Low8/32]FPF-14+6/16:(0,14-5)     R=  +8.6  p =  5.3e-7   mildly suspicious
  [Low8/32]FPF-14+6/16:all          R= +12.3  p =  9.3e-10   VERY SUSPICIOUS
  [Low8/32]FPF-14+6/16:all2         R= +25.1  p =  4.4e-9    VERY SUSPICIOUS
  ...and 56 test result(s) without anomalies

rng=RNG_stdin32, seed=0xe5c13b51
length= 2 megabytes (2^21 bytes), time= 1.3 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+0,13-6,T)                  R= +12.8  p =  8.0e-5   mildly suspicious
  DC6-9x1Bytes-1                    R=  +8.6  p =  1.0e-4   suspicious
  [Low8/32]Gap-16:A                 R=  +4.2  p =  3.0e-3   unusual
  [Low8/32]Gap-16:B                 R=  +7.8  p =  3.9e-6   suspicious
  [Low8/32]FPF-14+6/16:(0,14-5)     R= +21.4  p =  1.4e-17    FAIL !
  [Low8/32]FPF-14+6/16:(1,14-5)     R= +18.8  p =  1.9e-15    FAIL
  [Low8/32]FPF-14+6/16:(2,14-6)     R= +10.7  p =  2.9e-8   suspicious
  [Low8/32]FPF-14+6/16:(3,14-7)     R=  +9.5  p =  2.4e-7   mildly suspicious
  [Low8/32]FPF-14+6/16:(4,14-8)     R=  +9.6  p =  6.1e-7   mildly suspicious
  [Low8/32]FPF-14+6/16:all          R= +33.7  p =  2.1e-27    FAIL !!
  [Low8/32]FPF-14+6/16:all2         R=+233.6  p =  8.4e-72    FAIL !!!!
  ...and 59 test result(s) without anomalies

These two results shows exactl the issue that Sebastiano Vigna worried about at the start. Different multiplicative constants really do create substantively different generators, and that's a problem.

An early draft of my PCG paper was titled “Redeeming RANDU” because I did come up with output functions that could handle this kind of terrible behavior from an underlying poor RNG, but in the end I discarded those output functions because, as cute as it was to be able to fix such a poor LCG, there was no reason to use such a terrible constant, and less agressive output functions were faster.

I didn't think much more about it until I began writing this post. It turns out that although there are a small number of terrible multiplicative constants, the vast majority of them are good, and PCG can actually handle almost all of the terrible ones, too, just not the worst of the worst. Let's look at just a few really bad ones.

1.104e-9  0000000000000000000000000000000000000000000000000000000000000101
0.000014  0000000000000000000000000000000000000000000000010000000000000011
0.000399  1001001011011011110111011001010010001011101000101110100010111101
0.000417  1111100101000100101100011011010001110101101101111111111001000101
0.031192  0000000000000000000000010000000000000000000000000001000000000101
0.032604  0001000001011100000111010111100100010111100100101110010101111101
  • 1.104e-9 (0x0000000000000005) — fails after just 2 MB of output (and raises clear suspicions at at the half MB mark).
  • 0.000014 (0x0000000000010003, RANDU), fails at 512 MB (and raises clear suspicions at 128 MB)
  • 0.000399 (0x92dbdd948ba2e8bd), makes PractRand suspicious at 2TB and fails at 4TB.
  • 0.000417 (0xf944b1b475b7fe45) successfully completed a full 32TB of PractRand testing with no anomalies.
  • 0.031192 (0x0000010000001005), fails at 16TB (after making PractRand suspicious at 8TB). (I engineered this one to to be RANDU-like, but not have as bad a spectral test result. 60 of its 64 bits are zeros, which makes it something of an outlier.)
  • 0.032604 (0x105c1d791792e57d) successfully completed a full 32TB of PractRand testing with no anomalies.

Only about 1 in every 250,000 64-bit multipliers is as bad as the ones that caused PractRand to fail. On the one hand, that means you'd have to be quite unlucky to get one by chance. On the other hand, not that unlucky.

These results support the choice I made to not make it easy for people to change the multiplier, but also show that my choice is a conservative one. As such, I'm slowly reconsidering it.

For one thing, we could have a table of known-good multipliers. A table of particularly well-chosen multipliers will support the creation of several different variations from each constant, so we can actually get a lot of known-good constants from a very small table. Or, given how rare really bad ones are, we could pick one from a reproducible pseudorandom sequence of multipliers, and have a blacklist for any bad ones that should be avoided. We could even go the whole hog and just embed the code for the spectral test so that we can find a random-but-excellent multiplier.

I'm still mulling these ideas (and a few other ones too!).

Allowing people to change the multiplier potentially makes David Blackman happier (at the risk of making Sabasiano Vigna unhappier), but it possibly makes me happier, too. One thing I don't much like about pcg32 is that it's a little bit too easy to predict. It's harder than some generators, sure, but I outline how to predict it on this website, and Marius Lombard-Platet actually wrote code to do as I suggested (thanks Marius!). Doing the prediction is resource intensive, and not 100% exact, but it's certainly within the reach of anyone who cares to do so. An unknown multiplier doesn't make it impossible either, but it at adds an extra level of tiresomeness when it comes to predicting it.

We'll see.

Stream Alternatives

PCG's extended generation scheme (pcg32_k2 and friends) provides an alternative to the streams that come for free from the underlying LCG. Unlike the LCG streams, the added state does provide additional statistical strength.

Given the length of this post, we'll leave discussion of that option for another time. (And given my queue of to-be-written posts, that might be some time from now, alas.)

Streams in other PRNGs

Also, in view of the length of this post, I won't go into all the other PRNGs out there, but one is worth mentioning, SplitMix.

In SplitMix, the state transition function creates a Weyl sequence (i.e., whereas an LCG uses state = state * mult + inc; as its transition function, a Weyl sequence just uses state += inc;), and it uses the increment parameter to form different streams. So where does SplitMix fall on the Blackman/Vigna “concerned Internet commenter” scale?

In terms of added statistical independence from its streams, I think it's probably better than LCG-increment-based streams, but probably slightly less good than LCG-multipler-based streams. Since PCG uses LCG-increment-based streams, we can chalk that up as a win for SplitMix, although I'll confess that I don't know for sure, I'm just willing to concede the point.

But whereas there is a theory for LCG multipliers, there is no similar theory that I am aware of for Weyl sequences beyond what is in the SplitMix paper. So Vigna's concerns about reparameterization may be valid here. Is it really the case that SplitMix's code to detect and discard bad constants is sufficient?

Actually, no, it is not. Both Guy Steele and I have independently discovered cases where SplitMix's detection code fails. For example, the constant 0xaaaaaaaaaaaaaaab passes SplitMix's check for bad constants but almost instantly fails when tested:

linux$ ./splitmix.64 0x5a383856a683771 0xaaaaaaaaaaaaaaab | ./RNG_test stdin64
SplitMix64(0x05a383856a683771,0xaaaaaaaaaaaaaaab) initialized.
Using next_int64() output function
RNG_test using PractRand version 0.93
RNG = RNG_stdin64, seed = 0xf217d6bf
test set = normal, folding = standard (64 bit)

rng=RNG_stdin64, seed=0xf217d6bf
length= 128 megabytes (2^27 bytes), time= 2.2 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low16/64]BCFN(2+2,13-5,T)        R=  -6.2  p =1-1.0e-3   unusual
  ...and 147 test result(s) without anomalies

rng=RNG_stdin64, seed=0xf217d6bf
length= 256 megabytes (2^28 bytes), time= 5.7 seconds
  Test Name                         Raw       Processed     Evaluation
  BRank(12):2K(1)                   R= +51.9  p~=  1.1e-16    FAIL !
  ...and 158 test result(s) without anomalies

Alas, there doesn't seem to be an easy repair for the SplitMix's test. The slightly more carefully contrived constant 0xcec4ec4ec4ec4ec5 has an almost even spread of zeros and ones, and a structure that seems a bit regular, but it too fails PractRand at the 256 GB point (and much faster if you actually realize what I did to make this constant and how to reveal it; I'll leave that as a reader exercise for now).

So bad parameter values can have a dramatic effect on SplitMix. Vigna's concerns are absolutely valid here.

I was a bit perplexed and disturbed when Vigna created his own implementation of SplitMix and tore out and threw away the stream functionality that is one of SplitMix's key selling points; it seemed so odd to me to promulgate a weakened version of someone else's generator. But perhaps there was an argument for doing so after all.

In my implementation, I left it in, risks and all. Let SplitMix be SplitMix, I say. There's just more care required to use it than you might think. ... something something don't cross the streams.