Specific Problems with Other RNGs

There are many random number generators already out there. On this page, we will look at a some of the most popular and well known. These RNGs have many good qualities, but we will focus mostly on their flaws.

Mersenne Twister

This RNG is one of the most widely used and highly recommended RNGs. It is provided in C++11 as mt19937 and mt19937_64, and is also the default random number generator for Python. (See the Mersenne Twister Wikipedia page for more details about the points listed below.)

Positive Qualities

  • Produces 32-bit or 64-bit numbers (thus usable as source of random bits)
  • Passes most statistical tests

Neutral Qualities

  • Inordinately huge period of 219937 - 1
  • 623-dimensionally equidistributed
  • Period can be partitioned to emulate multiple streams

Negative Qualities

  • Fails some statistical tests, with as few as 45,000 numbers.
  • Predictable — after 624 outputs, we can completely predict its output.
  • Generator state occupies 2504 bytes of RAM — in contrast, an extremely usable generator with a huger-than-anyone-can-ever-use period can fit in 8 bytes of RAM.
  • Not particularly fast.
  • Not particularly space efficient. The generator uses 20000 bits to store its internal state (20032 bits on 64-bit machines), but has a period of only 219937, a factor of 263 (or 295) fewer than an ideal generator of the same size.
  • Uneven in its output; the generator can get into “bad states” that are slow to recover from.
  • Seedings that only differ slightly take a long time to diverge from each other; seeding must be done carefully to avoid bad states.
  • While jump-ahead is possible, algorithms to do so are slow to compute (i.e., require several seconds) and rarely provided by implementations.

Minstd and Unix rand

Minstd and rand (some implementations) are both examples of linear congruential generators with a non-power-of-two modulus. (See the Minstd Wikipedia page for more details.)

Minstd is provided as minstd and minstd0 in C++11, and is also currently the RNG that will be selected by LLVM's libc++ and GCC's libstdc++ if you select default_random_engine.

Positive Qualities

  • Uses a very small amount of memory
  • Efficient jump-ahead is possible

Negative Qualities

  • Short period
  • Predictable — after producing just one random number, we can completely predict its output.
  • Not especially good for producing 32-bit numbers (or a stream of random bits)
  • Only produces each number once
  • Fails many statistical tests
  • Minstd is fairly slow (whereas speed of rand varies across implementations)
    • Using a non-power-of-2 modulus require a division operation which can be slow to implement.
  • Typical implementations do not provide multiple streams, although it would be possible to do so
  • Typical implementations do not provide jump-ahead, although it would be possible to do so

Arc4Random

This RNG is provided on OS X and BSD systems as arc4random, which is based on the trade-secret RC4 algorithm—see the RC4 Wikipedia page for more details. (Recently, on OpenBSD systems, a generator based on the ChaCha20 stream cipher, discussed below, is provided under the name arc4random.)

Positive Qualities

  • Cryptographically secure (not predictable, widely used as a stream cipher).
  • Passes TestU01's empirical statistical tests (but see below)
  • Produces 32-bit numbers (thus usable a stream of random bits)

Neutral Qualities

  • Inordinately huge expected period of approximately 21699
  • Different initializations can be expected to provide different random streams

Negative Qualities

  • Statistically mediocre
    • Although the standard variant does pass TestU01's BigCrush battery, that isn't much of an achievement given 2064 bits of internal state—a simple LCG passes with 88 bits of internal state! If we reduce the number of S-boxes from 256 to 64, requiring 396 bits of internal state, it still passes, but if we reduce the number to 32, which is 170 bits of state, it fails badly.
  • Has been mathematically shown to be nonuniform.
    • In fact, even though the test is not included in TestU01's suite, tests exist that can distinguish the output of arc4random from a true random sequence.
  • Unlike some generators with a very large period, does not provide k-dimensional equidistribution.
  • Period varies based on seed.
    • In fact, in FreeBSD prior to the 7.1 release (late 2008), OS X prior to OS X Lion, and iOS prior to iOS 5, the implementation of arc4random did not take steps to avoid very short periods—more than 1 time in 75,000, the period of the generator could be tiny, producing only 16320 integers before repeating.
  • Very slow.
  • The BSD implementations (which are the only widely used ones) have several additional issues
    • No facility for a user-provided seed, preventing programs from getting reproducible results
    • Periodically “stirs” the generator using kernel-provided entropy; this code must be removed if reproducible results desired (and is removed from the C++ adaptation I created for testing purposes)
  • Has been repeatedly “fixed” to address bugs and security issues.

OpenBSD was sufficiently dissatisfied with it that they replaced it with a generator based on the ChaCha20 stream cipher.

ChaCha20

This RNG is provided on OpenBSD systems as a replacement for arc4random (and somewhat confusingly is provided under the name arc4random for backwards compatibility). It is based on Daniel J. Bernstein's ChaCha20 stream cipher, which is a variant of his Salsa20 cipher.

Positive Qualities

  • Cryptographically secure (not predictable, has been scrutinized by cryptographic community).
  • Supports different variants (e.g., ChaCha8) which try less hard to be secure, but run faster by reducing the number of rounds.
  • Passes TestU01's empirical statistical tests (but see below)
  • Produces 32-bit numbers (thus usable a stream of random bits)
  • The ChaCha20 stream cipher is claimed be about 3× faster than the Arc4 stream cipher (but see below).
  • The ChaCha20 stream cipher is seekable, allowing jump-ahead (but this feature isn't provided by the OpenBSD implementation).
  • About 70% faster than arc4random when used as a general-purpose 32-bit random number generator
    • Assuming they're are compared on an equal footing (with periodic “stirring” code removed)
    • In the case of OpenBSD, their arc4random implementation “stirs” (i.e., requests external entropy from the kernel) far more frequently than their ChaCha20 implementation, resulting in a 4.5× speed improvement on that platform.

Neutral Qualities

  • Designed for uses as a stream cipher, thereby produces blocks of random numbers. Although the block-a-time approach can allow some optimizations, it can waste space when as a typical one-at-a-time general-purpose RNG.

Negative Qualities

  • Slow, compared to general purpose generators (although it is 70% faster than arc4random, it is an order of magnitude slower than high performing general-purpose RNGs).
  • OpenBSD's implementation uses 1104 bytes to store the generator state (about 4× the space for arc4random, because it computes sixteen 64-byte blocks at a time). More positively, the C++ implementation by Orson Peters only requires 104 bytes because it only stores a single 64-byte block.
  • Has a much smaller period than might be inferred from the size of the generator state.
  • Fewer rounds result in poor statistical performance; ChaCha2 fails statistical tests badly, and ChaCha4 passes TestU01 but sophisticated mathematical analysis has shown it to exhibit some bias. ChaCha8 (and higher) are believed to be good. Nevertheless, ChaCha needs to go to more work to achieve satisfactory statistical quality than many other generators.
  • ChaCha20, being newer, has received less scrutiny from the cryptographic community than Arc4.
  • Unlike some generators with a very large period, it does not provide k-dimensional equidistribution.
  • Unlike C++ implementation by Orson Peters, OpenBSD's implementation (which is quite commonly suggested as a “good” implementation) has several additional issues:
    • No facility for a user-provided seed, preventing programs from getting reproducible results
    • Periodically “stirs” the generator using kernel-provided entropy; this code must be removed if reproducible results desired (in testing its speed, I deleted this code)
    • Provided as a single, global RNG.
    • Multithreaded programs must share the global RNG, and contend for the spin lock that protects it.

Unix drand48, Java's java.util.Random

These generators are examples of linear congruential generators with a power-of-two modulus. (See the Linear Congruential Generator Wikipedia page for more details.)

Positive Qualities

  • Fast (assuming multiplication is fast)
  • Uses a small amount of memory
  • Efficient jump-ahead is possible
  • Good for producing 32-bit numbers (or a stream of random bits)

Negative Qualities

  • Fairly short period
  • Predictable (although generating 48 bits and dropping the low 16 bits makes it very slightly harder to predict than rand)
  • Fails very many statistical tests
  • Typical implementations do not provide multiple streams, although it would be possible to do so
  • Typical implementations do not provide jump-ahead, although it would be possible to do so

Unix random

Unix random is an example of a linear-feedback shift register RNG. (See the Linear Feedback Shift Register Wikipedia page for more details.)

Positive Qualities

  • Fast
  • Good for producing 32-bit numbers (or a stream of random bits)

Negative Qualities

  • Fairly short period
  • Predictable — after producing 16 random numbers, we can completely predict its output
  • Fails very many statistical tests
  • Typical implementations do not provide jump-ahead, although it would be possible to do so (the implementation would be fairly complex, however)
  • Technically not uniform — zero will be produced once less often than every other output

XorShift (32-bit and 64-bit)

These examples of XorShift generators, kind of generalized linear feedback shift register. (See the XorShift Wikipedia page for more details.)

Positive Qualities

  • Fast (even if multiplication isn't fast)
  • Uses a small amount of memory
  • Good for producing 32-bit numbers (or a stream of random bits)

Negative Qualities

  • Fairly short period (32-bit version only)
  • Predictable. After producing just one random number, we can completely predict its output.
  • Fails many statistical tests
  • Technically not uniform — zero will be produced once less often than every other output
  • 32-bit XorShift should usually not be used to produce 32-bit numbers, because it only produces each number once, and never produces zero.
  • Does not provide multiple streams
  • Typical implementations do not provide jump-ahead, although it would be possible to do so (the implementation would be fairly complex, however)

RanQ and XorShift* 64/32

These generators use a generalization of a linear feedback shift register design, coupled with a multiplicative step for the output function. RanQ and XorShift* 64/32 are the same generator with 64 bits of state, but XorShift* 64/32 returns only 32 bits of output, whereas RanQ outputs all 64 bits. (See the XorShift Wikipedia page for more details.)

Positive Qualities

  • Fast (if multiplication is fast)
  • Uses a small amount of memory
  • Good for producing 32-bit numbers (or a stream of random bits)
  • Easily passes empirical statistical tests (XorShift* 64/32 only)

Neutral Qualities

  • Period of 264 is enough for many uses, but in some contexts, it would be considered consider “too small”

Negative Qualities

  • Technically not uniform — zero will be produced once less often than every other output
  • Predictable — after producing just one random number, we can completely predict its output (RanQ only).
  • RanQ claims to be a 64-bit generator but its low-order 32 bits fail statistical tests.
  • Does not provide multiple streams
  • Typical implementations do not provide jump-ahead, although it would be possible to do so (the implementation would be fairly complex, however)

PCG Family

This page is supposed to be about other RNGs besides the PCG family, but we may as well show how the PCG family compares.

Positive Qualities

  • Fast (if multiplication is fast)
  • Uses a small amount of memory (although the extended generators allow it to use an arbitrary amount of additional memory to extend the period)
  • Good for producing b-bit numbers for any b (or a stream of random bits)
  • Easily passes empirical statistical tests (and offers better statistical performance than any of the above generators)
  • pcg32 offers a 264 period and 263 distinct random streams, but arbitrarily large periods are possible (e.g., the library provides pcg32_k16384 which has a period of 2524352, which is vastly larger than the computational capacity of the universe)
  • Uniform
  • Extended generation scheme provides k-dimensional equidistribution for arbitrary k
  • Challenging to predict
  • Offers jump ahead and distance-between-states

Neutral Qualities

  • Can perform party tricks, like creating a generator which will produce exactly 3,141,592,653,589,793,238,463 random numbers, and then suddenly output a Zip file containing Hamlet.

Negative Qualities

  • New
  • Although it is less trivial to predict than many mainstream generators, that does not mean it should be considered crypographically secure. Exactly how hard it is to break different members of the PCG family is unknown at this time.