Ease of Use without Loss of Power

To the extent that anyone cares about C++11's random-number facility at all, the C++ community is polarized between two views—one that likes it, and one that hates it.

The two views are basically these:

  • It's amazingly elegant, a shining example of separation of concerns. Pseudo-random engines, random distributions, seeding techniques, and entropy sources are all quite different things. C++11 creates a pluggable and extensible architecture that allows for new RNGs, new custom distributions, and so forth. It's comprehensive and flexible.
  • It's horrible to use, feeling unnecessarily overengineered. To do even the simplest task requires significant boilerplate. Things that would be one easy-to-understand line in Python are three or four hard-to-follow lines in C++11. It's completely unsuitable for beginners, and even seasoned programmers hate it.

Both camps are right. But it turns out that we don't need to do very much to fix the situation and (hopefully) leave both camps feeling that they're getting what they want.

In this post, we'll develop the missing C++11 class that ties all the pieces together, making C++11 random number generators easy to use without abandoning any of their power and flexibility.

A Preview

Start by reading the following code, which demonstrates randutils::mt19937_rng, specifically the uniform, pick and variate member functions.

#include "randutils.hpp"
#include <iostream>

int main()
{
    randutils::mt19937_rng rng;

    std::cout << "Greetings from Office #" << rng.uniform(1,17)
              << " (where we think PI = "  << rng.uniform(3.1,3.2) << ")\n\n"
              << "Our office morale is "   << rng.uniform('A','D') << " grade\n"
              << "We " << rng.pick({"welcome",
                                    "look forward to synergizing with",
                                    "will resist",
                                    "are apathetic towards"})
                       << " our management overlords\n\n";

    std::cout << "On the 'business intelligence' test, we scored " 
              << rng.variate<double,std::normal_distribution>(70.0, 10.0)
              << "%\n";
}

Even though you haven't read any API docs, just by looking at this code, you can probably see exactly what is going on.

You can check to make sure it does what you think by looking at the outputs from a couple of runs (or you can download the header and try the code yourself):

Greetings from Office #3 (where we think PI = 3.18079)

Our office morale is D grade
We are apathetic towards our management overlords

On the 'business intelligence' test, we scored 74.511%

and

Greetings from Office #9 (where we think PI = 3.11824)

Our office morale is C grade
We will resist our management overlords

On the 'business intelligence' test, we scored 64.5473%

Two of the most common use cases for a random-number generator are generating a uniform value in a range, and making a random selection. With this class, which is just a very simple wrapper class around existing facilities, these tasks become easy. (If you're saying, “what about shuffling?” don't worry, we've got you covered, too. In fact, anything that's easy to do with Python's RNG class is just as easy to do here—in designing the class, I wanted random number generation in C++ to become as easy as it is in Python.)

But from the last line of code where we called variate asking for a std::normal_distribution, you can also surmise that all the power and flexibility of C++11 RNGs is available to us. We can pick any distribution we like, including ones we've written ourselves. We can similarly use different random engines, or different sources of seeds. If we want control, we've got it.

A Tour of the Facilities

The type mt19937_rng is just an alias for random_generator<std::mt19937>, a thin wrapper around the Mersenne Twister engine. We can wrap any C++11-compatible engine, including the ones from Boost and my own PCG engines. They all work. The random_generator itself does almost no work; it just hands everything off to facilities that already exist in other classes and functions. Its job is to wrap everything up to make it all easy.

Construction (and seed)

A random-generator object is a wrapper around a C++11-compatible random-number engine. Its sole state is the state of the underlying engine and so its only initialization work is to set up the engine.

If arguments are provided to the constructor, they are passed through to the underlying engine (via perfect forwarding). If, however, the first argument is convertible to a seed sequence via a base() member function (a capability indicated by providing a base_seed_seq typedef), this conversion is performed before the argument is passed to the engine.

Finally, and most usefully, if the random generator is default constructed, it creates a seeding object to give to the random-number engine. By default, it uses an auto_seed_256 object to provide nondeterministic seeds. (You can change the default via a template argument to the class.)

In other words, if we default construct the class, it begins with excellent, highly random seeding.

All the same rules apply to the arguments to the seed member function, which can be used to reseed the underlying engine.

Examples
randutils::mt19937_rng rng;

randutils::default_rng another_rng;

randutils::mt19937_rng rng_zero_seed{0u};
randutils::mt19937_rng rng_explicit_seed_seq{randutils::auto_seed_128{}};

randutils::random_generator<std::knuth_b>   horribly_bad_rng;

std::seed_seq oldstyle_seedseq{1,2,3};
randutils::mt19937_rng rng_oldstyle_seed_seq{oldstyle_seedseq};

another_rng.seed(0u);       // Fixed seed
another_rng.seed();         // Nondeterministic seed

Because of the high-quality seeding, it's even possible (though somewhat wasteful) to create and use a random generator on the fly, like,

std::cout << randutils::mt19937_rng{}.pick({"Don't do it!", "Okay, do it"})
          << "\n";

(Note that programmers who know anything about random number generation may find that this last example makes their skin crawl, because they've learned that seeding should happen once, and that creating multiple generators is usually a sign of programmer confusion. That is a good rule of thumb. Nevertheless, thanks to the awesome seeding power of auto_seeder, while creating a plethora of RNGs might be slow, clumsy and bad style, doing so won't cause statistical problems. And the results will be nondeterministically random.)

Implementation Details

Although all that really needs to happen is passing arguments through to the constructor for the underlying random engine, the code to do this has a fair amount of baggage. Evil SFINAE-based compile-time introspection is required to detect what kind of seed sequence is being passed so that we know whether to call base() or not.
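For the curious, here's an illustrative sketch (the trait name is hypothetical) of the kind of detection involved; the machinery in the real header is along these lines but more elaborate:

#include <type_traits>

// Detect whether T advertises a base seed sequence by providing a
// base_seed_seq typedef (and hence a base() member to call).
template <typename T>
class has_base_seed_seq {
    template <typename U> static std::true_type  test(typename U::base_seed_seq*);
    template <typename U> static std::false_type test(...);
public:
    static constexpr bool value = decltype(test<T>(nullptr))::value;
};

The constructor can then use std::enable_if (or tag dispatch) on has_base_seed_seq<T>::value to decide whether to call base() before handing the argument to the engine.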

uniform (member function)

For any numeric type N, with values v1 and v2, uniform(v1, v2) returns a value of type N in the range v1 to v2. For integer types, the range is closed (i.e., up to and including v2), exactly as produced by the equivalent call to uniform_int_distribution, whereas for noninteger types it produces values according to the rules for uniform_real_distribution.

Examples
int i           = rng.uniform(42,54);
unsigned int u  = rng.uniform(0u, 99u);
long l          = rng.uniform(0l, 99999999999999l);
char c          = rng.uniform('A','Z');
double e        = rng.uniform(2.717, 2.719);

int error1      = rng.uniform(0, 1.6);   // Inconsistently typed arguments
int error2      = rng.uniform(0, 17u);   // Inconsistently typed arguments
Implementation Details

uniform is really just a special case of the more general variate function. Here is the entire implementation:

template <typename Numeric>
Numeric uniform(Numeric lower, Numeric upper)
{
    return variate<Numeric,uniform_distribution>(lower, upper);
}

What's nice about uniform is that it infers the type of the output based on the type of the input arguments. variate can't do that because it works for arbitrary distributions and doesn't know what the distribution parameters mean.

(The distribution uniform_distribution is just a convenience templated typedef that uses std::uniform_int_distribution for integers and std::uniform_real_distribution otherwise.)
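A plausible definition of that typedef (a sketch; the real header also has to work around the fact that std::uniform_int_distribution doesn't officially support char) is just a compile-time choice between the two standard distributions:

#include <random>
#include <type_traits>

// Choose std::uniform_int_distribution for integral types and
// std::uniform_real_distribution for floating-point types.
template <typename Numeric>
using uniform_distribution = typename std::conditional<
    std::is_integral<Numeric>::value,
    std::uniform_int_distribution<Numeric>,
    std::uniform_real_distribution<Numeric>>::type;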

pick (member function)

For any instance, c, of a container type, pick(c) will return a reference to a randomly chosen member of the container. Containers are defined to be anything for which std::begin and std::end are defined.

Picking from an empty container is an error (with results similar to trying to dereference a past-the-end iterator).

Examples
int nums[]                  = {43, 51, 22};
std::list<std::string> list = {"Gizmo", "Hotel", "Ingot"};

++rng.pick(nums);                           // O(1) because it's an array
std::string& thing = rng.pick(list);        // O(n) because it's a list

const char* play   = rng.pick({"Rock", "Paper", "Scissors"});
bool b             = rng.pick({true, false});

void (*hello)()    = []{ std::cout << "Hello World!!\n"; };
void (*fail)()     = []{ std::cerr << "Error, Error!\n"; };

void (*mystery)()  = rng.pick({hello, fail});
mystery();
Implementation Details

The pick member function is actually just a wrapper around the choose function, described later on. Whereas choose takes and returns iterators, pick returns a reference to the chosen item. The implementation is trivial:

template <typename Range>
auto pick(Range&& range) -> decltype(*std::begin(range))
{
    return *choose(std::begin(range), std::end(range));
}

For technical reasons, we also need to provide a distinct and almost identical overload to handle being given an initializer list.
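The technical reason is that a braced list such as {"Rock", "Paper", "Scissors"} can't be deduced as the Range&& parameter above. A sketch of the extra overload (here returning by value, since the list may be a temporary):

template <typename T>
T pick(std::initializer_list<T> range)
{
    return *choose(std::begin(range), std::end(range));
}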

shuffle (member function)

For any writable random-access container c, shuffle(c) will shuffle the container. Containers are defined to be anything for which std::begin and std::end are defined.

Alternatively, shuffle can be passed two random-access iterators that specify a range to shuffle. In other words, shuffle(b,e) shuffles the range that begins at b and ends just before e.

Examples
const char* foods[]         = {"Flour", "Grain", "Honey", "Icing"};
std::vector<double> vec     = {1.23456, 3.14159, 2.71828, 1.61803};

rng.shuffle(foods);
rng.shuffle(vec);
rng.shuffle(vec.begin(), vec.begin()+2);
Implementation Details

Once again, the implementation is trivial. All the hard work is already done for us by std::shuffle. The code for both variants is simply

template <typename Iter>
void shuffle(Iter first, Iter last)
{
    std::shuffle(first, last, engine_);
}

template <typename Range>
void shuffle(Range&& range)
{
    shuffle(std::begin(range), std::end(range));
}

variate (member function)

This member function provides the full power of C++11's distribution classes. For some result type R, and class template T, where T<R> is a C++ distribution that can be initialized with parameters p1,…,pN, variate<R,T>(p1,…,pN) produces a random variate from that distribution.

If the distribution is not specified, it is assumed to be std::normal_distribution. To avoid mistakes, the output type must be specified.

Examples
// Create three random variates with mean = 0.0 and std dev = 1.0
double normal1 = rng.variate<double>();         
double normal2 = rng.variate<double>(0.0,1.0);
double normal3 = rng.variate<double,std::normal_distribution>(0.0,1.0);

float  grade   = rng.variate<float>(70.0,12.5);
int collisions = rng.variate<int,std::poisson_distribution>(1.0/8.0);

auto dd_init   = {40.0, 10.0, 10.0, 40.0};
int index      = rng.variate<int,std::discrete_distribution>(dd_init);

The last example is worthy of some explanation; std::discrete_distribution wants to be passed a std::initializer_list, but a C++ limitation (braced initializer lists can't be deduced through variate's forwarded parameters) prevents us from passing one directly, so we have to put it into a temporary variable first (or explicitly construct the initializer list). It's a little annoying, but most distributions don't take initializer lists, so it's a relatively minor issue.

C++11 provides us with lots of distributions to choose from, all of which work with variate.

Implementation Details

Again, this code is a very simple wrapper around preexisting C++11 functionality. The only thing that makes the code look a little scary is C++11's perfect-forwarding machinery.

template <typename ResultType, 
          template <typename> class DistTmpl = std::normal_distribution,
          typename... Params>
ResultType variate(Params&&... params)
{
    DistTmpl<ResultType> dist(std::forward<Params>(params)...);

    return dist(engine_);
}

One thing to note is that this code creates and tears down a distribution object every time we call variate. In typical C++ standard library implementations, these objects have no persistent state and little or nothing is lost by doing things this way. (Any redundant work should be eliminated by the optimizer anyway.)

But if you wish distribution objects could persist longer, you may like the next member function we'll examine, generate.

generate (member function)

For any writable container c containing elements of type R, and distribution class template T, where T<R> is a C++ distribution that can be initialized with parameters p1,…,pN, generate<T>(c,p1,…,pN) fills the container with random variates from that distribution. If T is not specified, a uniform distribution is used.

Containers are defined to be anything for which std::begin and std::end are defined.

Alternatively, we can pass two forward iterators that specify a range to generate instead of a container. In other words, generate<T>(b,e,p1,…,pN) fills the range that begins at b and ends just before e.

Examples
char grades[100];
std::vector<double> percentages(100);

rng.generate(grades, 'A', 'D');
rng.generate(grades,grades+10, 'A', 'B');
rng.generate<std::normal_distribution>(percentages, 80.0, 7.5);
Implementation Details

The implementation is very similar to the code for variate, except that having created a distribution object, we call it repeatedly. The target container is filled using std::generate, so this function is once again simply marshalling data into preexisting functions.
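Here's a sketch of the iterator-based variant in the same style as the snippets above (engine_ is the wrapped engine; the Range overload is omitted):

template <template <typename> class DistTmpl = uniform_distribution,
          typename Iter, typename... Params>
void generate(Iter first, Iter last, Params&&... params)
{
    using result_type = typename std::iterator_traits<Iter>::value_type;

    // Create one distribution object, then invoke it once per element.
    DistTmpl<result_type> dist(std::forward<Params>(params)...);
    std::generate(first, last, [&]{ return dist(engine_); });
}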

choose (member function)

This function behaves similarly to pick except that it returns an iterator rather than a reference. For two forward iterators, b and e, that specify a range, choose(b,e) returns an iterator pointing to a randomly chosen element within the range.

In contrast to pick, which cannot be called with an empty container, running choose(b,e) on an empty range returns b (which, in that case, is equal to e, the past-the-end position).

For a container c, choose(c) is equivalent to choose(std::begin(c), std::end(c)).

Examples
std::array<int,50> counts;
std::iota(counts.begin(), counts.end(), 0);   // Fill with 0..49

auto mid_mid   = rng.choose(counts);
auto left_mid  = rng.choose(counts.begin(), mid_mid);
auto right_mid = rng.choose(mid_mid, counts.end());
rng.shuffle(left_mid, mid_mid);
rng.generate(mid_mid,right_mid,50,100);
Implementation Details

Again, this function is simply using preexisting functionality. The size of the range is determined with std::distance, a std::uniform_int_distribution finds the index of the desired element, and std::advance moves to that element. This work is only done if the size of the range is ≥ 2 (i.e., where there is an actual choice to make).
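Putting those pieces together, a sketch of the two-iterator form might look like this:

template <typename Iter>
Iter choose(Iter first, Iter last)
{
    auto dist = std::distance(first, last);
    if (dist < 2)
        return first;    // empty or single-element range: no choice to make

    using distance_type = decltype(dist);
    std::uniform_int_distribution<distance_type> choice(0, dist - 1);
    std::advance(first, choice(engine_));
    return first;
}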

sample (member function)

This function takes a random sample of items from a container by moving a given number of them to the front, otherwise preserving order. For a container c, sample(n,c) moves n randomly chosen elements to the front and returns an iterator that marks both the end of the sample and the start of the remainder.

Alternatively, you can pass two iterators, b and e, that specify a range; sample(n,b,e) samples n elements in the same way as described above. In fact, sample(n,c) simply calls sample(n, std::begin(c), std::end(c)).

If n is greater than the size of the container, all items from the container are selected.

Examples
std::array<int,50> population;
std::iota(population.begin(), population.end(), 0);   // Fill with 0..49
auto sample_end = rng.sample(20, population);
Implementation Details

Of all the functions, sample is the only one that isn't merely a completely trivial wrapper for preexisting functionality. Nevertheless, all the heavy lifting is actually done by std::stable_partition.
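Here's a hedged sketch of how selection sampling can be driven through std::stable_partition. (Note that it relies on the predicate being applied in order, which the standard doesn't strictly promise but which implementations do in practice.)

template <typename Size, typename Iter>
Iter sample(Size to_go, Iter first, Iter last)
{
    using value_type = typename std::iterator_traits<Iter>::value_type;
    auto remaining = std::distance(first, last);

    return std::stable_partition(first, last,
        [&](const value_type&) {
            // Keep each element with probability to_go/remaining
            // (selection sampling); stable_partition then moves kept
            // elements to the front, preserving their relative order.
            bool keep = uniform(decltype(remaining){0}, remaining - 1)
                            < static_cast<decltype(remaining)>(to_go);
            --remaining;
            if (keep)
                --to_go;
            return keep;
        });
}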

I provided this function because a similar function is provided by Python.

engine (member function)

This function returns the underlying engine. Anything not provided for by this class can be achieved by using the engine directly.

Examples
rng.engine().discard(500);
auto raw_output = rng.engine()();

Versus the C++17 randint Proposal

I'd very much like to see the random_generator class template (or an enhanced iteration of it) make it into a future version of C++. That probably means someone (perhaps me) will have to write a proposal at some point, but there is another contender for making C++11 random number generation easy to use, the randint proposal, currently being considered for C++17.

The randint proposal gives the following motivation [sic]:

The std::rand friends are discouraged in C++14, so we want:

  1. A direct replacement to the std::rand friends. Despite of the security issues, std::rand is considered both handy and useful as a global uniform random number generator.

  2. To expose the most widely-used combo in C++11 <random> without pushing the users to learn the whole design. Smoothing the learning curve can usually optimize the acceptance.

randint Is Too Limited

The randint proposal is, essentially, a dumbing-down proposal. Working from the basis that the C++11 random-number facility is too hard to use, it gives a radically simplified interface offering users just one thing: uniformly distributed integers. Want a replacement for drand48 (which provides uniform doubles)? You're out of luck. Want numbers with a normal distribution? Look elsewhere.

Look back at the code that we used to begin this article, the code that printed random text like this:

Greetings from Office #1 (where we think PI = 3.14242)

Our office morale is A grade
We look forward to synergizing with our management overlords

On the 'business intelligence' test, we scored 82.8281%

It was a cute, fun, silly example. Exactly the kind of literally random silliness that can capture the imagination of a beginner and can be easily within their grasp. Could we produce this silly example with randint? Some of it, but not all of it. If it doesn't involve uniformly distributed integers, it won't be easy.

The idea behind the randint proposal seems to be that users starting out with C++ aren't going to want to do anything remotely sophisticated (even if they've previously used Python and its far richer random class). And, likewise, presumably nonbeginners should be willing to tough it out.

randint Is Not a Stepping Stone

A goal of the randint proposal is to “smooth the learning curve” so programmers don't have to “learn the whole design” of the C++ random-number facility all at once. But it actually doesn't help them learn it at all.

What do you learn about the wider facility when you write code like the code shown below?

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled " << std::randint(1, 6) << "\n";

None of the key ideas of C++ random number generation are in play.

But random_generator Is a Stepping Stone

Contrast that randint code, with the following code using the ideas from this blog post:

mt19937_rng   my_rng;

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled " << my_rng.uniform(1, 6) << "\n";

Here we encounter the idea that generators exist as distinct objects, and we can see that we're asking for a uniformly distributed number.

We can learn about seeding:

mt19937_rng   my_rng(0);    // Temporarily set to zero during testing

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled " << rng.uniform(1, 6) << "\n";

We can learn about C++ distributions:

mt19937_rng   my_rng;

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled "
              << my_rng.variate<double,std::uniform_int_distribution>(1, 6)
              << "\n";

In general, we're better poised to slowly unfurl all the pieces that make up the C++ random number generation scheme.

randint Blesses a Global Variable

A global RNG isn't in any way necessary if we have a good way to seed newly created RNGs. Like any other use of global variables, putting your RNG in a global variable should be a matter of programmer choice. Some programmers find global variables expedient and convenient, whereas others consider them anathema.

With the random_generator class template discussed in this post, it's up to you. If you want a global RNG, it's easy, declare it like any other global variable:

mt19937_rng global_rng;

or, if you prefer, a thread-local one,

thread_local mt19937_rng semiglobal_rng;

randint Isn't What C++17 Needs

The motivations for the randint proposal were good. As I said at the start of this post, many people, from beginners to professionals, find the modular structure of the C++ random-number facility unduly cumbersome. But randint is too limited and backward looking. It turns its back on the elegance we already have.

Conclusions

About a month ago, I posted a challenge to the C++ community on reddit—write C++ code equivalent to these two lines of Python:

import random;
for i in range(2,17): print "[0,{}):\t{}".format(i,random.randrange(i))

It was a hard challenge because of the difficulties of getting the seeding right. But even without that issue, it was annoying to code given all the pieces that had to be properly fitted together.

But with the random_generator wrapper we've discussed in this post, the task becomes trivial:

mt19937_rng  rng;
for (int i = 1; i < 16; ++i)
    std::cout << "[0," << i+1 << "]:\t" << rng.uniform(0,i) << "\n";

We have the same kind of ease of use that programmers in other languages take for granted, but without losing any of the power that comes with the modular design of C++'s random-number facility. Truly, the best of all worlds.

But don't just take my word for it, download randutils.hpp and play around. Start having actual fun with random number generation.

Simple Portable C++ Seed Entropy

In two recent posts, I've looked at seeding random number generators in C++11 (looking at what's wrong with std::seed_seq and developing something that avoids those flaws), but seed_seq exists as a mechanism to “mix up” entropy that you give it. You still need to get that entropy from somewhere. So where?

Sadly std::random_device Is Not Your Friend

The obvious source for external randomness we can use in seeding is std::random_device. But as I mentioned in this post,

  • It's hard to make std::random_device conform to the requirements of a seed sequence.
  • It's unspecified just how costly this “device” is to read from.
  • It may not be nondeterministic at all, rendering it unfit for our purposes.

Portable code needs to look to other sources of entropy for RNG seeding. And even if std::random_device works well, mixing in entropy from other sources can have a variety of benefits, including performance.

Other (Cheap) Sources of Entropy

If we can't rely on std::random_device, what can we use? We will survey our options, focusing on low-cost and highly portable choices (rather than, say, trying to connect to random.org on the Internet).

Entropy from the Time

The current time has a long history as a source of seed entropy. Vast numbers of C programs use

srand(time(NULL));

to initialize the C RNG. In C++11 we can do better than the venerable C time function though, because we have std::chrono::high_resolution_clock. In fact, if we use the LLVM or GNU C++ libraries, we find that the (claimed) resolution of this clock is one nanosecond. Nanosecond resolution means that no calls that occur one after the other will ever see the same value, and makes it extremely unlikely that calls in programs executing at (almost) the same moment will observe the same value. With the low-order 32 bits cycling through \(2^{32}\) values every 4.3 seconds, we can easily claim that a high-resolution clock can give a typical program 32 bits of entropy.

Seeding with Just the Clock

Let's see what happens if we seed with just a nanosecond resolution timer. First, let's try std::seed_seq and see if there seems to be any pattern to its output:

uint32_t seedseq_random_using_clock()
{
    uint64_t seed = std::chrono::high_resolution_clock::
                                        now().time_since_epoch().count();
    std::seed_seq seeder{uint32_t(seed), uint32_t(seed >> 32)};
    uint32_t out;
    seeder.generate(&out, &out+1);
    return out;
}

If we test this generator with TestU01's SmallCrush battery, we get these rather disappointing results:

========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        std::seedseq (using std::chrono::high_resolution_clock)
 Number of statistics:  15
 Total CPU time:   00:01:15.25
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  BirthdaySpacings                 eps  
  2  Collision                        eps  
  3  Gap                              eps  
  4  SimpPoker                        eps  
  5  CouponCollector                  eps  
  6  MaxOft AD                      1 - 2.1e-11
  7  WeightDistrib                    eps  
 ----------------------------------------------
 All other tests were passed

In other words, if we use std::seed_seq to seed two RNGs at nearby times, there will be a discernible pattern in their seeds.

Now let's try randutils::seed_seq_fe128 instead. Not only does it pass SmallCrush, it passes Crush and BigCrush as well:

========= Summary results of BigCrush =========

 Version:          TestU01 1.2.3
 Generator:        randutils::seed_seq_fe128 (using std::chrono::high_resolution_clock)
 Number of statistics:  160
 Total CPU time:   12:02:01.31

 All tests were passed

Suitably scrambled and sampled at high enough resolution, the time alone is good enough to provide statistically good seeds. But before we celebrate, we need to look a little closer.

Troubles with the Time

There are some issues with using C++11's high_resolution_clock. One is that libraries lie. The GNU libstdc++ library claims that its high-resolution clock ticks every nanosecond, but on some platforms, including OS X, the underlying implementation uses a POSIX gettimeofday system call which only has microsecond resolution (even though better alternatives exist).

And even if your C++ library tells the truth, there is no requirement that the high-resolution clock be especially high resolution. You could be using a C++ library where the clock only ticks once a second (or even once a year!).

So std::chrono::high_resolution_clock might be a good (if limited) source of entropy (perhaps even able to do all the heavy lifting by itself), but it might be merely mediocre. It can't be our only source.

Entropy from the Memory Configuration

In C and C++ we can find out where in memory our variables and functions are stored. Thus, we can write programs like this one, which prints out where things are in memory:

#include <iostream>
#include <iomanip>
#include <cstddef>
#include <cstdlib>

static void lf()
{
    // Do nothing
}

static int zsv;
static int isv = 1;

int main()
{
    int lv;
    void* mem = malloc(1);
    free(mem);
    std::cout << std::hex;
    std::cout << "Local Variable             " << (void*)&lv    << std::endl;
    std::cout << "Heap Memory                " << mem           << std::endl;
    std::cout << "Static Variable (Zeroed)   " << (void*)&zsv   << std::endl;
    std::cout << "Static Variable (Inited)   " << (void*)&isv   << std::endl;
    std::cout << "Local Function             " << (void*)&lf    << std::endl;
    std::cout << "Library Function           " << (void*)&_Exit << std::endl;
}

If we run this program on OS X, we see the following results for three different runs:

Kind                       Run 1            Run 2            Run 3
Local Variable             0x7fff5ea636f4   0x7fff5dd726f4   0x7fff50caf6f4
Heap Memory                0x7fa7d1c04bb0   0x7f8eb3404bb0   0x7ffb6ac04bb0
Static Variable (Zeroed)   0x10119e0cc      0x101e8f0cc      0x10ef520cc
Static Variable (Inited)   0x10119e0c8      0x101e8f0c8      0x10ef520c8
Local Function             0x10119d8a0      0x101e8e8a0      0x10ef518a0
Library Function           0x7fff94c4256a   0x7fff94c4256a   0x7fff94c4256a

Notice that almost everything is in a different location each run. A few years ago, the results would have been different—the memory addresses of everything would be the same every time—but today's operating systems improve security with address-space–layout randomization (ASLR). Not everything changes—we can see that in all cases the high-order bits and the low-order bits often stay the same. In addition, the address of the library function stayed constant, but even that would change if we rebooted (or if we set the environment variable DYLD_SHARED_REGION=avoid).

On Linux, by default, the memory map isn't quite as random:

Kind                       Run 1            Run 2            Run 3
Local Variable             0x7fffa0b322cc   0x7fff8cff3bdc   0x7fffc8e07e4c
Heap Memory                0x2320c20        0x7aac20         0x249cc20
Static Variable (Zeroed)   0x6021b8         0x6021b8         0x6021b8
Static Variable (Inited)   0x602098         0x602098         0x602098
Local Function             0x400db0         0x400db0         0x400db0
Library Function           0x400950         0x400950         0x400950

In these Linux tests, the stack and the heap have moved around but all the code has stayed in the exact same place. Linux can move libraries and application code around as well, but does not do so by default, choosing to do so only for applications where security is considered to be important. The claimed reason for limiting ASLR and keeping code at fixed addresses is efficiency—position-independent executables purportedly add a small amount of additional overhead. Although both OS X and OpenBSD consider the overhead acceptable for the benefit of making attacks difficult, mainstream Linux distributions apparently do not.

But we can override Linux's defaults and compile position-independent code by using the compiler arguments -pie -fpie:

Kind                       Run 1            Run 2            Run 3
Local Variable             0x7ffd6d03e18c   0x7ffec84f456c   0x7fff745b05bc
Heap Memory                0x7f47ef07ac20   0x7f89b0f24c20   0x7fbddf6e4c20
Static Variable (Zeroed)   0x7f47edf2f09c   0x7f89b002009c   0x7fbddf50f09c
Static Variable (Inited)   0x7f47edf2f090   0x7f89b0020090   0x7fbddf50f090
Local Function             0x7f47edd2df70   0x7f89afe1ef70   0x7fbddf30df70
Library Function           0x7f47ecf6b2d0   0x7f89af05c2d0   0x7fbdde54b2d0

Here we can see that everything is moving around, but it doesn't actually seem like the total entropy has changed, since the heap, static data, and functions all seem to be similarly located.

(Looking at startup performance with LD_DEBUG=statistics [and DYLD_PRINT_STATISTICS on OS X], I could find no appreciable performance difference between the position-independent and regular versions, which makes me think that OS X and OpenBSD have it right, and most Linux distributions have it wrong.)

Of course, it's possible that our code won't run on a system that has ASLR. But even on those systems, what is on the stack and on the heap may vary depending on exactly what has previously happened during the program's execution. And different threads will have different stacks, so it provides a mechanism to give different threads different entropy even if other parts of the entropy stay the same (e.g., checking the time at the exact same moment).

Thus the memory configuration will provide some entropy on all systems, and on systems with ASLR, we can hope it's providing about 32 bits worth (even if some of that is baked in at compile time rather than runtime).
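As a concrete illustration (the helper name is hypothetical), folding a pointer into a 32-bit seed word can be as simple as:

#include <cstdint>

// Fold a pointer's bits into a 32-bit word.  The double shift avoids
// undefined behavior on platforms where uintptr_t is only 32 bits wide.
static std::uint32_t crushto32(const void* ptr)
{
    auto value = reinterpret_cast<std::uintptr_t>(ptr);
    return static_cast<std::uint32_t>(value ^ (value >> 16 >> 16));
}

// Usage sketch:
//     int local;
//     void* heap = malloc(1);
//     std::uint32_t words[] = { crushto32(&local), crushto32(heap) };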

Entropy from the CPU

Many CPUs have some kind of cycle counter. Although C++ provides no portable way to read such a counter, there are compiler-specific mechanisms.

  • Clang provides the intrinsic function __builtin_readcyclecounter as a platform-independent way to access such a counter. (Although on platforms that do not provide an accessible cycle counter, such as ARM, it always returns zero.)
  • Intel provides a header file <immintrin.h> that declares several useful X86-specific intrinsics:
    • __rdtsc reads the timestamp counter (essentially equivalent to __builtin_readcyclecounter).
    • _rdrand64_step provides a 64-bit cryptographically secure random number. Unfortunately, this facility only exists in Ivy Bridge (and later) Intel CPUs. Also some (possibly rather paranoid) people have expressed concern that this facility may have been somehow subverted.
    • _rdseed64_step provides a 64-bit “true” random number for use in seeding (generated from thermal noise). Unfortunately, this facility only exists in Broadwell (and later) Intel CPUs.

On the one hand, some of these sources seem to be truly excellent. On the other, the best ones are the least likely to be available. Even if we're only developing for Intel machines, we can't necessarily assume our hardware has Intel's randomness instructions (Sandy Bridge machines, such as the previous-generation Mac Pro, are still commonplace).

For portable code, perhaps we can count on 32 bits of entropy from a cycle counter, but not much else. But, like nanosecond-resolution time, the Intel cycle counter alone is sufficient to pass TestU01's BigCrush battery provided we scramble the bits sufficiently (e.g., with seed_seq_fe128).
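A sketch of reading such a counter portably-ish (headers and intrinsics vary by compiler, so treat this as illustrative):

#include <chrono>
#include <cstdint>
#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>   // declares __rdtsc
#endif

#ifndef __has_builtin
#define __has_builtin(x) 0   // for compilers without Clang's feature test
#endif

inline std::uint64_t cycle_count()
{
#if __has_builtin(__builtin_readcyclecounter)
    return __builtin_readcyclecounter();    // Clang's portable intrinsic
#elif defined(__i386__) || defined(__x86_64__)
    return __rdtsc();                       // x86 timestamp counter
#else
    // No accessible counter: fall back to the high-resolution clock.
    return static_cast<std::uint64_t>(
        std::chrono::high_resolution_clock::now()
            .time_since_epoch().count());
#endif
}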

Entropy from the Operating System

Although std::random_device seems like the obvious way to get entropy from the operating system, there are some other traditional options. One of the classics for seeding RNGs on a Unix system is to use getpid() to get the process id (PID), a small integer which should vary from run to run. Although getpid is a Unix-ism, Microsoft Windows provides a similar call.

C++11 does, however, provide something remarkably similar for portable code—the thread id—which we can obtain via std::this_thread::get_id(). Unfortunately, for some compilers/platforms, its value doesn't seem to vary from run to run as much as the PID does.
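std::thread::id isn't an integer, but C++11 guarantees it can be hashed, which gives us a portable way to turn it into seed data:

#include <cstdint>
#include <functional>
#include <thread>

// Hash the current thread's id down to a 32-bit seed word.
std::uint32_t thread_entropy()
{
    std::hash<std::thread::id> hasher;
    return static_cast<std::uint32_t>(hasher(std::this_thread::get_id()));
}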

Entropy from Local State

Some of our entropy sources change every time we run the program, but remain the same for the duration of the run. Thus it's useful to also have an entropy source that explicitly changes every time we examine it. A simple counter can do that task.
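For example (a minimal sketch), an atomic counter stays safe even when several threads seed generators at once:

#include <atomic>
#include <cstdint>

// Returns a different value on every call, process-wide.
std::uint32_t counter_entropy()
{
    static std::atomic<std::uint32_t> counter{0};
    return counter.fetch_add(1, std::memory_order_relaxed);
}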

Design Goals & Resulting Class

Ordinary programmers shouldn't have to think about all these considerations. The whole point of libraries should be to shield everyday programmers from worrying about unnecessary details. Ideally, std::random_device would have been designed to serve as an entropy source (and work as a Seed Sequence), but since that's not the case, we can at least provide the missing functionality.

My way of wrapping up the process of gathering entropy for RNG seeding is to provide a class template auto_seeded that can initialize any Seed Sequence using a combination of the entropy sources I've mentioned in this blog post. For some class S, auto_seeded<S> derives from S but replaces the default constructor with a constructor that initializes S with some appropriately chosen entropy.

You can find the code in randutils.hpp, which includes the code from this post and others in this series.

Thus, to create a Mersenne Twister engine with an arbitrary seed that varies from run to run, we can write

randutils::auto_seeded<std::seed_seq> seeds;
std::mt19937 rng{seeds};

As we've previously learned, std::seed_seq has some issues, so we might want to prefer the alternative, seed_seq_fe, that we developed in the previous blog post. Thus the code also provides auto_seed_128 and auto_seed_256 as handy aliases for auto_seeded<seed_seq_fe128> and auto_seeded<seed_seq_fe256>, allowing us to write:

randutils::auto_seed_128 seeds;
std::mt19937 rng{seeds};

Getting Technical

You might hope that we could compress the two lines of our previous example into a single line, and write it as

std::mt19937 rng{randutils::auto_seed_128{}};

Unfortunately, trying to use a temporary object here doesn't work, due to another minor glitch in the C++11 standard. If we try it, we'll get this error:

error: no matching constructor for initialization of 'std::mt19937'
    std::mt19937 rng{randutils::auto_seed_128{}};
                 ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/v1/random:2113:18: note: 
      candidate constructor [with _Sseq = randutils::auto_seed_128] not viable:
      expects an l-value for 1st argument

The glitch in the standard is that Seed Sequences are passed by l-value reference, and you can't pass a temporary object as an l-value reference (or at least you can't without doing some extra work). The C++ committee should have made the seed parameter a universal reference and then we'd have been all good.

But there is actually a bigger issue than this one. The two lines of code I showed you are (technically) wrong. Once again, C++11's strange requirements of Seed Sequences bite us. For any legitimate Seed Sequence S, auto_seeded<S> is technically not a Seed Sequence, because the requirements state that the default constructor for a Seed Sequence

Creates a seed sequence with the same initial state as all other default-constructed seed sequences [of that type]

In other words, all default constructed seed sequences must have the same state! The whole point of our default constructor was to create seed sequences with (nondeterministically) different states.

Thankfully, there is a solution—we can just cast the object back to its base class, which does have a dumb always-the-same default constructor. I thus provide a base method to do just that. And, as a bonus, we get something nice for free: the ability to do all the initialization on one line using a temporary object (because the base method does return an l-value reference!).

Thus the code actually becomes

std::mt19937 rng{randutils::auto_seed_128{}.base()};

Not the most beautiful thing, perhaps, but it is technically correct.

Testing and Performance

Here's a quick test program to show off the resulting class:

#include "randutils.hpp"

#include <cstdio>
#include <cstdint>
#include <cstddef>

int main()
{
    uint32_t seeds[8];

    for (int i = 0; i < 16; ++i) {
        randutils::auto_seed_256 seeder;
        seeder.generate(seeds, seeds+8);
        bool first = true;
        for (uint32_t seed : seeds) {
            printf("%s%08x", first ? "" : ", ", seed);
            first = false;
        }
        printf("\n");
    }
}

Here's one run:

670e5a45, 9ac9ea9c, 88853620, c4c836f9, 07ab5640, b20b313e, 7b945051, 37f50e84
51cb6dd9, b2e34e8a, 8cc4f78e, 6530b128, 1a5eab3c, 0adae353, e3c1587b, dec6000e
45f869e9, 9c802b0a, db790af6, 6ced810c, e8a2a513, 09e6de7c, 87a55cea, 3b67b4a6
7d171d66, e53dd7e9, cc09c87e, 32672c79, 87f427a5, 079b721f, 7e844793, 4729f207
a0db8507, 4ed656d4, 4bc9cc88, bd723730, bb11303d, 979ac968, 34bdc61c, 389d75af
b4840033, e7176d4a, c0a01431, 52e58522, 2ff42863, b4bde340, 34177446, 56b100b6
fc555da1, 56888d72, 2a9c5000, a5a1a01a, 07fa8cfe, a949b563, 1c31c4bd, 37d45aef
4aad1d58, 5cfae50e, e37ecc55, efe03ba6, 6dc0f415, 93fa0f9e, 84c9fa62, b340c696
6767ca87, b0e54c11, 89045654, b889850a, 9aabe2bb, 45628c06, 46d666d6, 0eb4ad37
3273cef7, d0469f31, 759e4dd3, 6fecd66b, b7d350a7, b758b04f, 230c3361, 83cc1a0b
eb0a4104, b99d0441, c69d8737, 40fd935a, 695f7321, 0f71b9bd, 7b05e99f, 02e60d01
0ee38480, 5649ed46, b770579c, c627516b, 6510e5f3, 11df41f0, 5d287c54, 771f7a23
af81b6bb, e437c6df, f062227e, 165653f2, 4be238a8, 8f19f0c1, 001aa816, 2fef1cfb
e088cb5f, 126fb9b6, 4e43d64e, 9cd96242, e0dc26ab, a6b95e20, c989ea24, 9f414b5a
0fdd89ce, 4bea7340, 28a2f1b3, 20a92dec, 5729581a, d20a9670, c0deea0f, 6f0120c4
53f35b6d, 96031a00, 8ac98dc3, 96942558, 0376dd98, aa2e418d, b596fe6c, be355589

and here's another:

cad830e2, 7a572603, 67153026, 15e2b642, 9cbe2af4, d0f7acf3, 11fda8ee, b8cde0bd
9e5b8acd, 1b097e56, cbb3630c, 51b11bd5, 41738595, 4cc7fc18, 125bcf04, 932570f1
fd8854f5, 5c66ebbc, 807b3b14, e747bbf0, 0698eba5, 4875823f, 25c40a6b, 74238874
6fd21e7c, 728f027a, 3c8cc9aa, bf89f104, 5388e256, 7e5cc849, fd3718e1, 3c72215f
a18ec556, 0b523c1e, 6570c3e0, 7cbf8f70, 20f390db, f8d52009, 6851ed91, 55e5d56a
b6460b40, 54659aa8, 8a0ecf2b, fb5cffdc, 49143b62, 22c6c0cb, bb9e47db, 2fc20892
a6df5533, ae12f233, da955e9d, 0abc43df, f3e416f8, 6de0f041, 3d73b1ab, 2b6d860e
0466466c, f5284293, 9d642904, 7ef5f2a9, a4afded5, 0d098b32, 9f8669b6, 51b9c355
7f3222ef, 742f80fc, b3182b2c, 8d6b5fba, 4947df9a, f7ad0a7e, 93e27446, 89d9dfc8
c7cf0252, 755b253a, d8e2c647, 43c7272c, dafc95e1, dd65d784, ca1101d4, 74bdb176
eb37b834, 71509f10, ca168b17, 45855a2d, 4a2c9d6f, 29bece0e, 0d92f673, bc927e6a
daac0c7f, 851aa6b9, 84903a8b, df36a790, 92d87bab, 6096da95, bc79a802, 9b01dd37
5ad0b74e, f75188b6, e8ced8a5, 140c31d8, 3b937a7d, 086fb548, b20a7261, b4e31307
4a20ba89, 58827d51, e8d0fbd9, d052ecb4, e818f611, d97adda9, 33cbe5a2, 863cbc4a
39cb4142, 438fd23d, 7e2aa6bf, 58fdc4c1, 37f3e4da, ad449f06, 5aa3cc36, 3cf6a63a
ec75e458, cda85e15, d60eba81, 9b34dc4e, 6e39d9b4, aa742c69, 311f1950, 027bf991

Of course, as satisfying as it is to see some random output, we can't tell how good this output is just by looking, but, as we did earlier, we can abuse our seed sequence to act as a random number generator. Since we could pass TestU01 using only the high-resolution time, we should expect it to pass TestU01's test batteries. It does. Here are the results from running SmallCrush and BigCrush.

========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        auto_seed_128
 Number of statistics:  15
 Total CPU time:   00:01:09.76

 All tests were passed

========= Summary results of BigCrush =========

 Version:          TestU01 1.2.3
 Generator:        auto_seed_128
 Number of statistics:  160
 Total CPU time:   27:34:38.55

 All tests were passed

Although its statistical performance is excellent, the CPU time shown above doesn't look stellar compared to that of actual random number generators. So let's look more closely at time performance. Here I'll use cycle counts rather than seconds (and take the median of 101 samples), and consider four possible ways to provide our seed data,

  1. randutils::auto_seed_256 seeder;
  2. randutils::auto_seed_128 seeder;
  3. randutils::seed_seq_fe128 seeder{rdev(),rdev()};
  4. std::seed_seq seeder{rdev(),rdev()};

where rdev in the last two is an instance of std::random_device.

                              auto_seed_128   auto_seed_256   seed_seq_fe128   std::seed_seq
First Call (LLVM 3.6)         114580          115232          20900            113796
First Call (GCC 4.9)          31724           31153           25424            32132
Subsequent Calls (LLVM 3.6)   440             704             8316             9928
Subsequent Calls (GCC 4.9)    552             808             1252             1516

As I mentioned in a previous post, the first-call performance numbers can be hard to make sense of. There's a lot going on, such as page faults bringing library code into memory and so forth. For LLVM, much of the overhead somehow relates to the libc++ C++ library—if we use Clang with the GNU C++ library (not shown above), it actually outperforms the GCC version. Although we could spend more time investigating, it's worth remembering that a one-time cost of 115,000 cycles on my test machine works out to be a mere 34 µs.

Looking beyond the strange world of first-call performance, the auto_seed_128 and auto_seed_256 classes outperform everything else, despite the number of entropy sources they consult (and all the mixing done by seed_seq_fe, which they're built on). In fact, auto_seed_128 can produce eight words of output in the time it takes to read a single word from GCC's rdrand-based std::random_device.

(These kinds of benchmarks also show that between GCC and LLVM, there's no clear performance winner. Each has scenarios where it shines.)

Conclusions

With auto_seed_128 and auto_seed_256 we can generate high-quality nondeterministic seed entropy, and do so quickly. It works around various questionable decisions in the C++11 specification, from the possibility that std::random_device isn't fit for its purpose, to the strange requirements of C++11 Seed Sequences, to the ungainly ways they must be passed to C++11 RNGs.

Of course, it's still a bit annoying that someone who wants to generate a random letter grade 'A' through 'D' needs to write

std::mt19937 rng_engine{randutils::auto_seed_128{}.base()};
std::uniform_int_distribution<char> dist{'A','D'};
char grade = dist(rng_engine);

That's a lot of visible machinery. But finally we have the pieces in place to fix that, and that'll be the topic for my next post.

Everything You Never Wanted to Know about C++'s random_device

Continuing my series of blog posts about seeding random number generators in C++, it's time to take a look at std::random_device. In this post, we'll see the many ways in which it isn't really your friend.

It Isn't a Seed Sequence

C++11 random number generators can be initialized either with a single integer or with an object that matches the C++11 Seed Sequence concept. In an ideal world, std::random_device would be usable as a Seed Sequence, allowing us to write

std::mt19937 engine{std::random_device{}};

C++11 and C++14 don't allow that. It would certainly be nice if that changes in C++17 but we'll have to see—thus far, I don't think many people are demanding it. Actually, you pretty much can write this code today, but only if you're using Boost, where you'd say

boost::random::mt19937 engine{boost::random::random_device{}};

Some people think they're saying this when they say:

std::mt19937 engine{std::random_device{}()};

But those two extra parentheses make a huge difference. Rather than passing an object representing the random device, in this code we're passing a single integer taken from the random device, with all the problems that entails.

You might think that we could write a wrapper class that turns the random device into a Seed-Sequence–compatible object. Alas, thanks to the strange rules Seed Sequences are supposed to follow, that's quite a challenge.

C++11 Seed Sequences Are Strange

Seed sequences are required to have a param member function defined as follows. For some Seed Sequence S,

r.param(ob) — copies to the given destination [i.e., ob] a sequence of 32-bit units that can be provided to the constructor of a second object of type S, and that would reproduce in that second object a state indistinguishable from the state of the first object.

In other words, even though Seed Sequences aren't required to have a copy constructor, we must be able to create an “indistinguishable” clone of them.

Obviously, you can't create an “indistinguishable clone” of a std::random_device object—the moment two objects start giving different outputs, they aren't indistinguishable, and if the outputs are nondeterministic, that'll happen immediately.

In general, the Seed Sequence requirements are written to make it clear that a seed sequence represents a fixed set of seed data, not a (nondeterministic) source of seed data. If it were otherwise, all the verbiage about indistinguishable objects wouldn't be there.

A Seed-Sequence Wrapper Seems Possible

Even though std::random_device lacks the features needed for a Seed Sequence, perhaps there is some way of providing a wrapper that makes it Seed-Sequence compatible.

Here are some options:

  • Not providing the dreaded size and param member functions at all, just the generate function. In other words, failing to meet the Seed-Sequence requirements but hoping that no one will actually notice. This approach happens to work in several implementations (because these functions are utterly unnecessary for initializing a typical RNG), but, technically, this choice renders our code nonportable. (A sketch of this option appears after this list.)
  • Providing size and param but making them impractical to use. This is a delightfully legalistic option. The standard makes no claim that the size of the state should have any correspondence to the amount of initial seed data originally provided, so we can claim that the size of our internal state is so vast that no one could ever allocate enough memory to copy it. You can clone our object, we say, provided you do something impossible first.
  • Create a wrapper that lives a double life. So long as param is never called, the object behaves nondeterministically using the random device, but the moment we call param in an attempt to clone it, we “switch modes” to use something entirely deterministic, like std::seed_seq. One small challenge to this approach is that param must work on const objects, but our mode switch requires us to change the state of the object.

There is one added complication, however. The standard also says that default-constructed Seed-Sequence objects should also be indistinguishable. Ugh! I will show how to cope with that requirement too, but not in this post.
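To make the first option concrete, here's a minimal sketch (the class name is hypothetical); it works with today's library implementations precisely because engines only ever call generate:

#include <algorithm>
#include <functional>
#include <random>

// Deliberately omits size() and param(), so it is not technically a
// Seed Sequence -- option one from the list above.
class random_device_seeder {
public:
    template <typename Iter>
    void generate(Iter begin, Iter end)
    {
        std::generate(begin, end, std::ref(rdev_));
    }

private:
    std::random_device rdev_;
};

// Usage: the seeder must be an l-value, because engines take seed
// sequences by l-value reference:
//     random_device_seeder seeder;
//     std::mt19937 engine{seeder};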

It Only Produces Outputs One At a Time

Despite their flaws, Seed-Sequence classes do have one convenient feature, their generate function, which can be asked to produce several 32-bit integers at once. In contrast, std::random_device can only generate one integer at a time.

Thus asking for 64 integers may require std::random_device to make 64 operating system calls (which tend to be expensive).

It Has Unknown Cost

How costly is it to read a number from this “device”? That is unspecified. It could, for example, be reading from /dev/random on a Linux system, which can block for an extended period waiting for entropy (which is itself problematic for a variety of reasons).

In fact, when using GCC on Linux, std::random_device will try to use Intel's rdrand instruction. When using Clang with libc++, however, the program instead uses /dev/urandom. The table below shows cycle counts for each option:

                   /dev/urandom   rdrand
First Call         16936          24595
Subsequent Calls   4972           544

On a modern system, the first call to any function takes longer than subsequent calls. Pages need to be faulted into memory, one-time setup costs paid, and so forth. At one level, it's interesting to note that the rdrand code takes longer to get started than the /dev/urandom code—possibly this difference is related to the rdrand code triggering 37 more page faults than the /dev/urandom code. But at another level, we don't know nearly enough about what's happening during this first call and how it might vary from program to program to draw any useful conclusions. Stepping back, it's worthwhile to realize that for a 3.4 GHz CPU, a one-time cost of 24,595 cycles amounts to 7.2 µs—seven millionths of a second. We could investigate more deeply where this extra time is spent, but it's not really worth the bother—if you spend just one second thinking about it (and you've spent more than that already), that's more time than it'll ever waste for you.

For subsequent calls, we're on a firmer footing for drawing conclusions. We can see that rdrand is much cheaper to call, almost an order of magnitude faster, which shouldn't be a huge surprise. But for the typical use cases for std::random_device (one-time seeding) the difference between 0.16 µs and 1.46 µs is negligible.

This is positive news. On the platforms most people care about, std::random_device is going to be fast (and high quality, too). We just can't depend on that.

It Might Actually Be Deterministic

We have saved the worst problem for last. C++11's std::random_device is not required to be nondeterministic! Implementations can and do implement it as a simple RNG with a fixed seed, so it produces the same output for every run of the program.

If you read the C++ specification, you might briefly think there is hope of detecting such horrible implementations, because std::random_device provides an entropy member function that must return zero when the generator is completely deterministic. But popular libraries (both GCC's libstdc++ and LLVM's libc++) always return zero, even when they're using high-quality external randomness.
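You can see this for yourself with a couple of lines:

#include <iostream>
#include <random>

int main()
{
    std::random_device rdev;
    // Prints 0 under both libstdc++ and libc++, even when the device is
    // backed by high-quality OS randomness.
    std::cout << "claimed entropy: " << rdev.entropy() << "\n";
}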

These aren't just theoretical issues, as this stackoverflow post reveals.

Conclusion

So, there's good news and bad news. On popular platforms std::random_device works well, and the awkward impedance mismatch between std::random_device and Seed Sequences is something we can work around (even if I haven't given all the details yet).

The worst news is that in portable C++ code, std::random_device may be unfit for its intended purpose. With some platform-testing magic, we can discover known-good platforms, but the possibility of broken platforms still exists (and the most off-beat platforms are the ones most likely to be problematic).

In my opinion, the C++ committee should never have allowed std::random_device to be deterministic—basically that's lying. It would have been far better to have platforms that cannot provide such a device not provide this class at all.

In my next post, I'll discuss how we can augment std::random_device with other sources of entropy.

Developing a seed_seq Alternative

In my previous post, I talked about issues with C++11's std::seed_seq. As a reminder, the goal of std::seed_seq is to generate seed data that can be used to initialize a random number generator. And to allow it to do its job, it needs seed data. Seed data in, seed data out. It might seem like a “do nothing” function would be up to the task, but in many situations the seed data you can provide is of variable quality. Some of it is “good” entropy, which changes often, and some of it is “lower quality” entropy, which is more predictable or changes less frequently (e.g., the machine id). It may also be that the data that changes often (e.g., the time) changes more in the low-order bits than the high-order bits. So the goal of std::seed_seq is to mix together the seed data that we provide so that the high-quality and low-quality data is well mixed.

In essence, the task that std::seed_seq does (and thus anything intending to replace it should do) is compute a hash (i.e., a scrambled version) of its input data. The design of hash functions is often (rightly) considered to be something of a black art, but since the same can be said of designing random number generators and I've already done that, I might as well tackle this problem, too.

Design Space

If we're going to develop an alternative to std::seed_seq that fixes some of its deficiencies, we need to understand what some of those issues are. Specifically, std::seed_seq

  1. Is never a bijection if it's generating more than one output—when the amount of output data is equal to (or less than) the amount of input data, it generates some outputs more often than others, and may not generate some outputs at all.
  2. May not be computing a “good” hash (something we'll elaborate on shortly).
  3. Stores all the input seed data verbatim. Because you are allowed to give it an arbitrary amount of seed data, it needs to dynamically allocate memory to store it.

We'll address each of these issues with some design constraints.

Finite Seed Entropy

To address the third point first, we'll adopt a strategy where our std::seed_seq alternative is a fixed (but configurable) size. Usually programmers know either how much entropy they're supposed to provide, or how much they have readily available, so it makes sense to make our entropy store a fixed size. Thus we'll provide two ready-made fixed size types for common use cases,

  • seed_seq_fe128 (synonym for seed_seq_fe<4, uint32_t>)
  • seed_seq_fe256 (synonym for seed_seq_fe<8, uint32_t>)

representing four words or eight words of stored ready-mixed seed data. The fe stands for fixed entropy.

Despite the fixed-sized entropy store, we can initialize with less entropy (although that's somewhat foolish—if we have less entropy we could use a smaller entropy store), or more entropy (in which case all the provided entropy is mixed together before being stored in the fixed size container). This latter property is useful if we wish to fill a small entropy store from several low-quality sources of entropy.

Maximal Strict Avalanche

In the parlance of cryptography, good hash functions have good diffusion properties. One property that follows from good diffusion (and one that's often mentioned in the context of hashing) is good avalanche properties—changing one bit in one of the seed inputs affects all the bits of the seed outputs. When flipping a single input bit flips each bit of the output with 50% probability, it is known as strict avalanche.

Strict avalanche refers to the effect of flipping a single bit, but we can generalize the property to multiple bits (“higher-order avalanche”). One way to express this property is to say that if you pick an arbitrary b bits of the input and iterate them through all possible values (while keeping the other bits fixed), any sample of b bits from the output will appear to come from a uniform distribution. Strict (higher-order) avalanche for all choices of b is known as maximal strict avalanche. This property is desirable because we don't know exactly where the “good” entropy might be—it could be all at one end of the input (or scattered throughout the low-order bits)—we want all of it to flow everywhere.

At a glance, std::seed_seq does appear to have good avalanche properties, but both from looking at the algorithm and looking at the distributions its outputs form, it appears to me that it does not satisfy maximal strict avalanche.

Bias Freedom (i.e., Bijection or K-to-1)

Finally, we'd like to address the first point on our list of std::seed_seq issues (and the one we discussed last time)—we want the output to be unbiased to the maximal extent possible.

Specifically, if the entropy store is \(N\) 32-bit words in size, and initialized with \(I\) input words, and producing \(S\) words of output, if

  • \(I = N \wedge S = N\), the function should be \(1\)-to-\(1\), a bijection
  • \(I \ge N \wedge S \le N\), the function should be \(2^{32 (I-S)}\)-to-\(1\)—in other words, across all possible inputs, every possible output repeats the same number of times
  • \(I < S\), some outputs cannot be produced (by definition, because there are fewer inputs than outputs), but if
    • \(I \le N\), no outputs should ever be repeated

Basically, we're just stating “be as much of a bijection as you can”.
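
For example (a worked check, not from the original spec): with a four-word store (\(N = 4\)) initialized from six input words (\(I = 6\)) and asked for four output words (\(S = 4\)), there are \(2^{192}\) possible inputs and \(2^{128}\) possible outputs, so the function should be exactly \(2^{32(6-4)} = 2^{64}\)-to-\(1\); every possible output occurs the same number of times.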

Output Params

One final (somewhat dubious) requirement of the C++11 Seed Sequence concept is that you should be able to clone a seed sequence—not the normal way, via a copy constructor (no, that'd be too easy!), but via a param function which must produce a collection of 32-bit integers which, if passed to the class's constructor as input seeds, will produce a copy that behaves identically to the original.

In other words, all the entropy mixing we do must be undoable to allow us to recover the seeds we were originally given (or if we gave more seeds than we really needed originally, a functionally equivalent set of seeds).

We can do this—and in some ways it's neat to do it just to show that we can do the unmixing—it's just annoying to be forced to provide this never-actually-used feature.

Basic Design

So, with those design requirements, we can sketch out the basic design. In the diagrams below, the yellow boxes represent the entropy store, initially containing the user-provided seed data, but progressively transformed into a mixed-up version.

The pink circles contain hash functions—each hash function is distinct and provides a distinct 32-bit 1-to-1 mapping. The blue circles are a “mix” function that combines two 32-bit values to make a resulting 32-bit value. (Incidentally, cryptographers call these kinds of mixing functions compression functions because they output fewer bits than they take in, not to be confused with data compression.) We could use multiple mix functions in the same way as we use multiple hashes, but we stick to using the same function throughout.

Initialization (example with four zero seeds!)

The diagram below shows what happens if we initialize a four-word entropy store with four zero seeds.

In the diagram, you can see that we began with a horrible seed, four zeros, and ended up with a random-looking final set of values in the entropy store. If there were more than four seed inputs, we'd add additional rows to the diagram; in each additional row we'd mix the hashed values of that input with each of the entropy-store values.
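
In code, the flow shown in the diagrams might look something like the sketch below. (This is my paraphrase of the pictures, not the actual randutils.hpp implementation; hash and mix are the functions we'll develop shortly.)

#include <cstddef>
#include <cstdint>
#include <vector>

uint32_t hash(uint32_t value);         // developed below
uint32_t mix(uint32_t x, uint32_t y);  // developed below

const std::size_t N = 4;               // four-word entropy store

void init_store(uint32_t store[], const std::vector<uint32_t>& seeds)
{
    // Fill the store with hashes of the first N inputs, hashing
    // zeros if fewer than N seeds were provided.
    for (std::size_t i = 0; i < N; ++i)
        store[i] = hash(i < seeds.size() ? seeds[i] : 0u);

    // Mix each store word with a hash of every other store word,
    // so that every input influences every slot.
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            if (i != j)
                store[j] = mix(store[j], hash(store[i]));

    // Each seed beyond the first N adds another "row": hash it
    // and mix it into every word of the store.
    for (std::size_t i = N; i < seeds.size(); ++i)
        for (std::size_t j = 0; j < N; ++j)
            store[j] = mix(store[j], hash(seeds[i]));
}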

Output Seed Generation

Once we've initialized the entropy store, we're poised and ready to generate output seeds. This process entails a final round of hashing (with a new set of hash functions) so that if people ask for more seed values than we have in the entropy store, there won't be an obvious correlation between values generated by the same internal entropy.
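
Continuing the sketch from above (again an approximation, not the library's exact code):

// Output generation: a final round of hashing over the mixed
// store. Because hash() draws a fresh multiplier on every call,
// each output is produced by a distinct hash function, even when
// two outputs derive from the same store word.
template <typename Iter>
void generate_outputs(const uint32_t store[], Iter dest, Iter end)
{
    std::size_t i = 0;
    for (; dest != end; ++dest, ++i)
        *dest = hash(store[i % N]);
}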

Avalanche

This diagram gives a sense of how the avalanche property works. Here we change a single bit, and observe how this diagram differs from the all-zeros one we saw earlier. (Data unchanged from the previous diagram is shown in gray; data that is different is shown in brown.)

One thing you might have surmised already is that our algorithm is a \(\mathrm{O}(N \times I)\) algorithm, where \(N\) is the size of the entropy store and \(I\) is the number of words it is initialized with. If we want maximal avalanche, that's alas somewhat unavoidable. Since entropy stores are fixed-size, however, and usually small (four or eight words ought to be ample), this quadratic behavior isn't an issue in practice.

Design Details

Thus far, we have a sense of how information will flow, but some details remain vague: what are these hash functions (of which we have many), and what is the mix function?

Multiply–Xorshift

For both hashing and mixing, I'll use a currently popular technique, multiply–xorshift. This operation is used as a building block for the mix functions of FastHash and MurmurHash2, and the final mix functions of MurmurHash3, CityHash and xxHash (and the output function of the SplitMix RNG). In other words, it's a very fast and very popular hashing technique.

In multiply–xorshift, we multiply by an odd random-looking constant, and then perform an xorshift on the result; for example, (excerpted from MurmurHash3)

h *= 0xc2b2ae35;    // Binary 11000010101100101010111000110101
h ^= h >> 16;

We could just look at this code and decide that it's some kind of black magic and leave it at that, but instead we'll use two perspectives to understand why it's good for scrambling the bits of an integer (and thus creating good avalanche within a single word).

A Bit-Mixing Perspective on Multiply–Xorshift

In the multiply step, high bits in the word depend in complex ways on the lower bits, but lower bits aren't changed nearly as much—for example, the low-order two bits don't change at all in the multiply step. The xorshift step pushes variation from the high-order bits back down to the lower ones. Just one multiply–xorshift step isn't perfect for creating good avalanche (e.g., the high bit cannot have influenced the low bit). That's why the hash functions I mentioned earlier (CityHash, etc.) apply several multiply–xorshifts. In our case, because we have several repetitions of hashing and mixing, only doing one multiply–xorshift in any one stage is fine.

A Graphical Perspective on Multiply–Xorshift

I like visualizations, so let's also look at what multiply–xorshift does graphically. This time, we'll imagine a much simpler 8-bit version. So we'll be providing a permutation of 256 values.

The figures below show random permutations to give us a feel for what a good permutation might look like:

With an idea of what we might be hoping for, we'll now look at what our hash function does, using an appropriately cut down 8-bit version of the operation:

h *= 0x35;    // Binary 00110101
h ^= h >> 4;

First, let's see what the multiplication and xorshift steps each do by themselves:

As you can see, these diagrams don't look very random. But our hashing operation doesn't just perform a multiply or an xorshift, it applies both, so let's apply an xorshift to the lefthand one (turning it into multiply–xorshift), and a multiplication to the righthand one (turning it into xorshift–multiply):

As you look at the plots above, you might be disappointed. They're not as regular as the previous plots, but it still looks like there is more order than chaos. But as we repeat the process things become more chaotic and start to look more random.

So let's continue and apply another multiply on the left and another xorshift on the right.

This is quite interesting because it looks like multiply–xorshift–multiply (on the left) does better than xorshift–multiply–xorshift (on the right). That's true in this example, but in general, it depends on what kind of regular structure the input has.

But if we do one more step, so that now each side has two xorshifts and two multiplies, we end up at a similar point:

So from these graphs, we can see that a single multiply–xorshift isn't really sufficient, but a couple of them do a pretty good job. It's also fairly common when repeating the process to use different constants and different shifts in successive steps (on the basis that doing so only adds to the chaos).

The Hash Function(s)

Based on these ideas, we could write a hash function like

uint32_t hash(uint32_t value)
{
    value *= 0xc2b2ae35;
    value ^= value >> 16;
    return value;
}

Notice that this code only does one step of multiply–xorshift, but it will be combined with other hashing and mixing steps, so having a single multiply–xorshift is okay here.

But for our task, we don't need just one hash function, we need many. The obvious fix is to use different multipliers for different hashes. One way to do so would be with a table of multipliers, something like this:

size_t which_multiplier = 0;

uint32_t hash(uint32_t value)
{
    value *= MULTIPLIERS[which_multiplier++];
    value ^= value >> 16;
    return value;
}

but then we need enough multipliers in our table for all the hash functions we might produce. Instead, since the hash multipliers are “random-looking” (odd) numbers, we can randomly generate the multiplier using an MCG (a.k.a. a Lehmer-style LCG), which has the convenient property that its outputs are always odd numbers. This change gives us the following code:

uint32_t hash_multiplier = 0x43b0d7e5;

uint32_t hash(uint32_t value)
{
    hash_multiplier *= 0x931e8875;  // New "random" multiplier

    value *= hash_multiplier;
    value ^= value >> 16;
    return value;
}

Each time we pass through the hash function, we do so with a brand-new multiplier.

There is one final flaw to fix, however. Thus far, our hash function always maps zero to zero. We can fix this issue by xoring our input with a random-looking constant. We have one of those kicking around already, so our final code becomes

uint32_t hash_multiplier = 0x43b0d7e5;

uint32_t hash(uint32_t value)
{
    value ^= hash_multiplier;       // Avoid 0 -> 0

    hash_multiplier *= 0x931e8875;  // New "random" multiplier

    value *= hash_multiplier;
    value ^= value >> 16;
    return value;
}

The Mix Function

The mix function is similarly inspired by these ideas.

uint32_t mix(uint32_t x, uint32_t y)
{
    uint32_t result = 0xca01f9dd*x - 0x4973f715*y;
    result ^= result >> 16;
    return result;
}

The multiplicative constants are somewhat arbitrary; I used constants that are known to be good when used as the multiplier for an MCG random number generator (according to the spectral test).

Developing functions like this one does feel a bit like a black art. In testing, I found that simpler mix functions (such as xor) didn't work very well, and some other simple functions (such as subtraction) worked slightly better (presumably due to their asymmetry), but still not well enough. But it's still not completely clear to me just what the necessary and sufficient properties are. Was it sensible for me to pick multiplicative constants that do well in the spectral test? They have the maximum possible multiplicative order, but that property, while important for one random-number generation context, is probably of no consequence in this context. (At one point, I considered the constant pair 0x6ca55485 and 0x331e8877 because they gave the mix function slightly better mathematical properties, but I wasn't able to show that these were actually helping in practice, so rather than tinker more, I left the constants as is.)

Final Code

The file randutils.hpp contains the final code for the fixed-entropy seed sequence, as well as code that I'll talk about in later posts in this series.

The code follows the ideas we've outlined above. One annoying thing about C++11's Seed Sequence concept is that we must write a param method that can give the user parameters to recreate the exact same seed sequence, so that needed to be implemented, too. Thankfully, the same design decisions that ensure we have a bijection also make it possible to invert the mixing function, which is exactly what we need to provide param.

Testing

There are several things we can test to make sure our seed sequence does its job properly. Since I want to actually get this blog post finished, I've been lazy and not included my testing code, but none of it is difficult to write.

Bijection (and K-to-1) Properties

We've already outlined what these properties are, but it's useful to test them to make sure they actually hold. By changing the word size to an 8-bit word, we can exhaustively test the 4-word–entropy version with all possible inputs and make sure that it produces all possible outputs.

It's quite easy and fun to write the necessary test code (especially if you have a machine with plenty of memory), so I'll leave it as one of those legendary “exercises for the reader”. But it passes this test.
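
If you'd like a head start on that exercise, here's roughly what I'd write (a sketch: it assumes an 8-bit instantiation seed_seq_fe<4, uint8_t>, an iterator-pair constructor, and that generate writes words of the configured size):

#include "randutils.hpp"
#include <cstdint>
#include <cstdio>
#include <vector>

int main()
{
    const uint64_t SPACE = uint64_t(1) << 32;  // all 4-byte inputs
    std::vector<bool> seen(SPACE, false);      // 512 MB of bits

    for (uint64_t input = 0; input < SPACE; ++input) {
        uint8_t in[4] = { uint8_t(input),       uint8_t(input >> 8),
                          uint8_t(input >> 16), uint8_t(input >> 24) };
        randutils::seed_seq_fe<4, uint8_t> seeder(in, in + 4);

        uint8_t out[4];
        seeder.generate(out, out + 4);
        uint64_t result = uint64_t(out[0])       | uint64_t(out[1]) << 8
                        | uint64_t(out[2]) << 16 | uint64_t(out[3]) << 24;

        if (seen[result]) {
            std::puts("duplicate output -- not a bijection!");
            return 1;
        }
        seen[result] = true;
    }
    std::puts("all outputs distinct -- it's a bijection");
}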

(In fact, that we needed to invert the mixing operation to provide the param member function already establishes that it is indeed a bijection.)

Avalanche Properties

It should be the case that, on average, a single-bit change anywhere in the input flips each bit of the output with a 50/50 probability. To test this property, I wrote a tester that considers every possible single-bit change we can make to a 32-bit integer, and tests them all (as we change one 32-bit input through all possible values, we keep the other inputs constant).
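
A simplified version of that tester might look like the following (a sketch of the approach rather than my actual test harness, and sampling base values rather than sweeping all of them):

#include "randutils.hpp"
#include <bitset>
#include <cstdint>
#include <iostream>

int main()
{
    uint64_t total_flips = 0, trials = 0;

    for (uint32_t base = 0; base < 100000; ++base) {
        for (int bit = 0; bit < 32; ++bit) {
            // Two seedings differing in exactly one input bit.
            randutils::seed_seq_fe128 s1{base, 0u, 0u, 0u};
            randutils::seed_seq_fe128 s2{base ^ (1u << bit), 0u, 0u, 0u};
            uint32_t out1, out2;
            s1.generate(&out1, &out1 + 1);
            s2.generate(&out2, &out2 + 1);
            // Count how many output bits changed.
            total_flips += std::bitset<32>(out1 ^ out2).count();
            ++trials;
        }
    }
    std::cout << "mean bits flipped: "
              << double(total_flips) / trials << "\n";  // expect ~16
}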

In testing, the results were that each 32-bit output had 16 bits on average get flipped, with a standard deviation of 2.8284—exactly what we'd expect for a binomial distribution with 32 trials and a 50% chance of success. In fact, not only does the standard deviation match the expected value of \(2 \sqrt{2}\), but the whole distribution also exactly matches a binomial distribution.

In other words, the avalanche properties look excellent.

RNG Properties

The ultimate test for any hash function is to give it very orderly input (e.g., simple increments) and see if the output is random enough to pass the stringent tests of RNG test suites.

In other words, we can turn a seed sequence into a random number generator like this:

// Abuse seed_seq_fe to be a random number generator
uint32_t seed_seq_fe_rng()
{
    static uint32_t counter = 0;
    randutils::seed_seq_fe128 seeder{counter,0u,0u,0u};
    ++counter;
    uint32_t out;
    seeder.generate(&out,&out+1);
    return out;
}

Before I show you the actual results, let's see what happens if we replace randutils::seed_seq_fe128 in the code above with std::seed_seq and put this RNG through TestU01's SmallCrush suite:

========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        std::seed_seq (counter-at:0, result-at:0)
 Number of statistics:  15
 Total CPU time:   00:00:56.06
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  1  BirthdaySpacings                 eps  
  2  Collision                        eps  
  3  Gap                              eps  
  4  SimpPoker                        eps  
  5  CouponCollector                  eps  
  6  MaxOft                         1 -  1.6e-6
  6  MaxOft AD                       0.9997 
  7  WeightDistrib                    eps  
 ----------------------------------------------
 All other tests were passed

These results are not good. We could argue that we're abusing std::seed_seq by trying to use it like a random-number generator—that seeds don't need to have the same quality as random numbers. But it is, nevertheless, not good. It's also pretty slow (these timings were with GCC—it's even slower with Clang, taking 68% longer).

In contrast, here are the results for randutils::seed_seq_fe128:

========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        randutils::seed_seq_fe128 (counter-at:0, result-at:0)
 Number of statistics:  15
 Total CPU time:   00:00:12.91

 All tests were passed

We can see that seed_seq_fe128 is both faster and statistically much better. In fact, it doesn't just pass SmallCrush, it passes Crush as well.

In fact, if we adjust the code to make counter a 64-bit int, it passes BigCrush, too (a 32-bit counter isn't enough because the largest BigCrush test uses 12.8 billion numbers, which is significantly more than \(2^{32}\)).

========= Summary results of BigCrush =========

 Version:          TestU01 1.2.3
 Generator:        randutils::seed_seq_fe128 (counter-at:0, result-at:0)
 Number of statistics:  160
 Total CPU time:   09:06:12.19

 All tests were passed
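
The adjusted code isn't shown in this post, but one plausible way to feed a 64-bit counter into the 32-bit seed sequence is to split it into two words:

// Abuse seed_seq_fe as an RNG, now with a 64-bit counter
// (a sketch -- the exact adjustment used isn't shown above)
uint32_t seed_seq_fe_rng64()
{
    static uint64_t counter = 0;
    randutils::seed_seq_fe128 seeder{uint32_t(counter),
                                     uint32_t(counter >> 32), 0u, 0u};
    ++counter;
    uint32_t out;
    seeder.generate(&out, &out + 1);
    return out;
}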

But what about other positions for the seed data? From the diagrams above, we know that each input position gets mixed in slightly differently. Perhaps later stages are worse. So, here's the code for placing the counter last

// Abuse seed_seq_fe to be a random number generator
uint32_t seed_seq_fe_rng()
{
    static uint32_t counter = 0;
    randutils::seed_seq_fe128 seeder{0u,0u,0u,counter};  // counter at end
    ++counter;
    uint32_t out;
    seeder.generate(&out,&out+1);
    return out;
}

On the positive side, this version also passes SmallCrush. But it is actually statistically worse than placing the counter at the beginning; specifically, it fails Crush's MaxOft test. That result might seem disappointing, but it's nevertheless performing vastly better than std::seed_seq or numerous widely used RNGs. But, more importantly, if any of the other input data changes, even if it changes very rarely, that's enough. In other words, this version has no problem passing BigCrush:

// Abuse seed_seq_fe to be a random number generator
uint32_t seed_seq_fe_rng()
{
    static uint32_t counter = 0;
    static uint32_t bump = 0;
    randutils::seed_seq_fe128 seeder{0u,0u,bump,counter};
    ++counter;
    if ((counter & 0xFFFF) == 0)  // bitwise AND -- bump changes once per 65536 calls
        ++bump;
    uint32_t out;
    seeder.generate(&out,&out+1);
    return out;
}

In this code, we only update bump 0.0015% of the time, but that's sufficient to make the code pass BigCrush. From that example, we can surmise that if any of the other seed data is worth anything at all, we're fine.

But if, for some reason, you really want our first program to pass all the statistical tests, there is a third template argument that controls how many rounds of mixing are performed. In fact, for tiny entropy pools (two words or less), an extra round of mixing happens automatically, because in those cases the added overhead is tiny.
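
Assuming the third parameter is a simple round count (my reading of the description; the exact signature is in randutils.hpp), requesting extra mixing might look like:

// Four 32-bit words of entropy, but with an extra round of mixing
// (third template argument assumed to be the round count).
randutils::seed_seq_fe<4, uint32_t, 2> seeder{1u, 2u, 3u, 4u};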

Conclusions

In this article, we built something that beats std::seed_seq in several respects. It's faster, provably free of bias, empirically statistically better, and it doesn't dynamically allocate memory.

In fact, for the standard C++11 RNGs, our “silly” RNG from abusing seed_seq_fe is actually statistically better than the RNG you're initializing.

But even though we're done with this blog post, there is more to do. We have something that does a good job of mixing together entropy from multiple sources, but you still need to get that entropy from somewhere. Where? That'll be the topic for the next two blog posts, covering what's wrong with std::random_device and how to cope with that.

C++ Seeding Surprises

Properly seeding random number generators doesn't always get the attention it deserves. Quite often, people do a terrible job, supplying low-quality seed data (such as the system time or process id) or combining multiple poor sources in a half-baked way. C++11 provides std::seed_seq as a way to encourage the use of better seeds, but if you haven't thought about what's really going on when you use it, you may be in for a few surprises.

In contrast to C++11, some languages, such as popular scripting languages like JavaScript, Python, or Perl, take care of good seeding for you (provided you're using their built-in RNGs). Today's operating systems have a built-in source of high-quality randomness (typically derived from the sequencing of unpredictable external and internal events), and so the implementations of these languages simply lean on the operating system to produce seed data.

C++11 provides access to operating-system–provided randomness via std::random_device, but, strangely, it isn't easy to use it directly to initialize C++'s random number generators. C++'s supplied generators only allow seeding with a std::seed_seq or a single integer, nothing else. This interface is, in many respects, a mistake, because it means that we are forced to use seed_seq (the “poor seed fixer”) even when it's not actually necessary.

In this post, we'll see two surprising things:

  1. Low-quality seeding is harder to “fix” than you might think.
  2. When std::seed_seq tries to “fix” high-quality seed data, it actually makes it worse.

Perils of Using a Single 32-Bit Integer

If you look online for how to initialize a C++11 random number generator, you'll see code that looks like

std::mt19937 my_rng(std::random_device{}());

or possibly the more explicit (but equivalent) version,

std::random_device rdev;
uint32_t random_seed = rdev();
std::seed_seq seeder{random_seed};
std::mt19937 my_rng(seeder);

Footnote: This example is specific to the Mersenne Twister. For some other generators, especially the simpler ones, if you just specify a single integer for the seed, as we did in the first piece of code, it uses it directly without involving seed_seq.

We're about to discover why this code is actually problematic, but first, let's be clear about what it's doing. It is

  • Creating a random-device object, which (hopefully) will ask the operating system for system-provided randomness.
  • Getting a single 32-bit value from that device.
  • Creating a seed_seq object.
  • Using that seed_seq object to initialize the Mersenne Twister.

It might seem odd to use just one 32-bit value to perform our initialization (the Mersenne Twister uses 624 32-bit integers to represent its internal state, plus a few more for housekeeping). If doing so seems wrong to you, you have good instincts! But as wrong as it is, it's nevertheless widely seen.

So if it is a bad idea, why would anyone write code like this? In fact, people do so largely because with the way that things are currently specified in C++, doing any better is quite tiresome. Actually, I asked the C++ community on Reddit to have a go at doing it right, and people found it surprisingly hard. (In fact, as we'll eventually see, no one really succeeded, because even if you try to use 624 integers to initialize the seed_seq, it's not going to do what you'd like!)

Given that this kind of code is widely seen, let's explore why it's problematic…

Predictability

If we seed with a single 32-bit integer, that only allows us to have about four billion (i.e., \(2^{32}\)) possible initial states for our RNG. Unfortunately, in today's world, that's not very many at all. Suppose you generate 357913944 as your first number; I can just search all possible seeds and determine what your initial seed was (I won't spoil it by telling you). If you do want to try it yourself, the Mersenne Twister is fairly slow, so it'll take a while to slog through the search (using all cores of a 3.40 GHz Haswell i7-4770, it takes nearly two hours).
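
Such a search is straightforward to write; here's a minimal sketch, using the seed_seq-based construction from the explicit snippet above:

#include <cstdint>
#include <iostream>
#include <random>

int main()
{
    const uint32_t target = 357913944;  // the observed first output
    uint32_t seed = 0;
    do {
        std::seed_seq seeder{seed};
        std::mt19937 rng(seeder);
        if (rng() == target)
            std::cout << "seed found: " << seed << "\n";
    } while (++seed != 0);  // wraps to zero after covering all 2^32 seeds
}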

Of course, speed problems can often be solved by trading some space for more speed. If we have a little over 16 GB of spare space on disk (or in memory), we can just precompute a lookup table that maps RNG outputs to the seeds that produced them.

However we do it, once we know the seed, we can predict all future output from the RNG. You can imagine how bad that would be in some contexts (e.g., anything involving gambling), but even for ordinary programs, predictability can allow algorithmic complexity attacks.

Bias

You might have thought predictability was our only worry (and maybe one you could dismiss if no one will see the numbers you generate), but it isn't the only one. To see an even worse problem, let's imagine a scenario.

Scenario: The Twitter App Mystery

Imagine you're working for Twitter. With more than 250 million active Twitter users, and more than a billion smartphones in active use, it's easy to imagine that the Twitter app is on a lot of phones; let's suppose it's used by about 125 million people. You'd like to get some very detailed usage information from your users, but you worry about the deluge of data you'd face if you got information from everyone every time they open the app. So, you decide to take a sampling approach. Each time the app starts up, it'll randomly decide whether to send in the usage information. Looking at the hugeness of the numbers (you guess the Twitter app (re)starts about 8 times per day per user, for a total of 2 billion launches per day), it looks like a good (and convenient!) probability for whether to send in a report is 1 in \(2^{32}\), which should get you a new report to look at every couple of days.

So, copying the code you saw above, you write

std::mt19937 my_rng(std::random_device{}());
if (my_rng() == 7) /* lucky seven!  you get to send in a report */
    send_detailed_tracking_info_secretly();

And then you push out the updated app and wait. Soon the reports will come rolling in and you can analyze them.

Except that nothing happens! Days pass and no reports ever come. In the next update, you change the lucky number 7 to the unlucky number 13, and still nothing happens. This behavior would only be possible if your random number generator had a severe aversion to producing seven or thirteen.

Walking through the code with a co-worker, you're told to stop being so cute and use 0 as the “lucky” number. With the new update, suddenly tracking reports are rolling in. In fact, you're now getting twice as many as your back-of-the-envelope calculations had led you to expect. Does the Mersenne Twister like zero twice as much as it should?

The stats you're getting look interesting, but you decide you'd like to increase the number of reports twenty-fold, so you change the == 0 line to < 20. But the change only gives you 7.5× more reports!?!

Perplexed, you review the code one more time. You realize that because you were just using one number, you never needed an RNG at all. You change the code to just use the random device directly:

std::random_device rdev;
if (rdev() < 20) /* 20 in 2**32 chance, you get to send in a report */
    send_detailed_tracking_info_secretly();

and it all suddenly works the way your back-of-the-envelope calculations said it should. And when you set your code back to checking for “lucky 7”, or 13, or 0, they all give the same number of reports.

Sure, maybe you should have just used the random device in the first place, but why did your code fail so badly when you used seed_seq and the Mersenne Twister?

What's Going On...

Strangely enough, when you initialize the Mersenne Twister with a 32-bit seed (via seed_seq), it can't ever generate 7 or 13 as its first output, and two different seeds produce 0. Even more crazy, there are twelve different 32-bit seeds that can produce the “random” numbers 1226181350 and 1563636090, so those numbers show up twelve times more often than we'd expect.

In fact, if we look at the distribution of outputs, it very closely matches a binomial (or Poisson) distribution. In other words, these counts exactly match what you'd expect to see if you asked for \(2^{32}\) random 32-bit numbers from a good source of randomness. It isn't just 7 and 13 that are missing; they're just two of 1,580,024,992 32-bit outputs that never get produced (36.7% of \(2^{32}\), and almost exactly the \(2^{32}/e\) we'd expect from a Poisson distribution).

Normally, when taking random samples, which numbers are missing and which show up multiple times differ each time, so there is no systematic bias. But in our example the missing and duplicated numbers are always the same ones, baked in, forever. Because they never change, we have to consider the pattern to be a kind of bias, and, as we saw in our story, this kind of bias can have real and surprising consequences.

Don't Blame std::seed_seq

You might be thinking that the problem here is seed_seq or the Mersenne Twister, but the real problem is that we're asking the impossible. The Mersenne Twister has 19937 bits of internal state, and we can't initialize it in a bias-free way using fewer than 19937 bits. (If our seeding algorithm is deterministic, there have to be some states that can never appear.)

So, if you take away one lesson, it's that

  • You really shouldn't try to initialize a random number generator whose state is 624 32-bit ints with a single 32-bit int.
  • If you ignore that advice, std::seed_seq tries to come up with a sane initialization, but it can't work miracles.

Incidentally, similar problems also affect std::minstd_rand, which uses a single 32-bit word to store its internal state but whose number of states isn't an exact power of two. It's clear (via the pigeonhole principle) that there is no bias-free way to use a single full-range 32-bit value to initialize this RNG.

Why does seed_seq exist, if initializing 624 ints using a single int isn't a good idea in the first place? Well, just because something is a bad idea doesn't mean people won't want to do it! And without seed_seq, people might employ far more pathological initializations, such as repeating the same integer 624 times. The Mersenne Twister is prone to “bad states” (all zeros causes it to not work at all, whereas lots of zero bits are merely bad)—taking 0, or 1, or 2 and repeating it 624 times could be disastrous. The seed_seq algorithm avoids this issue (when given a single integer), resulting in an initialization that's relatively sane.

But a good rule of thumb is that if you're so cheap that you want to use a single 32-bit integer to initialize your random number generator, you should probably use an RNG with a 32-bit state. Otherwise, you should use the amount of seed randomness your generator actually needs—don't try to give it less.

Blame std::seed_seq After All

In the previous section, we saw that seed_seq was just trying to do something sane when faced with an impossible task (i.e., not being given enough seed data to work with). But what about when we do provide it with the right amount of seed data? Are we free of problems now? Alas, the answer is (at least sometimes), no.

To explore this issue, we'll look at something much smaller than the Mersenne Twister, a 64-bit RNG that thus needs 64 bits of seed data. C++ doesn't include such a generator, but it does provide the means to make one:

typedef std::linear_congruential_engine<
    uint64_t, 6364136223846793005U, 1442695040888963407U, 0U>
        knuth_lcg;  /* Knuth's preferred 64-bit LCG */

(Incidentally, I sometimes use this generator in preference to the PCG generators when all I care about is raw speed. Sometimes you just don't need anything statistically better.)

A reasonable way to initialize this generator would be

std::random_device rdev;
uint64_t       seed = (uint64_t(rdev()) << 32) | rdev();
knuth_lcg      rng{seed};

Here we just make a single 64-bit integer from two calls to the system's random device and use that as our seed. This approach works really well, but we're going to see what happens if we try to use seed_seq instead.

std::random_device rdev;
std::seed_seq  seeder{rdev(),rdev()};
knuth_lcg      rng{seeder};

This code seems like it ought to be broadly equivalent. We're not making the mistake of using just one 32-bit value, we're using two, and inside knuth_lcg it'll request two integers from the seed sequence. So we should be all good, right?

But no, it's not good at all. Once again, seed_seq introduces bias, and this time it really has no excuse.

std::seed_seq Is Not a Bijection

If we think of seed_seq in this context as taking in 64 bits (2 × 32-bit ints) and outputting 64 bits (again as 2 × 32-bit ints), we can see it as a simple function from 64-bit ints to 64-bit ints. The question is, what kind of function is it? The answer is shown in the following graph:

What we really want is for seed_seq to be a bijection—in other words, we want every possible output to have the same chance of being produced (1 in 264). But that's not what seed_seq does. Some output values are produced twice, and some are never produced at all. (Interestingly, it also doesn't look like a random distribution—it's its own thing.)

You might wonder how I got the histogram above. After all, did I really generate all \(2^{64}\) (18 quintillion!) possible inputs and see what it outputs? No. Did I do crazy mathematics to figure it out? No. In fact, I cheated. I made a 16-bit version of seed_seq and tested how it behaved when generating two 16-bit numbers (which gives us only \(2^{32}\) possibilities to consider, a mere 4 billion).

“Ha!” you cry. “Maybe that's not a valid approach!”

But I went on to check my work. With good evidence to suspect that the real 32-bit seed_seq isn't a bijection, I looked for collisions. And I found them. For example, if you generate two ints from seq1 or seq2 below, you get the same output, {0x3bf5972a, 0xe304114d}:

  • seed_seq seq1{0xf5e5b5c0, 0xdcb8e4b1}
  • seed_seq seq2{0xd34295df, 0xba15c4d0}

This single example is enough to show that seed_seq is not acting as a bijection. If you'd like a few more examples, this program will print out about a hundred more.
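
You can check this collision yourself in a few lines; both lines of output should read 3bf5972a e304114d:

#include <cstdint>
#include <iostream>
#include <random>

int main()
{
    std::seed_seq seq1{0xf5e5b5c0u, 0xdcb8e4b1u};
    std::seed_seq seq2{0xd34295dfu, 0xba15c4d0u};

    uint32_t out1[2], out2[2];
    seq1.generate(out1, out1 + 2);
    seq2.generate(out2, out2 + 2);

    std::cout << std::hex << out1[0] << " " << out1[1] << "\n"
              << out2[0] << " " << out2[1] << "\n";
}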

Obviously I couldn't search the entire 64-bit space looking for duplicates, but I didn't have to. Instead, I sampled a mere \(2^{34}\) (about 17 billion) seedings, and, thanks to the birthday problem, I found about eight duplicates per run. (Done naïvely, this process would require a 256 GB hash table, but you can get by with a somewhat smaller table; I might write about exactly how I did that in a separate post.)

Of course, thus far I've only used generate to make two ints. Maybe that's a weird case and it behaves differently if we need a different amount of seed data? Actually, that's true, but not in a positive way, as we can see in the graph below:

Here, we find that if we're generating a single integer (from a single seed int), it is a bijection, but as we generate more and more integers (while properly giving the RNG a matching number of input seeds), the results actually begin to look more and more like the Poisson distribution we saw above. In fact, in the last one, some outputs show up six times (albeit a mere 0.0006% of them).

(Incidentally, these graphs were drawn using an 8-bit incarnation of seed_seq; you'll notice that the second histogram looks identical to its 16-bit counterpart, which lends credence to the idea that we can extrapolate about performance from smaller bit sizes.)

Scary All-Zero States

Since seed_seq can never produce some outputs, you might hope that among those never-produced outputs is the all-zero state. A reason to want this property is that the Mersenne Twister and other generators based on linear-feedback shift-register techniques (including XorShift) can't handle an all-zeros state—if you initialize them that way, they'll just output zero forever.

Unfortunately, seed_seq can produce the all-zeros state. In fact, in my tests with the 8-bit and 16-bit versions of seed_seq, I found that the all-zeros state occurred twice when generating two ints (but only once when generating one, three, and four ints). So putting good random data through seed_seq may actually be increasing the chance of a pathological state rather than decreasing it.

That said, even if we double the odds, it's still really unlikely and you probably shouldn't worry about it.

More std::seed_seq Flaws

Besides its somewhat surprising mathematical properties, there are a few other gotchas for C++ programmers when using seed_seq. One is that it is required to store its seed data so that every time generate is called in the same way, it produces the same output. Because the amount of seed data can vary, it stores it on the heap, but doesn't have a std::allocator as a parameter. Thus, if you're working on an embedded system that prohibits heap allocation, you can't (easily) use seed_seq at all.

Fixing Seed Sequences

In my opinion, the C++ standard should do two things for C++17 to help with these issues.

  1. It should relax the requirements for the Seed Sequence concept so that it is not required to be almost identical to std::seed_seq in its implementation. The idea that repeated calls to generate must always generate the same values needs to go.
  2. The std::random_device class should have a generate method so that it can be used as a Seed Sequence. This addition also aids efficiency because the library can then read multiple values from the operating system at once rather than requesting them one by one.

With those changes, we can let std::seed_seq take a reduced role in our code.
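
To make the second point concrete, here's the kind of adapter people write today (a sketch: it works with common standard-library implementations, though it doesn't meet the letter of the current Seed Sequence requirements, which is exactly the problem):

#include <random>

// Wraps std::random_device so it can be handed to an engine
// constructor: generate() fills the range with OS randomness.
struct random_device_seeder {
    template <typename Iter>
    void generate(Iter begin, Iter end)
    {
        std::random_device rd;
        for (auto it = begin; it != end; ++it)
            *it = rd();
    }
};

int main()
{
    random_device_seeder seeder;
    std::mt19937 rng(seeder);  // fully seeded, no std::seed_seq involved
}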

Another interesting question is whether we can write a better seed_seq, one that is both a bijection and does a good job when given low-quality seed data. I believe the answer is yes, but this post is long enough, so I'll leave that for another time.

A Place for News

When I started looking for tools to create websites, I found there were many options. Many of those options were extremely complex, designed for complex authoring needs (e.g., dynamic content, web-based story submission, multiple authors, etc.), and required significant infrastructure. All I wanted was a simple static site, so these solutions were overkill.

All I wanted was a simple file-based static site generator, but most of the options in that space were focused on blogging, although they often claimed that you didn't have to use them that way. Nikola seemed like the best/easiest choice, with clear instructions for how to disable its blogging functions.

But as time has passed since creating the site back in October of last year, I've come to realize that I need a place for news, announcements, and random musings about randomness that don't easily fit elsewhere on the site. In other words, it needs a blog.

So, today I adjusted Nikola's configuration once again, and with just a few lines of changes, ta-da, we have a blog. Well, actually, initially it was a horrible-looking blog because the blog page templates were terrible, but with some changes to the templates (following a strategy of expedience over elegance), I managed to create a more-or-less acceptable layout.

Anyway, watch this space for more…