Ease of Use without Loss of Power

To the extent that anyone cares about C++11's random-number facility at all, the C++ community is polarized between two views—one that likes it, and one that hates it.

The two views are basically these:

  • It's amazingly elegant, a shining example of separation of concerns. Pseudo-random engines, random distributions, seeding techniques, and entropy sources are all quite different things. C++11 creates a pluggable and extensible architecture that allows for new RNGs, new custom distributions, and so forth. It's comprehensive and flexible.
  • It's horrible to use, feeling unnecessarily overengineered. To do even the simplest task requires significant boilerplate. Things that would be one easy-to-understand line in Python are three or four hard-to-follow lines in C++11. It's completely unsuitable for beginners, and even seasoned programmers hate it.

Both camps are right. But it turns out that we don't need to do very much to fix the situation and (hopefully) leave both camps feeling that they're getting what they want.

In this post, we'll develop the missing C++11 class that ties all the pieces together, making C++11 random number generators easy to use without abandoning any of their power and flexibility.

A Preview

Start by reading the following code, which demonstrates randutils::mt19937_rng, specifically the uniform, pick and variate member functions.

#include "randutils.hpp"
#include <iostream>

int main()
{
    randutils::mt19937_rng rng;

    std::cout << "Greetings from Office #" << rng.uniform(1,17)
              << " (where we think PI = "  << rng.uniform(3.1,3.2) << ")\n\n"
              << "Our office morale is "   << rng.uniform('A','D') << " grade\n"
              << "We " << rng.pick({"welcome",
                                    "look forward to synergizing with",
                                    "will resist",
                                    "are apathetic towards"})
                       << " our management overlords\n\n";

    std::cout << "On the 'business intelligence' test, we scored " 
              << rng.variate<double,std::normal_distribution>(70.0, 10.0)
              << "%\n";
}

Even though you haven't read any API docs, just by looking at this code, you can probably see exactly what is going on.

You can check to make sure it does what you think by looking at the outputs from a couple of runs (or you can download the header and try the code yourself):

Greetings from Office #3 (where we think PI = 3.18079)

Our office morale is D grade
We are apathetic towards our management overlords

On the 'business intelligence' test, we scored 74.511%

and

Greetings from Office #9 (where we think PI = 3.11824)

Our office morale is C grade
We will resist our management overlords

On the 'business intelligence' test, we scored 64.5473%

Two of the most common use cases for a random-number generator are generating a uniform value in a range, and making a random selection. With this class, which is just a very simple wrapper class around existing facilities, these tasks become easy. (If you're saying, “what about shuffling?” don't worry, we've got you covered, too. In fact, anything that's easy to do with Python's RNG class is just as easy to do here—in designing the class, wanted random number generation in C++ to become as easy as it is in Python.)

But from the last line of code where we called variate asking for a std::normal_distribution, you can also surmise that all the power and flexibility of C++11 RNGs is available to us. We can pick any distribution we like, including ones we've written ourselves. We can similarly use different random engines, or different sources of seeds. If we want control, we've got it.

A Tour of the Facilities

The type mt19937_rng is just an alias for random_generator<std::mt19937>, a thin wrapper around the Mersenne Twister engine. We can wrap any C++11 compatible engine, including the ones from Boost and my own PCG engines. They all work. The random_generator does almost no work, just hands everything off to facilities that already exist in other classes and functions. Its job is to wrap everything up to make it all easy.

Construction (and seed)

A random-generator object is a wrapper around a C++11-compatible random-number engine. Its sole state is the state of the underlying engine and so its only initialization work is to set up the engine.

If arguments are provided to the constructor, they are passed through to the underlying engine (via perfect forwarding). If, however, the first argument is convertible to a seed sequence via a base() member function (a capability indicated by providing a base_seed_seq typedef), this conversion is performed before the argument is passed to the engine.

Finally, and most usefully, if the random generator is default constructed, it creates a seeding object to give to the random-number engine. By default, it uses an auto_seed_256 object to provide nondeterministic seeds. (You can change the default via a template argument to the class.)

In other words, if we default construct the class, it begins with excellent, highly random seeding.

All the same rules apply to the arguments to the seed member function, which can be used to reseed the underlying engine.

Examples
randutils::mt19937_rng rng;

randutils::default_rng another_rng;

randutils::mt19937_rng rng_zero_seed{0u};
randutils::mt19937_rng rng_explict_seed_seq{randutils::auto_seed_128{}};

randutils::random_generator<std::knuth_b>   horribly_bad_rng;

std::seed_seq oldstyle_seedseq{1,2,3};
randutils::mt19937_rng rng_oldstyle_seed_seq{oldstyle_seedseq};

another_rng.seed(0u);       // Fixed seed
another_rng.seed();         // Nondeterministic seed

Because of the high-quality seeding, it's even possible (though somewhat wasteful) to create and use a random generator on the fly, like,

std::cout << randutils::mt19937_rng{}.pick({"Don't do it!", "Okay, do it"})
          << "\n";

(Note that programmers who know anything about random number generation may find that this last example makes their skin crawl, because they've learned that seeding should happen once, and that creating multiple generators is usually a sign of programmer confusion. That is a good rule of thumb. Nevertheless, thanks to the awesome seeding power of auto_seeder, while creating a plethora of RNGs might be slow, clumsy and bad style, doing so won't cause statistical problems. And the results will be nondeterministically random.)

Implementation Details

Although all that really needs to happen is passing arguments through to the constructor for the underlying random engine, the code to do this has a fair amount of baggage. Evil SFNAE-based compile-time introspection is required to detect what kind of seed sequence is being passed so that we can know whether to call base() or not.

uniform (member function)

For any numeric type N, with values v1 and v2, uniform(v1, v2) returns a value of type N in the range v1 to v2. For integer types, the range is a closed (i.e., up to and including v2), exactly as produced by the equivalent call to uniform_int_distribution, whereas for noninteger types it produces values according to the rules for uniform_real_distribution.

Examples
int i           = rng.uniform(42,54);
unsigned int u  = rng.uniform(0u, 99u);
long l          = rng.uniform(0l, 99999999999999l);
char c          = rng.uniform('A','Z');
double e        = rng.uniform(2.717, 2.719);

int error1      = rng.uniform(0, 1.6);   // Inconsistently typed arguments
int error2      = rng.uniform(0, 17u);   // Inconsistently typed arguments
Implementation details

uniform is really just a special case of the more general variate function. Here is the entire implementation:

template <typename Numeric>
Numeric uniform(Numeric lower, Numeric upper)
{
    return variate<Numeric,uniform_distribution>(lower, upper);
}

What's nice about uniform is that it infers the type of the output based on the type of the input arguments. variate can't do that because it works for arbitrary distributions and doesn't know what the distribution parameters mean.

(The distribution uniform_distribution is just a convenience templated typedef that uses std::uniform_int_distribution for integers and std::uniform_real_distribution otherwise.)

pick (member function)

For any instance, c, of a container type, pick(c) will return a reference to a randomly chosen member of the container. Containers are defined to be anything for which std::begin and std::end are defined.

Picking from an empty container is an error (with results similar to trying to dereference a past-the-end iterator).

Examples
int nums[]                  = {43, 51, 22};
std::list<std::string> list = {"Gizmo", "Hotel", "Ingot"};

++rng.pick(nums);                           // O(1) because it's an array
std::string& thing = rng.pick(list);        // O(n) because it's a list

const char* play   = rng.pick({"Rock", "Paper", "Scissors"});
bool b             = rng.pick({true, false});

void (*hello)()    = []{ std::cout << "Hello World!!\n"; };
void (*fail)()     = []{ std::cerr << "Error, Error!\n"; };

void (*mystery)()  = rng.pick({hello, fail});
mystery();
Implementation Details

The pick member function is actually just a wrapper around the choose function, described later on. Whereas choose takes and returns iterators, pick returns a reference to the chosen item. The implementation is trivial:

template <typename Range>
auto pick(Range&& range) -> decltype(*std::begin(range))
{
    return *choose(std::begin(range), std::end(range));
}

For technical reasons, we also need to provide a distinct and almost identical overload to handle being given an initializer list.

shuffle (member function)

For any writable random-access container c, shuffle(c) will shuffle the container. Containers are defined to be anything for which std::begin and std::end are defined.

Alternatively, shuffle can be passed two random-access iterators that specify a range to shuffle. In other words, shuffle(b,e) shuffles the range that begins at b and ends just before e.

Examples
const char* foods[]         = {"Flour", "Grain", "Honey", "Icing"};
std::vector<double> vec     = {1.23456, 3.14159, 2.71828, 1.61803};

rng.shuffle(foods);
rng.shuffle(vec);
rng.shuffle(vec.begin(), vec.begin()+2);
Implementation Details

Once again, the implementation is trivial. All the hard work is already done for us by std::shuffle. The code for both variants is simply

template <typename Iter>
void shuffle(Iter first, Iter last)
{
    std::shuffle(first, last, engine_);
}

template <typename Range>
void shuffle(Range&& range)
{
    shuffle(std::begin(range), std::end(range));
}

variate (member function)

This member function provides the full power of C++11's distribution classes. For some result type R, and class template T, where T<R> is a C++ distribution that can be initialized with parameters p1,…,pN, variate<R,T>(p1,…,pN) produces a random variate from that distribution.

If the distribution is not specified, it is assumed to be std::normal_distribution. To avoid mistakes, the output type must be specified.

Examples
// Create three random variates with mean = 0.0 and std dev = 1.0
double normal1 = rng.variate<double>();         
double normal2 = rng.variate<double>(0.0,1.0);
double normal3 = rng.variate<double,std::normal_distribution>(0.0,1.0);

float  grade   = rng.variate<float>(70.0,12.5);
int collisions = rng.variate<int,std::poisson_distribution>(1.0/8.0);

auto dd_init   = {40.0, 10.0, 10.0, 40.0};
int index      = rng.variate<int,std::discrete_distribution>(dd_init);

The last example is worthy of some explanation; std::discrete_distribution wants to be passed a std::initializer_list, but a C++ limitation prevents us from passing one directly, so we have to put it into a temporary variable first (or explicitly construct the initializer list). It's a little annoying, but most distributions don't take initializer lists, so it's a relatively minor issue.

C++11 provides us with lots of distributions to choose from, all of which work with variate:

Implementation Details

Again, this code is a very simple wrapper around preexisting C++11 functionality. The only thing that makes the code look a little scary is the C++11's perfect-forwarding machinery.

template <typename ResultType, 
          template <typename> class DistTmpl = std::normal_distribution,
          typename... Params>
ResultType variate(Params&&... params)
{
    DistTmpl<ResultType> dist(std::forward<Params>(params)...);

    return dist(engine_);
}

One thing to note is that this code creates and tears down a distribution object every time we call variate. In typical C++ standard library implementations, these objects have no persistent state and little or nothing is lost by doing things this way. (Any redundant work should be eliminated by the optimizer anyway.)

But if you wish distribution objects could persist longer, you may like the next member function we'll examine, generate.

generate (member function)

For any writable container c containing elements of type R, and distribution class template T, where T<R> is a C++ distribution that can be initialized with parameters p1,…,pN, generate<T>(c,p1,…,pN) fills the container random variates from that distribution. If T is not specified, it is a uniform distribution.

Containers are defined to be anything for which std::begin and std::end are defined.

Alternatively, we can pass two forward iterators that specify a range to generate instead of a container. In other words, generate<T>(b,e,p1,…,pN) fills the range that begins at b and ends just before e.

Examples
char grades[100];
std::vector<double> percentages(100);

rng.generate(grades, 'A', 'D');
rng.generate(grades,grades+10, 'A', 'B');
rng.generate<std::normal_distribution>(percentages, 80.0, 7.5);
Implementation Details

The implementation is very similar to the code for variate, except that having created a distribution object, we call it repeatedly. The target container is filled using std::generate, so this function is once again simply marshalling data into preexisting functions.

choose (member function)

This function behaves similarly to pick except that it returns an iterator rather than a reference. For two forward iterators, b and e that specify a range, choose(b,e) will find return an iterator within the range.

In contrast to pick, which cannot be called with an empty container, running choose(b,e) on an empty container returns b (which should be equal to e the past-the-end position).

For a container c, choose(c) is equivalent to choose(std::begin(c), std::end(c)).

Examples
std::array<int,50> counts;
std::iota(counts.begin(), counts.end(), 0);   // Fill with 0..49

auto mid_mid   = rng.choose(counts);
auto left_mid  = rng.choose(counts.begin(), mid_mid);
auto right_mid = rng.choose(mid_mid, counts.end());
rng.shuffle(left_mid, mid_mid);
rng.generate(mid_mid,right_mid,50,100);
Implementation Details

Again, this function is simply using preexisting functionality. The size of the range is determined with std::distance, a std::uniform_int_distribution finds the index of the desired element, and std::advance moves to that element. This work is only done if the size of the range is ≥ 2 (i.e., where there is an actual choice to make).

sample (member function)

This function takes a random sample of items from a container by moving a given number of them to the front, otherwise preserving order. For a container c, sample(n,c) moves n randomly chosen elements to the front and returns an iterator that marks both the end of the sample and the start of the remainder.

Alternatively, you can pass two iterators, b and e, that specify a range; sample(n,b,e) samples n elements in the same way as described above. In fact, sample(n,c) simply calls sample(n,std:begin(c), std:end(c)).

If n is greater than the size of the container, all items from the container are selected.

Examples
std::array<int,50> population;
std::iota(population.begin(), population.end(), 0);   // Fill with 0..999
auto sample_end = rng.sample(20, population);
Implementation Details

Of all the functions, sample is the only one that isn't merely a completely trivial wrapper for preexisting functionality. Nevertheless, all the heavy lifting is actually done by std::stable_partition.

I provided this function because a similar function is provided by Python.

engine (member function)

This function returns the underlying engine. Anything not provided for by this class can be achieved by using the engine directly.

Examples
rng.engine().discard(500);
auto raw_output = rng.engine()();

Versus the C++17 randint Proposal

I'd very much like to see the random_generator class template (or an enhanced iteration of it) make it into a future version of C++. That probably means someone (perhaps me) will have to write a proposal at some point, but there is another contender for making C++11 random number generation easy to use, the randint proposal, currently being considered for C++17.

The randint proposal gives the following motivation [sic]:

The std::rand friends are discouraged in C++14, so we want:

  1. A direct replacement to the std::rand friends. Despite of the security issues, std::rand is considered both handy and useful as a global uniform random number generator.

  2. To expose the most widely-used combo in C++11 <random> without pushing the users to learn the whole design. Smoothing the learning curve can usually optimize the acceptance.

randint Is Too Limited

The randint proposal is essentially, a dumbing-down proposal. Working from the basis that the C++11 random-number facility is too hard to use, it gives a radically simplified interface offering users just one thing, uniformly distributed integers. Want a replacement for drand48 (which provides uniform doubles)? You're out of luck. Want to have numbers with a normal distribution? Look elsewhere.

Look back at the code that we used to begin this article, the code that printed random text like this:

Greetings from Office #1 (where we think PI = 3.14242)

Our office morale is A grade
We synergize with our management overlords

On the 'business intelligence' test, we scored 82.8281%

It was a cute, fun, silly example. Exactly the kind of literally random silliness that can capture the imagination of a beginner and can be easily within their grasp. Could we produce this silly example with randint? Some of it, but not all of it. If it doesn't involve uniformly distributed integers, it won't be easy.

The idea behind the randint proposal seems to be that users starting out with C++ aren't going to want to do anything remotely sophisticated (even if they've previously used Python and its far richer random class). And, likewise, presumably nonbeginners should be willing to tough it out.

randint Is Not a Stepping Stone

A goal of the randint proposal is to “smooth the learning curve” so programmers don't have to “learn the whole design” of the C++ random-number facility all at once. But it actually doesn't help them learn it at all.

What do you learn about the wider facility when you write code like the code shown below?

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled " << std::randint(1, 6) << "\n";

None of the key ideas of C++ random number generation are in play.

But random_generator Is a Stepping Stone

Contrast that randint code, with the following code using the ideas from this blog post:

mt19937_rng   my_rng;

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled " << my_rng.uniform(1, 6) << "\n";

Here you encounter the idea that generators exist as distinct objects. We understand that we're asking for a uniformly distributed number.

We can learn about seeding:

mt19937_rng   my_rng(0);    // Temporarily set to zero during testing

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled " << rng.uniform(1, 6) << "\n";

We can learn about C++ distributions:

mt19937_rng   my_rng;

for (int i = 0; i < 10; ++i)
    std::cout << "Rolled "
              << my_rng.variate<double,std::uniform_int_distribution>(1, 6)
              << "\n";

In general, we're better poised to slowly unfurl all the pieces that make up the C++ random number generation scheme.

randint Blesses a Global Variable

A global RNG isn't in any way necessary if we have a good way to seed newly created RNGs. Like any other case of global variables, putting your RNG in a global variable should be a matter of programmer choice. Some programmers find global variables expedient and convenient, whereas others consider them an anathema.

With the random_generator class template discussed in this post, it's up to you. If you want a global RNG, it's easy, declare it like any other global variable:

mt19937_rng global_rng;

or, if you prefer, a thread-local one,

thread_local mt19937_rng semiglobal_rng;

randint Isn't What C++17 Needs

The motivations for the randint proposal were good. As I said at the start of this post, many people, from beginners to professionals, find the modular structure of the C++ random-number facility unduly cumbersome. But randint is too limited and backward looking. It turns its back on the elegance we already have.

Conclusions

About a month ago, I posted a challenge to the C++ community on reddit—write C++ code equivalent to these two lines of Python:

import random;
for i in range(2,17): print "[0,{}):\t{}".format(i,random.randrange(i))

It was a hard challenge because of the difficulties of getting the seeding right. But even without that issue, it was annoying to code given all the pieces that had to be properly fitted together.

But with the random_generator wrapper we've discussed in this post, the task becomes trivial:

mt19937_rng  rng;
for (int i = 1; i < 16; ++i)
    std::cout << "[0," << i+1 << "]:\t" << rng.uniform(0,i) << "\n";

We have the same kind of ease of use that programmers in other languages take for granted, but without losing any of the power that comes with the modular design of C++'s random-number facility. Truly, the best of all worlds.

But don't just take my word for it, download randutils.hpp and play around. Start having actual fun with random number generation.