Simple Portable C++ Seed Entropy

In two recent posts, I've looked at seeding random number generators in C++11 (looking at what's wrong with std::seed_seq and developing something that avoids those flaws), but seed_seq exists as a mechanism to “mix up” entropy that you give it. You still need to get that entropy from somewhere. So where?

Sadly std::random_device Is Not Your Friend

The obvious source for external randomness we can use in seeding is std::random_device. But as I mentioned in this post,

  • It's hard to make std::random_device conform to the requirements of a seed sequence.
  • It's unspecified just how costly this “device” is to read from.
  • It may not be nondeterministic at all, rendering it unfit for our purposes.

Portable code needs to look to other sources of entropy for RNG seeding. And even if std::random_device works well, mixing in entropy from other sources can have a variety of benefits, including performance.
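
As a rough illustration of what “other sources” might look like, here is a minimal sketch that gathers a few words from the clock, from memory addresses (which can vary from run to run under address-space layout randomization), and from a call counter. The name gather_entropy_words and the particular choice of sources are assumptions made for this example, not a list taken from the post, and the raw words would still need to be mixed before being used as a seed.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical helper: collect a handful of 32-bit words from sources
// that tend to differ between runs. None of these is high-quality
// entropy on its own; the idea is only to have something to mix.
inline std::vector<std::uint32_t> gather_entropy_words()
{
    std::vector<std::uint32_t> words;

    // Split a 64-bit value into two 32-bit words.
    auto push64 = [&words](std::uint64_t v) {
        words.push_back(static_cast<std::uint32_t>(v));
        words.push_back(static_cast<std::uint32_t>(v >> 32));
    };

    // 1. The current time, at whatever resolution the clock offers.
    auto now = std::chrono::high_resolution_clock::now();
    push64(static_cast<std::uint64_t>(now.time_since_epoch().count()));

    // 2. The address of a local variable (stack placement).
    int local = 0;
    push64(static_cast<std::uint64_t>(
        reinterpret_cast<std::uintptr_t>(&local)));

    // 3. The address of a static variable (data-segment placement).
    static const char anchor = 0;
    push64(static_cast<std::uint64_t>(
        reinterpret_cast<std::uintptr_t>(&anchor)));

    // 4. A per-process call counter, so repeated calls always differ.
    static std::atomic<std::uint32_t> counter{0};
    words.push_back(counter.fetch_add(1));

    return words;
}
```

Each call returns seven words; how much real unpredictability they carry depends heavily on the platform, which is why they should be treated as a supplement rather than a guarantee.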

Read more…

Developing a seed_seq Alternative

In my previous post, I talked about issues with C++11's std::seed_seq. As a reminder, the goal of std::seed_seq is to generate seed data that can be used to initialize a random number generator. To do its job, it needs seed data. Seed data in, seed data out. It might seem like a “do nothing” function would be up to the task, but in many situations the seed data you can provide is of variable quality. Some of it is “good” entropy, which changes often, and some of it is “lower quality” entropy, which is more predictable or changes less frequently (e.g., the machine id). It also may be that the data that changes often (e.g., the time) changes more in the low-order bits than the high-order bits. So the goal of std::seed_seq is to mix together the seed data that we provide so that the high-quality and low-quality data are well mixed.

In essence, the task that std::seed_seq does (and thus anything intending to replace it should do) is compute a hash (i.e., a scrambled version) of its input data. The design of hash functions is often (rightly) considered to be something of a black art, but since the same can be said of designing random number generators and I've already done that, I might as well tackle this problem, too.
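
To make the “hash of its input data” idea concrete, here is a minimal sketch of a mixing function. It is a stand-in, not the design developed in the post: it borrows the 32-bit finalizer from MurmurHash3 (fmix32) as the scrambling step, and the name mix_seed_words and the additive constant are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// The 32-bit finalizer from MurmurHash3. It is a bijection on 32-bit
// values, and a change in any input bit tends to affect many output bits.
inline std::uint32_t fmix32(std::uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

// Hypothetical mixer: absorb every input word into a running state,
// then squeeze out as many scrambled words as the caller asks for.
inline std::vector<std::uint32_t>
mix_seed_words(const std::vector<std::uint32_t>& input, std::size_t n_out)
{
    const std::uint32_t step = 0x9e3779b9u;  // arbitrary odd constant
    std::uint32_t state = step;

    // Absorb: each word perturbs the state, which is then rescrambled.
    for (std::uint32_t word : input)
        state = fmix32(state ^ word) + step;

    // Squeeze: keep scrambling to produce the requested output words.
    std::vector<std::uint32_t> output;
    output.reserve(n_out);
    for (std::size_t i = 0; i < n_out; ++i) {
        state = fmix32(state + step);
        output.push_back(state);
    }
    return output;
}
```

With a 32-bit running state, this sketch can never carry more than 32 bits of entropy from input to output, which is one reason a serious replacement needs more care than this.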

Read more…

C++ Seeding Surprises

Properly seeding random number generators doesn't always get the attention it deserves. Quite often, people do a terrible job, supplying low-quality seed data (such as the system time or process id) or combining multiple poor sources in a half-baked way. C++11 provides std::seed_seq as a way to encourage the use of better seeds, but if you haven't thought about what's really going on when you use it, you may be in for a few surprises.

In contrast to C++11, some languages, such as popular scripting languages like JavaScript, Python, or Perl, take care of good seeding for you (provided you're using their built-in RNGs). Today's operating systems have a built-in source of high-quality randomness (typically derived from the sequencing of unpredictable external and internal events), and so the implementations of these languages simply lean on the operating system to produce seed data.

C++11 provides access to operating-system–provided randomness via std::random_device, but, strangely, it isn't easy to use it directly to initialize C++'s random number generators. C++'s supplied generators only allow seeding with a std::seed_seq or a single integer, nothing else. This interface is, in many respects, a mistake, because it means that we are forced to use seed_seq (the “poor seed fixer”) even when it's not actually necessary.
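
A minimal sketch of that detour looks like the following: the values from std::random_device cannot be handed to the engine directly, so they go into a buffer, the buffer goes into a std::seed_seq, and the std::seed_seq goes into the engine. The helper names and the choice of eight 32-bit words are assumptions made for this example.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical helper: build an engine from a range of seed words.
// The standard engines accept a seed sequence (or a single integer),
// so the words have to be wrapped in std::seed_seq first.
template <typename Iter>
std::mt19937 engine_from_words(Iter first, Iter last)
{
    std::seed_seq seq(first, last);
    return std::mt19937(seq);
}

// Hypothetical helper: pull eight words from std::random_device and
// pass them through the seed_seq route above.
inline std::mt19937 engine_from_random_device()
{
    std::random_device rd;
    std::array<std::uint32_t, 8> seed_data;
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));
    return engine_from_words(seed_data.begin(), seed_data.end());
}
```

Calling engine_from_random_device() yields a std::mt19937 ready for use with the standard distributions, but note that the seed data has passed through std::seed_seq on the way, whether or not it needed fixing.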

In this post, we'll see two surprising things:

  1. Low-quality seeding is harder to “fix” than you might think.
  2. When std::seed_seq tries to “fix” high-quality seed data, it actually makes it worse.

Read more…

A Place for News

When I started looking for tools to create websites, I found there were many options. Many of those options were extremely complex, designed for complex authoring needs (e.g., dynamic content, web-based story submission, multiple authors, etc.), and required significant infrastructure. All I wanted was a simple static site, so these solutions were overkill.

A simple file-based static site generator was the obvious fit, but most of the options in that space were focused on blogging, although they often claimed that you didn't have to use them that way. Nikola seemed like the best/easiest choice, with clear instructions for how to disable its blogging functions.

But as time has passed since creating the site back in October of last year, I've come to realize that I need a place for news, announcements, and random musings about randomness that don't easily fit elsewhere on the site. In other words, it needs a blog.

So, today I adjusted Nikola's configuration once again, and with just a few lines of changes, ta-da, we have a blog. Well, actually, initially it was a horrible-looking blog because the blog page templates were terrible, but with some changes to the templates (following a strategy of expedience over elegance), I managed to create a more-or-less acceptable layout.

Anyway, watch this space for more…