History of the PCG Paper

This site has gone more than two years without an update, but today I'm finally making some updates. It seems appropriate to look back at how we got here, and at some of the things that have happened in the last two years.

If you've been wondering what happened with the PCG paper, this blog post will let you know. Eventually.

Academics don't often write about how papers get written and where their ideas come from, since it's not relevant to the end product, but I think there is value in knowing how things get made, so I'll share the full story of the PCG paper. In hindsight, I think its fate was telegraphed by its origins.

Beginnings

The official date for the PCG paper is September 5, 2014, but research papers do not spring into life fully formed at a specific date. Nor are research processes always quite as orderly or directed as some people think. Sometimes we stumble upon good ideas quite by chance. The PCG scheme is one example of this phenomenon.

Avoiding Midterm Grading Over Thanksgiving Break

The paper's origins (and the origins of the PCG family in general) date back to almost a year earlier: Thanksgiving break of 2013. Looking for an escape from a stack of midterms I had to grade, I ran across a post on the programming-related discussion area of Reddit that linked to the talk “rand() considered harmful” by Stephan T. Lavavej from Microsoft.

In his talk (which had a lot of good advice), Lavavej recommended that people avoid linear congruential generators and use “a good generator like the Mersenne Twister”. I had said pretty much the same thing myself in the past, but this time I saw a fun programming challenge—I wondered if I could use the worst possible generators (a couple of tiny LCGs) to rival the Mersenne Twister. It was then that I first employed “randomgram” diagrams to visualize the success or failure of my approach, and I successfully tested the resulting generator using the Diehard test suite (because at the time I was unaware of the superior TestU01 suite).

Returning from Thanksgiving break, I told a colleague what I'd been up to, showed him my visualizations, and then let the matter drop—vacation recreation achieved (and midterms graded too!).

Avoiding Final-Exam Grading in the Spring

In late May, 2014, I once again found myself with a stack of exams to grade, this time final exams, and any distraction was welcome. I was thinking about my upcoming summer research projects, which would be centered around concurrency and parallelism, and this time the attractive intellectual challenge I set myself was another “turn something bad into something good” exercise: turning the usually problematic nondeterministic behavior of parallel programs into a source of randomness. I wrote a small test program, and then wanted a good test suite to check the output. This time I found TestU01 and was pleased to discover that my idea seemed to work.
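For readers curious what such a test program might look like, here is a minimal sketch of the general idea (not my actual test program; every detail is hypothetical): several threads race to increment a shared counter, and the values each thread happens to observe depend on how the operating system schedules them, so mixing those observations together yields output that varies from run to run.

#include <atomic>
#include <cstdint>
#include <cstdio>
#include <thread>
#include <vector>

int main()
{
    // Hypothetical sketch only: several threads race on a shared atomic
    // counter, and each thread mixes the values it happens to observe into
    // a running hash. The result depends on how the threads interleave.
    std::atomic<std::uint64_t> shared{0};
    std::atomic<std::uint64_t> digest{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t) {
        threads.emplace_back([&] {
            std::uint64_t h = 0;
            for (int i = 0; i < 100000; ++i) {
                std::uint64_t seen = shared.fetch_add(1, std::memory_order_relaxed);
                h = (h ^ seen) * 0x9E3779B97F4A7C15ULL;   // cheap mixing step
            }
            digest.fetch_xor(h, std::memory_order_relaxed);
        });
    }
    for (auto& th : threads)
        th.join();
    std::printf("%016llx\n", (unsigned long long) digest.load());
}

Whether output like this is any good, of course, is exactly the kind of question a statistical test suite is meant to answer.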

I also tested the toy generator I had written for fun over Thanksgiving break, and discovered that it failed TestU01 even though it had passed Diehard (in retrospect, I now know it failed because it had too small a state space to pass). I was disappointed, but the failure led me to realize that the techniques I had used to permute two LCGs could be adapted to permute a single LCG, and with LCGs, one big one is better than two small ones.

My First Talk on Random Number Generation

By that time, the results I had seemed interesting, but it seemed unlikely to me that they were particularly noteworthy. But eager to get some value out of the work, I decided to give a talk to our summer-research students, who were still at an early stage in developing their research skills. Sometimes students fall prey to thinking that research should be like coursework, where there is a clear plan and an obvious path forward. But research typically represents a step into the unknown. Sometimes the path forward seems clearly illuminated, but much is stumbled upon, creating either unexpected advances or unexpected setbacks. As such, curiosity, exploration, and play can prove as valuable as rigor.

Thus on June 20, 2014, I gave a talk, “A Random Tour through the Research Process”, to our cohort of summer research students (and many of their faculty advisors). It contained the first description of the key ideas behind the PCG family. Students enjoyed the talk, and many of my faculty colleagues did, too. Two of them chatted to me afterwards and suggested that I consider writing a paper about my techniques.

I was less convinced that doing so would be worthwhile. At that time, my generation scheme looked like this:

uint64_t state = INIT;

uint64_t nextValue()
{
    // The top four bits of the *previous* state choose a shift of 16..31.
    uint8_t shift = 16 + (state >> 60);
    // Advance the underlying 64-bit LCG.
    state = state * MULT + ADD;
    // Return the high bits of the new state; depending on the shift, the
    // result has anywhere from 33 to 48 significant bits.
    return state >> shift;
}

Although the code looks remarkably similar to current PCG generators and passed statistical tests, I saw a flaw in the design. While it had no significant bias in practice, it was technically not uniform.

But two days after the talk, I realized that a tiny change to the code would lead to a uniform generator. In addition, it was more faithful to my earlier work in combining two LCGs:

uint64_t state = INIT;

uint32_t nextValue()
{
    // Advance the underlying 64-bit LCG first.
    state = state * MULT + ADD;
    // The top three bits of the *new* state choose a shift of 22..29, so the
    // 32 bits returned below never overlap the bits that chose the shift.
    uint8_t shift = 29 - (state >> 61);
    return state >> shift;
}

Thus, June 22 was when the key idea underlying PCG's output functions was born: permutation functions on tuples. (I had actually been using the idea back in November, but had not realized how generally applicable it was!) This point was where I felt like I had something genuinely new to contribute. Scrambling the output of an LCG seemed pretty obvious to me, but this technique for doing so seemed genuinely new.
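To make the uniformity claim concrete, here is a scaled-down analogue (my own illustration, not code from the paper): a 16-bit value standing in for the state and an 8-bit output, small enough that we can exhaustively check that every output value occurs equally often. As in the revised code above, the top bits select the shift and never appear among the output bits.

#include <array>
#include <cstdint>
#include <cstdio>

int main()
{
    // Scaled-down illustration (not from the paper): the top 2 bits of a
    // 16-bit value pick a shift of 3..6, and the 8 output bits are drawn
    // entirely from the low 14 bits, so the selector never leaks into the
    // output.
    std::array<unsigned, 256> counts{};
    for (std::uint32_t value = 0; value < 0x10000; ++value) {
        std::uint32_t shift = 6 - (value >> 14);        // shift in 3..6
        std::uint8_t output = (value >> shift) & 0xFF;  // 8 output bits
        ++counts[output];
    }
    // If the output function is uniform, every 8-bit value occurs exactly
    // 2^16 / 2^8 = 256 times.
    bool uniform = true;
    for (unsigned c : counts)
        if (c != 256)
            uniform = false;
    std::printf("uniform: %s\n", uniform ? "yes" : "no");
}

The same counting argument applies to the 64-bit code above: assuming the LCG constants give it a full period, the 32 output bits never include the three bits that chose the shift, so every 32-bit output occurs equally often over a full period.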

Writing the Paper

Despite feeling like I had a new contribution to share, I was nevertheless a little hesitant about writing a paper on random number generation. It's one thing to stumble upon a neat idea, but it's quite another to claim that it represents an advance over prior work—that requires a much deeper understanding of the overall state of the field in which the work lies. Added to that, I was an outsider in the field of random number generation, with much to learn about prior work before I could make any claims with confidence. But as I explored further, I found that no one seemed to have had the insight I had had in constructing permutation functions, and that many prior schemes were blighted by one deficiency or another, whereas the PCG scheme seemed more balanced.

On June 30, 2014, after a flurry of background reading, I began writing the paper, under the preliminary title “A Simple, Fast, Space-Efficient Algorithm for Random Number Generation”. Less than two weeks later I had a draft I could share with a colleague for critique and feedback. His reaction was positive, so I continued the work. It was not until the end of July that I settled on the name PCG for the generation scheme.

Research Paper or Dissertation?

One recurring concern as I worked on the paper was how it kept growing in size. As July became August and the paper was not yet finished, it was starting to feel more like I was writing a dissertation than a regular paper.

In most areas of computer science, the notion of “quality” is narrower than it is for random number generation: there are only a small number of aspects to optimize, such as the space or time required to run an algorithm, or even just showing that an algorithm or a proof exists. In contrast, random number generation is an area with multiple dimensions to generator quality, all of which matter. From my reading of prior work, it seemed that too many papers in the field focused narrowly on a subset of the desirable qualities, to their detriment, because the resulting generators often had a flaw in some aspect that the authors had not addressed.

Moreover, I prefer to write papers that are broadly accessible. I'd rather write a paper that can be enjoyed by people who are interested in the topic than one that can only be understood by a tiny number of experts. I don't agree with the philosophy that the more impenetrable the paper, the better the work must be! Describing desirable qualities in detail seemed to be necessary for the paper to make sense to anyone not deeply entrenched in the field. Doing so also seemed necessary for anyone in the field who only cared about a subset of the qualities I considered desirable—I would need to convince them that the qualities they usually didn't care about were actually valuable too.

At one point, the draft paper contained this somewhat apologetic text at the end of the Introduction:

A Note on Style, Structure and Length

When a development like this one occurs, you have every reason to be skeptical—quite possibly at some point, you may think “It can't just be that. Come on!”. I thus feel obligated to make the case thoroughly. In addition, by producing a generator that claims to provide essentially all the features people want, it puts me in the position of needing to describe all those features. Both factors conspire to make the paper quite long, but they also make it fairly self contained, and, I hope, accessible to readers from a variety of backgrounds.

A paper of this length could easily be quite dull, so I have endeavored to keep the style engaging. Sometimes, in defiance of academic conventions, I will talk to you directly and speculate about what you might be thinking (as I did in the previous paragraph). Hopefully, for most readers, I will be close to the mark.

Despite the length of the paper, inevitably there must be some elisions—there are far too many random number generators for me to contrast them all. Even the generators I do include are not all discussed in every section.

Finally, following the advice of Peyton Jones [XX], this paper is written to convey not only the final idea, but the intuitions that underlie it. Thus there are many diagrams and visualizations which will hopefully be elucidatory. There are some elements of whimsy too, which I hope you will enjoy when you reach them.

But one of my colleagues who read a draft with that text simply put a line through that section and wrote “there, made it shorter”. His point was that I clearly intended to submit the paper to a journal, and that longer, more accessible papers are the norm for computer-science journals. No apology for length was necessary.

Choosing a Venue

In some areas of computer science, there are clear venues for the work. A new breakthrough in theory should appear at the Annual ACM Symposium on the Theory of Computing (STOC), work in CS education should appear at one of the ACM SIGCSE conferences, programming languages at an ACM SIGPLAN conference, graphics at ACM SIGGRAPH, and so on. And typically lesser breakthroughs can be presented at an affiliated workshop. But there is no ACM SIGRAND for random number generation.

Pierre L'Ecuyer maintains an extensive bibliography on random number generation. When we look back at the last 15 years, we find 54 journal articles, 32 papers in conference proceedings, 16 contributions in books collecting papers on a topic, and 9 tech reports or equivalent.

Of the 32 conference papers, only these venues have two or more papers:

  • Monte Carlo and Quasi-Monte Carlo Methods (7 articles)
  • Winter Simulation Conference (3 articles)
  • Simulation and Modeling Conference (2 articles)
  • Parallel Processing and Applied Mathematics (2 articles)
  • International Conference on High Performance Computing and Simulation (2 articles)

Similarly, of the 54 journal articles, only these venues have two or more papers:

  • Computer Physics Communications (7 articles)
  • Concurrency and Computation: Practice and Experience (3 articles)
  • ACM Transactions on Mathematical Software (3 articles)
  • Numerical Algorithms (2 articles)
  • Nature Photonics (2 articles)
  • Mathematics and Computers in Simulation (2 articles)
  • Journal of Computational and Applied Mathematics (2 articles)
  • Journal of Complexity (2 articles)
  • INFORMS Journal on Computing (2 articles)
  • Computing (2 articles)
  • ACM Transactions on Reconfigurable Technology and Systems (2 articles)

And of the 16 articles appearing in books, only the following books have two or more articles:

  • Handbook of Computational Statistics (3 articles)
  • The Encyclopedia of Operations Research and Management Science (2 articles)
  • International Encyclopedia of Statistical Science (2 articles)

L'Ecuyer's bibliography reflects his specific interests (tilting towards simulation and Monte Carlo methods rather than, say, cryptographic stream ciphers) and is thus not entirely comprehensive, but I think it does provide a sense that there is no single obvious place to send a paper on random number generation, especially a generator family like PCG that targets a broader range of desirable properties than those demanded by any single application area.

In August, 2014, out of the choices available, I picked ACM Transactions on Mathematical Software (TOMS) as probably the best one. Pierre L'Ecuyer had served on its editorial board since 2004, which, given his experience of the field, made it the gold standard for reviewing expertise. It had been the venue for his 2007 paper on TestU01, a lengthy paper that had also recapitulated a number of previously known positive qualities of random number generators (rather than offering no explanation and merely a long list of citations for the reader to chase), critiqued existing generators, described TestU01 from a fairly systems-oriented perspective, and presented new results.

In other words, the TestU01 paper was all the existence proof I needed that TOMS was the right venue, and so in September, 2014, I submitted the paper there.

PCG on the Internet

The 2014–2015 academic year was a planned sabbatical year, and one of the things on my to-do list was revamping my personal website, switching to a more modern design and better content-management tools. So it seemed only a small amount of extra effort to build this website as well.

I launched the PCG website in August, 2014, and added the paper to the website the day it was submitted. On November 11, 2014, it was linked to from Hacker News and on November 18, 2014 it was discussed on Reddit. The website and paper also got mentioned numerous times on Twitter, Facebook, and other social media.

The feedback about the paper from Internet discussions was particularly helpful. I learned of several random number generators that I had not explicitly mentioned in the paper and perhaps should have, and also that I should have been more careful in arguing that trivial predictability is a bad thing. In particular, it is dangerous to use terms like “cryptographic security” when trying to express a scale of quality with respect to predictability (with trivial predictability at the bottom end), because people will see the word “cryptographic” and misunderstand what is being claimed, even if the text makes it plain that I do not propose PCG for use in cryptography. I didn't feel able to retroactively change the paper to make matters plainer, but I did adjust the website to stick to the less loaded term, “predictability”.

As a result of the Hacker News article, I got invited to give a colloquium talk about the scheme at Stanford. It went well, resulting in this video.

As time passed and I waited to hear back from TOMS, I worked on other aspects of random number generation—including creating this blog. Here I could discuss broader issues related to random number generation, including the random-number facilities in C++. Over several blog posts, I developed randutils (C++ source). At the suggestion of several people in the C++ standards community and with the generous help of Tim Song, that work eventually developed into C++ proposal P0347R1, a variant of which may yet make its way into a future C++ standard. (Being on sabbatical, I also worked on other projects besides random number generation, including two papers for TRANSACT 2015.)

While attending PLDI and TRANSACT in June of 2015, I got one of the first clues that my work had had real impact. I can't remember the talk or the paper, but someone was describing how their results had been much improved over prior work by switching to a new, better random number generator. At the end, I asked which one. It was PCG. It was gratifying to see that my work had had a genuine positive impact on one of my academic peers.

TOMS Response

It wasn't until July 21, 2015, 320 days after I had submitted the paper (and 21 days after my sabbatical was over!), that I finally heard back from the folks at ACM Transactions on Mathematical Software (TOMS). Unfortunately, the determination was that the paper was not a good fit for the journal in its current form.

Although the consensus was that the paper was well written, the reviewers considered it far too long, seeming more like a dissertation than a research paper. The fundamental error I made was thinking that TOMS followed the norms of a computer-science journal because ACM publishes it; in reality, I should have thought of it as a math journal, where the usual norms for CS journal articles (especially regarding length and audience) don't apply. Their publication of L'Ecuyer's paper on TestU01 was an outlier, not a sign that systems-oriented papers that similarly mixed review material and new results would be welcome.

The reviewers did invest time in writing detailed comments which were much appreciated, from listing parts of the literature that I had missed, to highlighting areas of the paper that had not been clear to them (which differed between reviewers), to quibbles over terminology (which also differed between reviewers). All of these comments would be useful if I were to revise the paper, although many of them echoed reactions I'd seen months earlier when the paper was first discussed by Internet readers.

One issue Internet readers had already revealed was that reviewers did not react well to my use of the minefield phrase “cryptographic security” when trying to outline a spectrum of predictability and place PCG at a point on it: not trivially predictable, but not proven or recommended for cryptographic use. But it was even more starkly clear that I had failed to make a case that predictability is a bad thing for non-cryptographic PRNGs. To some reviewers, the only uses of PRNGs worth considering are modelling/simulation or cryptography; predictability matters little in the first case and massively in the second. I see other uses of PRNGs between these poles, but I had not outlined the case well enough, providing just a single sentence and citing the literature, assuming that doing so would be sufficient. I should have said more.

Beyond my occasional failures to make points clearly enough for one reviewer or another, no major flaws were discovered in the review process, which was one positive aspect of it. Given the scrutiny the paper had already received from being available on the Internet and widely read there, that wasn't a huge surprise, but it was nevertheless good to see.

What was more interesting were the ways in which the TOMS reviewing differed from the paper's Internet reception. Some reviewers found my style of exposition enjoyable, but others found it too leisurely and inappropriately relaxed. Although I felt that my paper echoed the TestU01 paper in pointing out flaws in existing generation schemes, some reviewers felt that in doing so I was trying to (inappropriately!?) cast my own work in the best possible light. I don't quite understand that, because a flaw is a flaw.

An additional difference from the Internet reaction was that some of the TOMS reviewers felt that what I'd done just wasn't very mathematically sophisticated and was thus trivial/uninteresting.

Finally, few Internet readers had complained that the paper was too long but, as I mentioned earlier, the length of the paper was a theme throughout all the reviewing.

Regarding that latter point, I am, on reflection, unrepentant. I wanted to write something that was broadly accessible, and based on other feedback I succeeded.

It's also not clear to me how I could have shortened it to please the reviewers. Although each reviewer thought I spent far too much time belaboring the obvious, each had a different take on exactly which parts were the obvious ones, and each also thought I had spent too little time on some aspect they themselves cared about!

In modern conference reviewing systems like HotCRP and EasyChair, reviewers can read each other's reviews after they have submitted their own, which helps to build consensus on what is important. Based on the variance among the TOMS reviews, it's not clear to me whether the reviewers saw each other's reviews or not.

But even though I might have liked a different outcome, a key truth in academia is that your reviewers are always right, even if they're wrong. If your reviewer has an objection to your work, it means that you didn't lay the groundwork to silence that objection.

Sometimes, it also means you sent the work to the wrong venue. In the PCG paper, I characterized the field of random number generation as one of islands. In sending the paper to TOMS, I had sent it to math island.

Aftermath

Although the paper was well received on the Internet, the TOMS reviewers were right that in many ways it was too much like a dissertation; I had felt that myself when I was writing it. It is common for academics to write a long dissertation and then parcel it out in bite-sized chunks as papers over the first few years of their careers. I'm well aware that there are many ideas in the PCG paper. In theory, it's possible to break it up into multiple shorter papers (e.g., on testing, on approximating the birthday problem, on the behavior of ideal uniform random number generators, on characterizing TestU01, on permutation functions on tuples, and on the PCG family itself).

Or I could apply what I learned from the review process and resubmit the same long paper somewhere else. But if I were to do that, it would undoubtedly be to another journal and I might easily find myself waiting another year to hear the result of that review process.

But as it turns out, thus far I have done neither. Given Mudd's focus on excellence in teaching, I find that the summers and sabbaticals are the best times for me to make significant headway on research, and by the time I received the review results from TOMS, there was less than a month of summer left, and I had to wrap up the summer research projects already underway and gear up for teaching duties for the fall. In the fall of 2015 my colleagues elected me department chair, and I made a conscious choice to focus more on my duties to the department than on a research side project.

Lack of time also contributed to my neglect of this website and blog, but I think it was also that, after hearing back from TOMS, my next post would have to be a discussion of the outcome of the review, and that too would require more time and energy than I had. No academic likes to see their work not get accepted for publication, and even though I understood the rationale, writing a post (this one!) to explain what had happened would be a chore, especially as I would have to try to balance the worry that people might see the result as an indictment of the technique (it wasn't) against the worry that I might seem unduly defensive in the face of criticism (hopefully I'm not). Determining how to navigate that path required more energy than I had at the time, so everything on the website stalled.

With this piece finally written, I hope the stream of blog posts will resume (I have plenty more left to write on a variety of topics!), although I'm still department chair and still have limited time.

Final Thoughts

Although academics like to be published in well-respected venues, fundamentally what matters is whether the work is useful to others. The video of my talk at Stanford has been watched by thousands of people and the paper has also been downloaded many thousands of times. Although I'm sure that not every person who started either the video or the paper finished the whole thing, I do know that many people have and some good has been done.

Not being published in TOMS has not prevented people from citing the paper. A search on Google Scholar reveals ten papers that cite it, and numerous languages have provided PCG generators as an alternative to the default generators or even made a PCG generator their default (e.g., crystal and chapel).

To make it easier to cite, the paper page of this website now lists the technical-report version of the paper, rather than the version submitted to TOMS. The two are identical except for details of font choice and pagination.

No doubt this post was too long (more than 4000 words!) and contained far more exposition than was actually necessary. A short note might have sufficed. Oh well, you'll just have to get over that.