Last week I was interested to read about the proposed `math/rand/v2`

for Golang’s standard library. It mentioned a new-ish flavour
of PCG random number generator which I had not previously encountered,
called PCG64 DXSM. This blog post collects what I have learned about
it. (I have not found a good summary elsewhere.)

At the end there is source code for PCG64 DXSM that you can freely copy and use.

Here’s the bit where the author writes their life story, and all the readers skip several screens full of text to get to the recipe.

Occasionally I write randomized code that needs a pseudorandom number generator that is:

- Not gratuitously terrible;
- Seedable, so I can reproduce test cases;
- Does not tank multicore performance by using global mutable state;
- Runs everywhere without portability faff.

The various `libc`

random number generators can satisfy one or two of
my requirements, if I am lucky.

So I grab a copy of PCG and use that. It’s only like 10 lines of code, and PCG’s creator isn’t an arsehole.

PCG random number generators are constructed from a collection of linear congruential generators, and a collection of output permutations.

A linear congruential random number generator looks like:

```
state = state * mul + inc;
```

The multiplier `mul`

is usually fixed; the increment `inc`

can be
fixed, but PCG implementations usually allow it to be chosen when the
RNG is seeded.

A bare LCG is a bad RNG. PCG turns an LCG into a good RNG:

- The LCG state is twice the size of the RNG output
- The LCG state is permuted to produce the RNG output

PCG’s output permutations have abbreviated names like XSH (xor-shift),
RR (random rotate), RXS (random xor-shift), XSL (xor shift right
[*sic*]), etc.

The reference implementation of PCG in C++ allows you to mix and match LCGs and output permutations at a variety of integer sizes. There is a bewildering number of options, and the PCG paper explains the desiderata at length. It is all very informative if you are a researcher interested in exploring the parameter space of a family of random number generators.

But it’s all a bit too much when all I want is a replacement for
`rand()`

and `srand()`

.

For 32-bit random numbers, PCG has a straightforward solution in the
form of `pcg_basic.c`

.

In C++ PCG this standard 32-bit variant is called
`pcg_engines::setseq_xsh_rr_64_32`

or simply `pcg32`

for short.

(There is a caveat, tho: `pcg32_boundedrand_r()`

would be faster if it
used Daniel Lemire’s nearly-divisionless unbiased rejection sampling
algorithm for bounded random numbers.)

For 64-bit random numbers it is not so simple.

There is no 64-bit equivalent of `pcg_basic.c`

. The reference
implementations have a blessed flavour called `pcg64`

, but it isn’t
trivial to unpick the source code’s experimental indirections to get a
10 line implementation.

And even if you do that, you won’t get the best 64-bit flavour, which is:

PCG64 DXSM is used by NumPy. It is a relatively new flavour of PCG,
which addresses a minor shortcoming of the original `pcg64`

that arose
in the discussion when NumPy originally adopted PCG.

In the commit that introduced PCG64 DXSM, its creator Melissa O’Neill describes it as follows:

## DXSM – double xor shift multiply

This is a new, more powerful output permutation (added in 2019). It’s a more comprehensive scrambling than RXS M, but runs faster on 128-bit types. Although primarily intended for use at large sizes, also works at smaller sizes as well.

As well as the DXSM output permutation, `pcg64_dxsm()`

uses a “cheap
multiplier”, i.e. a 64-bit value half the width of the state, instead
of a 128-bit value the same width as the state. The same multiplier is
used for the LCG and the output permutation. The cheap multiplier
improves performance: `pcg64_dxsm()`

has fewer full-size 128 bit
calculations.

O’Neill wrote a longer description of the design of PCG64 DXSM, and the NumPy documentation discusses how PCG64DXSM improves on the old PCG64.

In C++ PCG PCG64 DXSM’s full name is
`pcg_engines::cm_setseq_dxsm_128_64`

. As far as I can tell it doesn’t
have a more friendly alias. (The C++ PCG `typedef`

`pcg64`

still refers
to the previously preferred `xsl_rr`

variant.)

In the Rust `rand_pcg`

crate PCG64 DXSM is called
`Lcg128CmDxsm64`

, i.e. a linear congruential generator with 128 bits
of state and a cheap multiplier, using the DXSM permutation with 64
bits of output.

That should be enough search keywords and links, I think.

OK, at last, here’s the recipe that you were looking for:

```
// SPDX-License-Identifier: 0BSD OR MIT-0
typedef struct pcg64 {
uint128_t state, inc;
} pcg64_t;
uint64_t pcg64_dxsm(pcg64_t *rng) {
/* cheap (half-width) multiplier */
const uint64_t mul = 15750249268501108917ULL;
/* linear congruential generator */
uint128_t state = rng->state;
rng->state = state * mul + rng->inc;
/* DXSM (double xor shift multiply) permuted output */
uint64_t hi = (uint64_t)(state >> 64);
uint64_t lo = (uint64_t)(state | 1);
hi ^= hi >> 32;
hi *= mul;
hi ^= hi >> 48;
hi *= lo;
return(hi);
}
```

The algorithm for seeding PCG takes some raw seed values and conditions them to make a new RNG state that is “not ludicrous” (in the words of Simon Tatham). It is the same for all flavours of PCG:

```
pcg64_t pcg64_seed(pcg64_t rng) {
/* must ensure rng.inc is odd */
rng.inc = (rng.inc << 1) | 1;
rng.state += rng.inc;
pcg64_dxsm(&rng);
return(rng);
}
```

You can pass `pcg64_seed()`

a structure literal containing the raw
seed values, or use it like:

```
pcg_t pcg64_getentropy(void) {
pcg64_t rng;
if (getentropy(&rng, sizeof(rng)) < 0)
err(1, "getentropy");
return (pcg64_seed(rng));
}
```

The code above works on 64-bit systems with `clang`

and `gcc`

, which
have `__uint128_t`

built in.

If you need a C implementation that works on systems without a handy 128-bit integer type, then the PCG64 implementation in NumPy has its own support for 128-bit arithmetic.

Or if you are using C++ then the reference implementation of PCG has another 128-bit arithmetic class.

⇐ 2023-05-28
⇐ Where does "where does my computer get the time from?" come from?
⇐
⇒ Random floating point numbers
⇒ 2023-06-23
⇒