Removing the Linux /dev/random blocking pool

By Jake Edge
January 7, 2020

The random-number generation facilities in the kernel have been reworked somewhat over the past few months, but problems in that subsystem have been addressed over an even longer time frame. The most recent changes were made to stop the getrandom() system call from blocking for long periods of time at system boot, but the underlying cause was the behavior of the blocking random pool. A recent patch set would remove that pool, and it appears to be headed for the mainline kernel.

Andy Lutomirski posted version 3 of the patch set toward the end of December. It makes "two major semantic changes to Linux's random APIs". It adds a new GRND_INSECURE flag to the getrandom() system call (though Lutomirski refers to it as getentropy(), which is implemented in glibc using getrandom() with fixed flags); that flag would cause the call to always return the amount of data requested, but with no guarantee that the data is random. The kernel would just make its best effort to give the best random data it has at that point in time. "Calling it 'INSECURE' is probably the best we can do to discourage using this API for things that need security."
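As a rough illustration of the proposed flag, the sketch below asks for best-effort bytes from Python. Python's os module wraps getrandom() on Linux but defines no GRND_INSECURE constant, so the value 0x0004 is an assumption taken from the proposed patch, and best_effort_bytes() is a hypothetical helper name. Kernels that lack the flag reject it with EINVAL, in which case the helper reads /dev/urandom, which likewise never blocks.

```python
import errno
import os

# Assumption: flag value from the proposed patch; Python's os module does not define it.
GRND_INSECURE = 0x0004


def best_effort_bytes(n):
    """Return n bytes without ever blocking; not suitable for keys or other secrets."""
    getrandom = getattr(os, "getrandom", None)  # only present on Linux builds of Python
    if getrandom is not None:
        try:
            buf = b""
            while len(buf) < n:  # large requests can come back short
                buf += getrandom(n - len(buf), GRND_INSECURE)
            return buf
        except OSError as exc:
            # EINVAL: the kernel does not know the flag; ENOSYS: no getrandom() at all.
            if exc.errno not in (errno.EINVAL, errno.ENOSYS):
                raise
    # Fallback: /dev/urandom also returns data before the CRNG is initialized.
    with open("/dev/urandom", "rb", buffering=0) as dev:
        buf = b""
        while len(buf) < n:
            buf += dev.read(n - len(buf))
        return buf


if __name__ == "__main__":
    print(best_effort_bytes(16).hex())
```

As the flag's name suggests, output obtained this way is not meant for keys or anything else that needs security.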

The patches also remove the blocking pool. The kernel currently maintains two pools of random data, one that corresponds to /dev/random and another for /dev/urandom, as described in this 2015 article. The blocking pool is the one for /dev/random; reads to that device will block (thus the name) until "enough" entropy has been gathered from the system to satisfy the request. Further reads from that file will also block if there is insufficient entropy in the pool.

Removing the blocking pool means that reads from /dev/random behave like getrandom() with a flags value of zero (and turns the GRND_RANDOM flag into a no-op). Once the cryptographic random-number generator (CRNG) has been initialized, reads from /dev/random and calls to getrandom(..., 0) will not block and will return the requested amount of random data. Lutomirski said:

I believe that Linux's blocking pool has outlived its usefulness. Linux's CRNG generates output that is good enough to use even for key generation. The blocking pool is not stronger in any material way, and keeping it around requires a lot of infrastructure of dubious value.

The changes were made with an eye toward ensuring that existing programs are not really affected; in fact, the problems with long waits for things like generating GnuPG keys will get better.

This series should not break any existing programs. /dev/urandom is unchanged. /dev/random will still block just after booting, but it will block less than it used to. getentropy() with existing flags will return output that is, for practical purposes, just as strong as before.
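The semantics described above can be sketched from user space with Python's os.getrandom(), a thin wrapper around the system call on Linux. The helper names are illustrative only, and the comments describe the behavior the patch set proposes; on kernels without the change, a read from /dev/random can still block long after boot.

```python
import os


def random_bytes(n):
    """getrandom(n, 0): may wait for CRNG initialization, but not afterward."""
    buf = b""
    while len(buf) < n:  # large requests can come back short
        buf += os.getrandom(n - len(buf), 0)
    return buf


def random_bytes_nowait(n):
    """GRND_NONBLOCK: return None instead of waiting for initialization."""
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:  # EAGAIN: the CRNG is not initialized yet
        return None


def random_bytes_from_device(n, path="/dev/random"):
    """With the blocking pool removed, this is meant to behave like random_bytes()."""
    buf = b""
    with open(path, "rb", buffering=0) as dev:
        while len(buf) < n:
            buf += dev.read(n - len(buf))
    return buf


if __name__ == "__main__":
    print(random_bytes(16).hex())
    print(random_bytes_nowait(16))
```

A caller that must never stall, such as code running in early boot, would use the GRND_NONBLOCK variant and decide for itself what to do when None comes back.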

Lutomirski noted that there is still the open question of whether the kernel should provide so-called "true random numbers", which is, to a certain extent, what the blocking pool was meant to do. He can only see one reason to do so: "compliance with government standards". He suggested that if the kernel were to provide that, it should be done through an entirely different interface—or be punted to user space by providing a way for it to extract raw event samples that could be used to create such a blocking pool.

Stephan Müller suggested that his Linux random-number generator (LRNG) patch set (now up to version 26) might be a way to provide true random numbers for applications that need them. The LRNG is "fully compliant to SP800-90B requirements", which makes it a solution to the governmental-standards problem. Matthew Garrett objected to the term "true random data", noting that the devices being sampled could, in principle, be modeled accurately enough to make them predictable: "We're not sampling quantum events here." Müller said that the term comes from the German AIS 31 standard to describe a random-number generator that only produces output "at an equal rate as the underlying noise source produces entropy".

Beyond the terminology, though, having a blocking pool as is proposed by the LRNG patches will just lead to various problems, at least if it is available without privilege, Lutomirski said:

This doesn’t solve the problem. If two different users run stupid programs like gnupg, they will starve each other.

As I see it, there are two major problems with /dev/random right now: it’s prone to DoS (i.e. starvation, malicious or otherwise), and, because no privilege is required, it’s prone to misuse. Gnupg is misuse, full stop. If we add a new unprivileged interface, gnupg and similar programs will use it, and we lose all over again.

Müller noted that the addition of getrandom() will now allow GnuPG to use that interface since it will provide the needed guarantee that the pool has been initialized. From discussions with GnuPG maintainer Werner Koch, Müller believes that guarantee is the only reason GnuPG currently reads directly from /dev/random. But if there is an unprivileged interface that is subject to denial of service (like /dev/random today), it will be misused by some applications, Lutomirski asserted.

Theodore Y. Ts'o, who is the maintainer of the Linux random-number subsystem, appears to have changed his mind along the way about the need for a blocking pool. He said that removing that pool would effectively get rid of the idea that Linux has a true random-number generator (TRNG), which "is not insane; this is what the *BSD's have always done". He, too, is concerned that providing a TRNG mechanism will just serve as an attractant for application developers. He also thinks that it is not really possible to guarantee a TRNG in the kernel, given all of the different types of hardware supported by Linux. Even making the facility only available to root will not solve the problem:

Application programmers would give instructions requiring that their application be installed as root to be more secure, "because that way you can get access the _really_ good random numbers".

Müller asked if Ts'o was giving up on the blocking pool implementation that he had added long ago. Ts'o agreed that he was; he is planning to take the patches from Lutomirski and is pretty strongly opposed to adding a blocking interface back into the kernel.

The kernel can't offer up any guarantees about whether or not the noise source has been appropriately characterized. All say, a GPG or OpenSSL developer can do is get the vague sense that TRUERANDOM is "better" and of course, they want the best security, so of *course* they are going to try to use it. At which point it will block, and when some other clever user (maybe a distro release engineer) puts it into an init script, then systems will stop working and users will complain to Linus.

For cryptographers and others who really need a TRNG, Ts'o is also in favor of providing them a way to collect their own entropy in user space to use as they see fit. Entropy collection is not something that the kernel can reliably do on all of the different hardware that it supports, nor can it estimate the amount of entropy provided by the different sources, he said.

The kernel shouldn't be mixing various noise sources together, and it certainly shouldn't be trying to claim that it knows how many bits of entropy that it gets when [it] is trying to play some jitter entropy game on a stupid-simple CPU architecture for IOT/Embedded user cases where everything is synchronized off of a single master oscillator, and there is no CPU instruction reordering or register renaming, etc., etc.

You can talk about providing tools that try to make these estimations --- but these sorts of things would have to be done on each user's hardware, and for most distro users, it's just not practical.

So if it's just for cryptographers, then let it all be done in userspace, and let's not make it easy for GPG, OpenSSL, etc., to all say, "We want TrueRandom(tm); we won't settle for less". We can talk about how do we provide the interfaces so that those cryptographers can get the information they need so they can get access to the raw noise sources, separated out and named, and with possibly some way that the noise source can authenticate itself to the Cryptographer's userspace library/application.

There was a bit of discussion about how that interface might look; there may be security implications for some of the events, for example. Ts'o noted that the keyboard scan codes (i.e. the keys pressed) are mixed into the pool as part of the entropy collection. "Exposing this to userspace, even if it is via a privileged system call, would be... unwise." It does seem possible that other event timings could provide some kind of side-channel information leak as well.

So it would seem that a longtime feature of the Linux random-number subsystem is on its way out. Given the changes that the random-number subsystem has undergone recently, it effectively was only causing denial-of-service problems when it was used; there are now better ways to get the best random numbers that the kernel can provide. If a TRNG is still desired for Linux, that lack will need to be addressed in the future, but likely will not be done within the kernel itself.

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 11:06 UTC (Tue) by cesarb (subscriber, #6266) [Link]

I worry about using the same pool for the "block until fully initialized" and the "never blocks" interfaces. Suppose the pool has not been fully initialized yet, and has only four bits of entropy, which means it has only sixteen possible states; a malicious program can call the "never blocks" interface, and do a brute force search for the pool state. Then we add another four bits of entropy, the malicious program calls the "never blocks" interface again, and given the previous state (which the malicious program knows) there are again only sixteen possible states, which it can brute force search. Repeat until the pool is filled; while it's now supposed to have a number of possible states so huge that brute forcing the pool state is not viable (which is why we can allow drawing from it infinitely without waiting), the malicious program which observed it while it was being filled never had to brute force too much to know the full pool state.

(The solution would probably be to still have two pools during initialization, one which is temporarily used only for the "never blocks" interfaces and is thrown away after the other one is fully initialized.)

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 12:37 UTC (Tue) by hkario (subscriber, #94864) [Link]

are you sure the kernel mixes in just single bits? I thought that it mixed in the full input, it just credited it with few bits of entropy.

and of course, the sources are underestimated for the amount of entropy provided

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 13:34 UTC (Tue) by cesarb (subscriber, #6266) [Link]

The entropy of the input matters. If the input has only 15 bits of unpredictability, it doesn't matter how large it is, how much it's mixed or what the entropy estimation was, there are only 32768 possible states (given an empty or known pool). If one reads 15 or more bits from the pool after that input, but before the next input, and the previous pool state is known, the current pool state can be guessed by trying the 32768 possible different values for the input, and seeing which one produces the output which was just read.
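The search described here is easy to demonstrate on a toy model. The sketch below uses SHA-256 as a stand-in for both the mixing and the output function, which is an assumption made purely for illustration and not the kernel's actual pool construction; mix(), output(), and recover_sample() are hypothetical names. An attacker who knows the previous state and sees a few output bytes recovers the new state after at most 32768 guesses.

```python
import hashlib
import secrets


def mix(state, sample):
    """Toy mixing step: fold a sample into the pool state (not the kernel's algorithm)."""
    return hashlib.sha256(state + sample).digest()


def output(state, n=8):
    """Toy output function: derive n bytes from the state without exposing it directly."""
    return hashlib.sha256(b"output" + state).digest()[:n]


def recover_sample(known_state, observed, bits=15):
    """Try every possible sample value; return (value, new_state) for the one that matches."""
    for guess in range(1 << bits):
        candidate = mix(known_state, guess.to_bytes(2, "big"))
        if output(candidate, len(observed)) == observed:
            return guess, candidate
    return None


# The pool starts in a state the attacker knows (here: all zeros).
known_state = bytes(32)

# One input with only 15 bits of unpredictability is mixed in.
secret_sample = secrets.randbelow(1 << 15)
real_state = mix(known_state, secret_sample.to_bytes(2, "big"))

# The attacker reads eight bytes of output before the next input arrives.
leak = output(real_state)
guess, recovered_state = recover_sample(known_state, leak)

if __name__ == "__main__":
    print("sample recovered:", guess == secret_sample)
    print("state recovered: ", recovered_state == real_state)
```

The size of the input and the strength of the hash do not help; only the number of values the input could have taken bounds the attacker's work.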

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 16:44 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

But what would happen after you add 40 more bits into the pool?

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 16:43 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

That's not exactly how it works.

You can't reconstruct the state of the pool without storing all possible intermediate results. So if you want to reconstruct the pool state after 32 bits of entropy were added, you'd need a lookup table of at least 4Gb in size.

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 20:41 UTC (Tue) by nivedita76 (guest, #121790) [Link]

That is the point of how state extension attacks work. You may have added 32 bits of entropy, but if you added them 1 bit at a time while the attacker was reading the output of your RNG, you've lost.
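The cost difference can be made concrete with a small simulation. This is a sketch under the same kind of toy assumptions as any such model: SHA-256 stands in for the pool's mixing and output functions, and track_pool() and simulate() are hypothetical names. The attacker reads output after every one-bit addition, so following the pool through 32 bits of entropy takes at most 64 guesses rather than a search over 2^32 states.

```python
import hashlib
import secrets


def mix(state, sample):
    """Toy mixing step (not the kernel's algorithm)."""
    return hashlib.sha256(state + sample).digest()


def output(state):
    """Toy output function: eight bytes derived from the state."""
    return hashlib.sha256(b"output" + state).digest()[:8]


def track_pool(initial_state, leaks, bits_per_step):
    """Follow the pool state using one observed output per entropy addition."""
    state = initial_state
    guesses = 0
    for leak in leaks:
        for value in range(1 << bits_per_step):
            guesses += 1
            candidate = mix(state, bytes([value]))
            if output(candidate) == leak:
                state = candidate
                break
        else:
            raise ValueError("no candidate matched the observed output")
    return state, guesses


def simulate(steps=32, bits_per_step=1):
    """Add entropy in small increments while the attacker watches; return (success, guesses)."""
    state = bytes(32)  # initial state known to the attacker
    leaks = []
    for _ in range(steps):
        state = mix(state, bytes([secrets.randbelow(1 << bits_per_step)]))
        leaks.append(output(state))  # attacker reads between additions
    recovered, guesses = track_pool(bytes(32), leaks, bits_per_step)
    return recovered == state, guesses


if __name__ == "__main__":
    ok, guesses = simulate()
    print(f"state recovered: {ok}, guesses used: {guesses}, brute force: {2 ** 32}")
```

Calling simulate(steps=8, bits_per_step=4) reproduces the four-bits-at-a-time scenario from the first comment in this thread, with at most 128 guesses in total.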

https://lwn.net/ml/linux-kernel/20190919143427.GQ6762@mit...

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 13:45 UTC (Tue) by kooky (subscriber, #92468) [Link]

I kind of agree.

In my current job (system admin, database programmer), I've had so many problems over the years caused by /dev/random blocking. From CGI type programs which became fork rate limited, to java/tomcat not starting on virtual machines.

I solved the problem years ago by installing Entropykeys in every machine. Now I do the same with chaoskeys.

I'm not even sure if chaoskeys will actually do anything useful under the new system?

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 19:14 UTC (Tue) by nix (subscriber, #2304) [Link]

> I'm not even sure if chaoskeys will actually do anything useful under the new system?

They're still mixing more entropy in, even if the kernel no longer bothers to block reads if there is insufficient entropy (after initialization). (AIUI, it can still block *additions* of entropy when there *is* believed to be sufficient entropy in the pool, so things like the chaoskey don't needlessly eat CPU time throwing entropy into the pool when it already probably has lots and nobody's using any of it.)

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 15:19 UTC (Tue) by scientes (guest, #83068) [Link]

Surprised no one has posted this yet: https://dilbert.com/strip/2001-10-25

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 15:52 UTC (Tue) by matthias (subscriber, #94967) [Link]

And of course, this one also has to be posted: https://xkcd.com/221/

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 15:29 UTC (Tue) by gebi (guest, #59940) [Link]

Entropy is a very sad story under linux (not directly kernel related!), imho needlessly so.

First with haveged before rdrand became a thing and for vmware environments where rdrand is still not forwarded for various customer reasons.
Then after the haveged period we would have used rng-tools from debian, but debian insists on its own fork of rng-tools which did not support rdrand, thus back to a home-grown solution (https://github.com/mgit-at/rngd-rdrand, thx Ben Jencks :)! we just cleaned it up a bit and packaged it for debian).

After that and for reasons like this we switched mostly to ubuntu and there it's just installing rng-tools (which is version 5 and supports rdrand) and be done.
On newer debian systems one can now install rng-tools5 and also get the new upstream instead of the debian fork.

In the end just have a small daemon running using rdrand to seed the pool and forget about all the problems.
Except when they come back because of performance reasons and bite you.

e.g: From a user point of view it would be nice to have a _fast_ /dev/random that never blocks, and delivers more than 1MB/s, but at least if it never blocks it's ok. But even the 160-200MB/s (1k vs 1M block size) of /dev/urandom could be faster (but for basic tasks it's ok :).
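Throughput figures like these depend heavily on the kernel version, the hardware, and the read size, so they are best measured locally. The following is a minimal sketch of such a measurement; urandom_throughput() is a hypothetical helper, and the 8MB default is an arbitrary choice to keep the run short.

```python
import time


def urandom_throughput(block_size, total_bytes=8 * 1024 * 1024, path="/dev/urandom"):
    """Read total_bytes from path in block_size chunks and return the rate in MB/s."""
    remaining = total_bytes
    with open(path, "rb", buffering=0) as dev:  # unbuffered: one read() per chunk
        start = time.perf_counter()
        while remaining > 0:
            chunk = dev.read(min(block_size, remaining))
            if not chunk:
                raise OSError("unexpected end of file reading " + path)
            remaining -= len(chunk)
        elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1e6


if __name__ == "__main__":
    for size in (1024, 1024 * 1024):  # the 1k and 1M block sizes mentioned above
        print(f"{size:>8}-byte reads: {urandom_throughput(size):.0f} MB/s")
```

Small reads pay the system-call overhead more often, which is one plausible reason for the gap between the two block sizes.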

Removing the Linux /dev/random blocking pool

Posted Jan 8, 2020 8:19 UTC (Wed) by k3ninho (subscriber, #50375) [Link]

rdrand is all good until you buy a CPU that answers the call with 0x00000000...0 [1].

1: AMD's Ryzen 3 does, see https://arstechnica.com/gadgets/2019/10/how-a-months-old-...

K3n.

Removing the Linux /dev/random blocking pool

Posted Jan 8, 2020 13:30 UTC (Wed) by gebi (guest, #59940) [Link]

For all practical purposes and when using rng-tools(5) it does not matter.
The injected entropy is first encrypted with aes and rng-tools(5) has support for both rdrand and rdseed.

Removing the Linux /dev/random blocking pool

Posted Jan 8, 2020 15:44 UTC (Wed) by leromarinvit (subscriber, #56850) [Link]

But a stream of encrypted 0's (or 0xFF, as seems to be the case here) still adds precisely as much entropy as the key has to the pool. And if a buggy RDRAND is your only source of "entropy" (I know that shouldn't happen in practice), then the entire stream of random numbers is trivially predictable.
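This point can be illustrated with a keyed function applied to a stuck source. Python's standard library has no AES, so HMAC-SHA256 stands in for the cipher below; how rng-tools actually keys its AES step is not described in this thread, so the key handling and the whiten() helper are hypothetical.

```python
import hashlib
import hmac


def whiten(key, raw_blocks):
    """Keyed 'whitening' of raw samples; HMAC-SHA256 stands in for AES here."""
    return [
        hmac.new(key, index.to_bytes(8, "big") + block, hashlib.sha256).digest()
        for index, block in enumerate(raw_blocks)
    ]


# A broken hardware source that returns all ones on every call.
stuck_source = [b"\xff" * 8] * 4

key = b"k" * 16  # the only unknown in the whole construction
victim_stream = whiten(key, stuck_source)

# Anyone who knows (or guesses) the key can regenerate the same stream,
# because the "random" input is a known constant.
attacker_stream = whiten(key, [b"\xff" * 8] * 4)

if __name__ == "__main__":
    print("blocks look different:", len(set(victim_stream)) == len(victim_stream))
    print("fully predictable:    ", victim_stream == attacker_stream)
```

The whitened blocks all look different from one another, yet the stream carries no more unpredictability than the key does.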

Removing the Linux /dev/random blocking pool

Posted Jan 8, 2020 16:13 UTC (Wed) by gebi (guest, #59940) [Link]

exactly, so it's still just "For all practical purposes and when using rng-tools(5) it does not matter."

Removing the Linux /dev/random blocking pool

Posted Jan 7, 2020 21:32 UTC (Tue) by dkg (subscriber, #55359) [Link]

I'm surprised to see all the commentary about GnuPG and OpenSSL reading from /dev/random.

OpenSSL has traditionally read from /dev/urandom and for over a year now, GnuPG has been reading from the getrandom syscall even for key generation.

Removing the Linux /dev/random blocking pool

Posted Jan 8, 2020 18:57 UTC (Wed) by ncultra (subscriber, #121511) [Link]

I also noticed it and felt it was unwarranted and showed not a small amount of hubris. Blaming developers for using a kernel facility that was made available to them is deflecting the problem. Ironically so, because it was the kernel (via ext4) that caused a blocking regression (DOS?) and started this work.

Removing the Linux /dev/random blocking pool

Posted Jan 15, 2020 2:17 UTC (Wed) by luto (subscriber, #39314) [Link]

This patch set actually predates the ext4 issue.

the hardware RNG on a Raspberry Pi?

Posted Jan 8, 2020 1:05 UTC (Wed) by gus3 (guest, #61103) [Link]

For a while, I had connected the RPi's /dev/hwrng to a TCP port, just for laughs, to see if I could export a better entropy for my home network. It worked, until my desktop system failed and I lost the notes I'd taken.

But this whole discussion suggests a more... *radical* possibility: a Raspberry Pi Zero, attached as a USB dongle, running dedicated kernel-level code to transmit entropy from the HWRNG to the USB bus. And on the host side, a kernel driver(?) to read the external entropy from the RPi Zero.

Yes, the security cautions are numerous. But 95Kb/s of entropy, for less than $10, seems an avenue worth exploring.

small correction

Posted Jan 8, 2020 1:36 UTC (Wed) by gus3 (guest, #61103) [Link]

95KB/s, kilo*bytes* per second, not bits.

FWIW, that's over 800 feet of randomly punched paper tape per second. ;-)

the hardware RNG on a Raspberry Pi?

Posted Jan 11, 2020 14:39 UTC (Sat) by naptastic (guest, #60139) [Link]

the hardware RNG on a Raspberry Pi?

Posted Feb 18, 2020 1:42 UTC (Tue) by ttelford (guest, #44176) [Link]

You may be interested in the “entropybroker” package in Debian/Raspbian, which lets you use one hardware RNG, and share it over a network. The documentation is not so great, though. Also: I like the bit babbler http://www.bitbabbler.org/

Removing the Linux /dev/random blocking pool

Posted Jan 8, 2020 7:40 UTC (Wed) by joib (subscriber, #8541) [Link]

How does this all now work with the jitter RNG that was recently introduced?

So this change makes /dev/random work mostly like getrandom(0). If there's not enough entropy, they will block until enough entropy has been generated (including by running the jitter RNG) to seed the CRNG? After the CRNG has been seeded, they never block.

What about /dev/urandom? It will never block (including not kicking the jitter RNG into action?), so it will seed the CRNG with whatever entropy there is? This is the same as getrandom() with the new GRND_INSECURE flag? And then presumably some protection against state extension attacks?