Random number generator updates for Linux 5.17 (kernel.org)
103 points by zx2c4 on Jan 7, 2022 | 26 comments




Do you have further plans / ideas for Linux random infrastructure enhancements? I’d love to hear about them.


The general idea is to clean things up piecemeal and gradually, until we've got a coherent design that's easy to reason about and enables us to then start working on proofs and formal verification, discussions with the academic cryptography community, and even test vectors. This initial pull request is mostly getting feet wet with fixes and low-hanging fruit, but long term I'm interested in more comprehensive improvements.

The thing is, with Linux development, these things tend to work their way in _gradually_, rather than totally radical rewrites. (If you recall, I tried the radical rewrite thing with my Zinc replacement [1] for the crypto API, which wasn't too well received.) So I'm definitely not looking to cause a ruckus, but I do really want to get this into good shape. Provably secure constructions and slow-but-steady work will hopefully be key in doing that.

So, that may not be the concrete answer you were hoping for, but that's a snapshot of my general approach.

[1] https://www.youtube.com/watch?v=a9A80i6noLw


Thank you for the contribution and for thinking about the issue as a fundamental one that has to be proven correct rather than hacking on it until it seems to work.

The use of RDRAND was especially overzealous. The problem is always finding the line between retaining entropy and performance - in this case (without having done the statistics) I believe you have done the right thing.

If I wanted to contribute an entropy mixer not based on CRC32 (as the current one is) that is much faster, includes test vectors and entropy analysis - where would be a good place to start?
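For illustration only (this is not the kernel's actual mixing code, and `toy_crc_mix` is a made-up name): a CRC32-style fold of input words into a pool, of the general kind being discussed, can be sketched like this.

```python
import zlib

def toy_crc_mix(pool: list, word: int, pos: int) -> int:
    """Toy sketch of a CRC32-based pool mixer: fold one 32-bit word
    into a small pool of 32-bit words, advancing a rotor position.
    Illustrative only -- not the kernel's mixing function."""
    i = pos % len(pool)
    # zlib.crc32(data, start) chains the CRC, so the previous pool word
    # influences the new one rather than being overwritten.
    pool[i] = zlib.crc32(word.to_bytes(4, "little"), pool[i]) & 0xFFFFFFFF
    return pos + 1
```

A faster replacement mixer would need to preserve this kind of diffusion while shipping with test vectors, which is presumably what the parent comment is offering.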


For comparison:

     A random device appeared in FreeBSD 2.2.  The implementation was changed
     to the Yarrow algorithm in FreeBSD 5.0.  In FreeBSD 11.0, the Fortuna
     algorithm was introduced as the default.  In FreeBSD 12.0, Yarrow was
     removed entirely.
* https://www.freebsd.org/cgi/man.cgi?query=random&sektion=4

* https://www.schneier.com/academic/fortuna/

Also a video presentation "A Deep Dive into FreeBSD's Kernel RNG" from vBSD 2017:

* https://papers.freebsd.org/2017/vbsdcon/gurney-a_deep_dive_i...


Specific to these improvements in Linux CRNG:

* FreeBSD generation (getrandom(), /dev/urandom, arc4random, etc) does not use RDRAND in the hot path.

* The Fortuna state hashing algorithm uses a SHA256-based construction, not SHA1 or Blake2s[1].

[1]: https://github.com/freebsd/freebsd-src/blob/main/sys/dev/ran... , https://github.com/freebsd/freebsd-src/blob/main/sys/dev/ran...


Do you have any idea about what Linux uses now, and how it differs from Fortuna?

I must say (and I say this very much as a rabid Linux fanboy) that it bothered me a bit that, for a long time, Linux's random number generator did not appear to be developed to a particular published and reviewed specification, but rather seemed to be a somewhat ad hoc affair. That may be an unfair mischaracterization, but it's how it comes across.

Does Linux's latest RNG design (which people say is finally good) conform to any mature, well-analyzed published standard? I'm not asking whether it uses standardized ciphers in particular stages, but whether the entire RNG follows a standard.


There are good reasons not to use Fortuna-as-written nowadays, so I don't fault Linux that. But it would be nice if someone familiar with the design published something like the Fortuna whitepaper, or Microsoft's Windows 10 RNG whitepaper[1].

(Why not use Fortuna as-written? The main reason is that it is not designed to scale. It's a reasonable design for the single core machines of the era when it was written, but it does not worry about contention at all. You could also swap out AES for Chacha as PRF, because: Chacha is faster to seed, faster to generate on machines without AES-NI, and doesn't have some of the theoretical problems -- discussed in the Fortuna chapter -- that AES has with its relatively small 128-bit blocks.)

[1]: https://aka.ms/win10rng
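For reference, the pool schedule described in the Fortuna chapter (pool i contributes to the n-th reseed exactly when 2^i divides n) is simple to sketch; `pools_to_use` is an illustrative name, not FreeBSD's API:

```python
def pools_to_use(reseed_count: int, num_pools: int = 32) -> list:
    """Fortuna's reseed schedule: on the n-th reseed, entropy pool i is
    drained if and only if 2**i divides n. Pool 0 is used every time;
    higher pools accumulate entropy for exponentially longer, which is
    what gives Fortuna its recovery guarantee against an attacker who
    can see or poison some of the entropy inputs."""
    n = reseed_count
    return [i for i in range(num_pools) if n % (2 ** i) == 0]
```

The schedule itself is cheap; the scaling complaint above is about lock contention on the pools and generator state, not this arithmetic.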


    random(4): Fortuna: allow increased concurrency
    Add experimental feature to increase concurrency in Fortuna.  As this
    diverges slightly from canonical Fortuna, and due to the security
    sensitivity of random(4), it is off by default.  To enable it, set the
    tunable kern.random.fortuna.concurrent_read="1".  The rest of this commit
    message describes the behavior when enabled.

    […]
* https://github.com/freebsd/freebsd-src/commit/179f62805cf05c...

    random(4): Flip default Fortuna generator over to Chacha20
    The implementation was landed in r344913 and has had some bake time (at
    least on my personal systems).  There is some discussion of the motivation
    for defaulting to this cipher as a PRF in the commit log for r344913.
    
    As documented in that commit, administrators can retain the prior (AES-ICM)
    mode of operation by setting the 'kern.random.use_chacha20_cipher' tunable
    to 0 in loader.conf(5).
    
    Approved by: csprng(delphij, markm)
    Differential Revision: https://reviews.freebsd.org/D22878
* https://github.com/freebsd/freebsd-src/commit/68b97d40fbe826...


I note that the author of the comment you're responding to and the author of these commits is one and the same, fwiw.


The haphazard design of the LRNG has bugged cryptography engineers for a while, but at this point it's also pretty well studied†. I would be careful with the assumption that something like Fortuna is a "published and reviewed specification"; Fortuna is really just a case study Ferguson and Schneier wrote in _Practical Cryptography_. It's not a standard, or the winner of some kind of RNG design competition.

There are standard RNG designs that NIST publishes, but they're low level, on the order of constructions, not whole system designs. A lot of the important details are OS-specific. It shouldn't make you more comfortable to hear that a system uses something like HMAC_DRBG; that detail doesn't tell you a whole lot about how the system as a whole works.
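To make that point concrete: the construction itself is only a few lines, which is exactly why naming it tells you little about the surrounding system. Here is a minimal, illustrative sketch of HMAC_DRBG over SHA-256, omitting the standard's reseed and personalization machinery (class and method names are mine, not from any library):

```python
import hmac
import hashlib

class HmacDrbg:
    """Minimal HMAC_DRBG (in the style of NIST SP 800-90A) over SHA-256.
    Sketch only: no reseeding, no prediction-resistance, no limits on
    bytes per request -- all things a real deployment must handle."""

    def __init__(self, seed: bytes):
        self.K = b"\x00" * 32
        self.V = b"\x01" * 32
        self._update(seed)

    def _hmac(self, key: bytes, data: bytes) -> bytes:
        return hmac.new(key, data, hashlib.sha256).digest()

    def _update(self, data: bytes = b"") -> None:
        # The core state-update: ratchet K and V forward, mixing in data.
        self.K = self._hmac(self.K, self.V + b"\x00" + data)
        self.V = self._hmac(self.K, self.V)
        if data:
            self.K = self._hmac(self.K, self.V + b"\x01" + data)
            self.V = self._hmac(self.K, self.V)

    def generate(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            self.V = self._hmac(self.K, self.V)
            out += self.V
        self._update()  # backtracking protection between requests
        return out[:n]
```

Everything interesting -- where `seed` comes from, when it is safe to call `generate`, how often to reseed -- is the OS-specific part the paragraph above is describing.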

See, for instance: https://eprint.iacr.org/2013/338.pdf --- skip to Section 5 for a pretty detailed description of the LRNG.


At which point? A quick skim of `git log --after="2013" drivers/char/random.c` turns up a few commits that seem to simply change some part of the design. They might all have been done for good reasons, as I said; it just seems pretty ad hoc.

    random: try to actively add entropy rather than passively wait for it
    random: only read from /dev/random after its pool has  received 128 bits
    random: Return nbytes filled from hw RNG
    random: mix rdrand with entropy sent in from userspace
    random: use a different mixing algorithm for add_device_randomness()
    random: add backtracking protection to the CRNG
    random: replace non-blocking pool with a Chacha20-based CRNG
    random: use an improved fast_mix() function
    random: cap the rate which the /dev/urandom pool gets reseeded
    random: account for entropy loss due to overwrites
    random: allow fractional bits to be tracked
And often very little justification or reasoning is recorded:

    random: replace non-blocking pool with a Chacha20-based CRNG
    
    The CRNG is faster, and we don't pretend to track entropy usage in the
    CRNG any more.
> I would be careful with the assumption that something like Fortuna is a "published and reviewed specification"; Fortuna is really just a case study Ferguson and Schneier wrote in _Practical Cryptography_. It's not a standard, or the winner of some kind of RNG design competition.

Well, that's basically what a standard is, in my mind. I place little weight on the initials of whichever organization publishes it; I just think a specification should exist so the design can be analyzed and examined as a whole, including by people who are not experts in C programming or don't want to wade through the Linux source.

The section of the paper you linked would be a fine standard for Linux's RNG except that it's a continually moving target.


I think the ad hoc stuff you're talking about is mostly the kind of systems design detail you're going to get plugging any CSPRNG into a kernel. In particular: there's no way to get to a point where you can just reconcile a commit against a standard to know whether it's good or not; you're going to have to do the work of following and grokking the changelog no matter what, because no "standard" for a kernel CSPRNG is going to capture all the details of safely providing randomness to a Linux (or FreeBSD) kernel.


It's not mostly that; many of the commits change internal details of the algorithms. (My post originally butchered the list of commits; I've edited it and they show up now.)

And a complete design specification can certainly address the practical details of integration into a kernel, the random number subsystem in Linux has just a small interface with the rest of the kernel that can easily be captured abstractly.


I don't know what I'm getting myself into rhetorically here. I'm not an apologist for the LRNG. My takes are just these:

1. It's not an unalloyed good thing to use "Fortuna", which is not so much a peer-reviewed standard as "a thing in a book with Bruce Schneier's name on it". I might be a little more nervous about a design that advertises being based on "Fortuna" than a random design, because the hard parts of doing a kernel CSPRNG are I think mostly not what they wrote about in that book.

2. Whether or not you have a reference design to base off of, you're still going to end up with a stream of fiddly commits determining, e.g., when the generator is seeded well enough to release random bits to callers, or the order in which raw events are fed into the generator. Those annoying fiddly bits are the challenge of maintaining a secure RNG, and there's no standard anywhere that will release you from having to pay attention to that stuff (thankfully, Jason Donenfeld is doing that thankless work for us now).

The rest of it, sure. The LRNG sucks, and is ill-specified (if relatively well studied), and has a janky history. It's in good hands now! It should not be rewritten to comply with some "standard"! But sure, the rest of your arguments, well taken, &c &c.


Fortuna was analyzed (and generalized) in https://eprint.iacr.org/2014/167. It's a solid design.


As a novice Linux (Ubuntu) user, I am curious how these rand changes will speed up a Linux distro overall, or whether there is even a way to measure this. It seems that, in addition to an overdue replacement of SHA-1, the new code brings a significant increase in performance. I would think there are a lot of places where randomness is used in the OS, and that this update would translate into performance improvements in those places. I'm not an expert and would love to hear any thoughts on this.


> As a novice Linux (Ubuntu) user, I am curious as to how these new changes for rand will speed up a Linux distro overall

Totally negligible. It speeds up benchmarks of getrandom() and /dev/urandom, which is a good thing, but if you were spending any significant CPU there, it was due to an architectural problem that this doesn't solve.
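The architectural fix in that situation is to stop making a kernel round trip per request: seed a userspace generator once and expand locally. A rough sketch of the pattern (illustrative only, with a made-up function name -- use your crypto library's DRBG for real work, not a bare hash expansion):

```python
import os
import hashlib

def userspace_stream(nbytes: int) -> bytes:
    """Fetch one 32-byte seed from the kernel, then expand it in
    userspace with SHAKE-256, instead of calling getrandom() or
    reading /dev/urandom once per request. Illustrative sketch only."""
    seed = os.urandom(32)                 # single kernel round trip
    return hashlib.shake_256(seed).digest(nbytes)
```

The kernel RNG is then only on the path once per seeding, so its per-call throughput stops mattering to the application.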

> It seems that, in addition to an overdue replacement SHA-1, the new code is a significant increase in performance.

No, the big improvement is removing RDRAND (arch_get_random_long) from the hot path, which only existed on relatively fast x86 hardware. (The only other arch with a non-empty arch_get_random_long implementation was s390, a mainframe.) SHA1 to Blake is only used periodically (~0.003 Hz) and probably not noticeable.


Encrypted file systems want to have a word with you. :)

I don't think it is hard to imagine an increase in throughput for disk encryption due to less time spent in the RNG.


I don’t believe this will make any appreciable difference for encrypted filesystems. AFAIK the popular scheme (XTS) does not heavily use the RNG.


I am stoked for updating all my systems to use Linux 5.17 now.


Headline seems overly generic: 5.17 of what?


I think they assumed the reader would know that "5.17", with (kernel.org) next to it, means the Linux kernel


Well it's the subject of the email sans the "[GIT PULL]" at the start. Maybe they just went with it for the title? It's been changed to "...for Linux 5.17" so it's clear now for those just reading the headlines.


Yeah, Linux is the name of the kernel, not some distribution.



