Sloppy APIs and getrandom


Not long ago, the getrandom system call was added to Linux with a patch from Ted Ts'o. It was included for the first time in kernel 3.17, released a few days ago, and attempts to provide a superset of the functionality of OpenBSD's getentropy system call. The purpose is to have a system call that lets you obtain high-quality random data from the kernel without opening /dev/urandom. This helps when the process is inside a chroot and protects the process from file descriptor exhaustion attacks.

If you’re a C programmer and as paranoid as I am about clean APIs and integer types, you may have noticed getrandom is a bit weird in this regard, and that getentropy is much cleaner. getrandom takes a buffer pointer and the number of requested bytes as a size_t argument. size_t is unsigned and must be able to represent the size of any object in the program. On a typical 64-bit Linux system, size_t is an unsigned 64-bit integer.

However, the return type of getrandom is a plain int (32-bit, signed) that indicates the number of bytes actually read. Out of context, such an interface is a bit sloppy: a call to getrandom could result in a short read simply because the return type cannot represent the requested size. Short reads are possible in other situations with getrandom too, but they are mostly mentioned in the context of passing a flag to read from /dev/random instead of /dev/urandom.

In context, it does make sense. If you request a very large number of random bytes, you’re going to wait a long time while they are being computed, which may catch you by surprise. So it probably doesn’t make sense to request a large number of random bytes in the first place. In fact, if you check Ted Ts'o’s patch, you’ll notice getrandom returns an error if the requested size is over 256 bytes.

The interface in OpenBSD solves all these problems right away. First off, the manual page mentions you shouldn’t be using it directly and provides references to better APIs. In any case, an error is returned if you request more than 256 bytes (minor complaint: provide a named constant for this, in case the value changes in the future). This is stated explicitly in the manual page. Otherwise, there are no short reads, and the returned integer is only used to signal errors, not to tell how many bytes were actually read. Super-clean. If you want more than 256 random bytes, you’re responsible for splitting the requests over a loop, but generally you won’t need to do so. 256 bytes is "more than enough for everybody".
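For illustration, splitting a larger request over a loop is straightforward. This is a sketch, not part of either API: getentropy_big is a made-up name, and I'm assuming a modern Linux where glibc exposes getentropy through <sys/random.h> (on OpenBSD it lives in <unistd.h>).

```c
#include <stddef.h>
#include <sys/random.h> /* getentropy() in glibc >= 2.25; <unistd.h> on OpenBSD */

/* Hypothetical helper: fill an arbitrarily large buffer by issuing
 * getentropy() requests of at most 256 bytes each, so no single call
 * exceeds the documented limit. */
int getentropy_big(void *buf, size_t nbytes)
{
        unsigned char *p = buf;

        while (nbytes > 0) {
                size_t chunk = nbytes > 256 ? 256 : nbytes;

                if (getentropy(p, chunk) != 0)
                        return -1;
                p += chunk;
                nbytes -= chunk;
        }
        return 0;
}
```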

Contrast that with getrandom. Even if you request only 4 bytes, nothing in the API tells you there won’t be a short read. In practice, the implementation will not produce short reads for requests below 256 bytes with the default behavior, sure, but the API leaves that open. So you end up having to write something like the following if you want a direct equivalent of getentropy that’s guaranteed to work well (note: coded on-the-go and not tested).

#include <errno.h>
#include <sys/random.h> /* getrandom() in modern glibc; a raw syscall in 3.17 */

int getentropy(void *buf, size_t nbytes)
{
        if (nbytes > 256) {
                errno = EIO;
                return -1;
        }

        int ret;
        size_t got = 0;

        while (got < nbytes) {
                ret = getrandom((char *)buf + got, nbytes - got, 0);
                if (ret < 0)
                        return -1;
                got += ret;
        }

        return 0;
}

In other words, the simple getentropy equivalent from Ted Ts'o will work in practice, but it’s not robust according to the proposed API.

I think it’s unfortunate that a system call may partly fail just because of API data types. The surprising part is that this happens for some common system calls too; the historical track record is not especially good. For example, take a detailed look at the standard read system call. It takes a size_t argument indicating how much to read, yet returns the amount actually read as an ssize_t, which is signed (hence the initial "s") and can only represent numbers half as big. This allows returning -1 to signal errors, but then you have to read further to discover that -1 will be returned and errno set to EINVAL if the requested size is larger than SSIZE_MAX.

The standard write system call has a similar problem. However, not every system’s manpages mention the SSIZE_MAX limitation. For example, on my Linux system the manpage for read mentions it, but the manpage for write does not. Perhaps write doesn’t have that limitation (I haven’t checked), but if you’re coding portably, the manpage won’t help you detect the problem.

In my very humble opinion, having the requested size as a size_t is very practical because it allows the following kind of code to work without casts.

struct foo a;
ssize_t ret = write(fd, &a, sizeof(a));

But would it really hurt to have one more argument to separate the error signaling from the amount of bytes written in a short write?

int write(int fd, const void *buf, size_t nbytes, size_t *wbytes);

In systems and programming languages with exceptions this is not a problem: real errors are signaled by raising exceptions and the API is not polluted, so write could return size_t just like its input argument (note: no intention to start a flamewar about exceptions vs. error codes). In the Go programming language, functions often return two values: the normal return value and an error.

End of the rant.
