Not long ago I started using Google Authenticator as a two-factor authentication mechanism for my FastMail and GitHub accounts. I tried to learn its internals beforehand and found them surprisingly simple.
I think part of my initial confusion about Google Authenticator stems from the fact that the name can refer both to a generic two-factor authentication mechanism based on the TOTP algorithm from RFC 6238 and to a specific app that implements that mechanism. The app is available at the Play Store, and it can also be built from source or obtained prebuilt from a software repository like the amazing F-Droid. In any case, it’s important to know right from the start that using Google Authenticator does not force you to have a Google account, nor does it involve data passing through Google servers. If you do use a Google account, Google Authenticator can be used to improve the security of your logins because Google has implemented two-factor authentication for their services with Google Authenticator compatibility in mind.
Other service providers can implement a two-factor authentication mechanism compatible with Google Authenticator just like Google does, and you would be able to use your phone to improve the security of the log-in mechanism.
Starting with the basics, what is two-factor authentication? The Wikipedia article does a very fine job of explaining it. In short, there are three types of factors that can be used when authenticating a user in a system. You can request from the user something only he knows, something only he has, or something only he is. Two-factor authentication means using two of those factors in the authentication process.
When you log in to a service using a username and a password, only one of those factors is in use (the password would be something that only you know).
With two-factor authentication, the service additionally requests from the user something only he is or something only he has. In the case of Google Authenticator and some other two-factor authentication mechanisms, the goal is to get an authentication code from your phone, which you normally have to type together with your password when you log in. Your phone is supposed to be an item only you have, and the code is only available from it. It’s also better if obtaining the code does not involve the computer or network you’re trying to log in from, so that the code travels over a different communication channel, if it travels over one at all.
Some two-factor authentication mechanisms send a short message with the code to your phone when you attempt to log in. Google Authenticator uses a different mechanism that doesn’t require any type of connectivity from your phone at the moment you log in.
Another typical two-factor authentication routine you may be used to, outside of the computer world, takes place when getting cash from an ATM. In that case the machine requests something only you have (the card) and something only you know (a PIN). Notice the PIN is somehow saved in the card too, so this example is indeed a bit flawed.
Google Authenticator uses a time-based factor to provide you with a numeric code. This means the clock in your cell phone must be reasonably accurate and in sync with the clock in the system you’re trying to authenticate to.
For the sake of simplicity, let’s suppose we take the number of 30-second periods that have passed since the Unix epoch (1970-01-01T00:00Z). We calculate a SHA1 sum from that number and have a predetermined way of transforming all or part of that SHA1 sum to a 6-digit numeric code. We would have an apparently random code that changes every 30 seconds.
The problem with that scheme is that anybody who knows what the current time is, which is not hard, could do the same calculations you do and get the code. We’re on the right track but we haven’t improved the security of the system yet.
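To make the weakness concrete, here is a minimal Python sketch of that keyless scheme. The function names are hypothetical, and two details are assumptions of mine because the description above leaves them open: the period count is encoded as 8 big-endian bytes, and the "predetermined way" of getting digits is simply the first 4 bytes of the sum modulo a million.

```python
import hashlib
import struct


def time_step(unix_time, period=30):
    """Number of whole `period`-second intervals since the Unix epoch."""
    return int(unix_time) // period


def naive_code(unix_time):
    """Insecure on purpose: no secret is involved, only the time."""
    counter = time_step(unix_time)
    # Assumed encoding: the counter as 8 big-endian bytes.
    digest = hashlib.sha1(struct.pack(">Q", counter)).digest()
    # Assumed transform: first 4 bytes as an integer, modulo one million.
    number = int.from_bytes(digest[:4], "big")
    return f"{number % 1_000_000:06d}"
```

Calling `naive_code(time.time())` on any two machines with synchronized clocks gives the same six digits, which is exactly the problem: an attacker's machine qualifies too.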
Now, if we mix the time (number of 30-second periods…) with a shared secret only your phone and the service provider know, we could get somewhere. Let’s say you concatenate the time and an agreed-upon 10-byte random sequence, and use that to calculate the SHA1 sum from which you derive the 6-digit code. That would almost get us where we want. The problem is, simply joining the time with the 10-byte key is bad cryptographic practice.
The cryptographically right way of doing that is calculating an HMAC (RFC 2104), a keyed hash construction designed for exactly this purpose. That’s the proper way of mixing the message (the time) with a key, and it’s what Google Authenticator uses, with SHA1 as the underlying hash. The result is still 20 bytes long, like a plain SHA1 sum, but it depends on both the shared secret and the current time.
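As a rough illustration, the HMAC step can be written with Python's standard `hmac` module. The function name `hmac_for_step` is made up for this sketch; the 8-byte big-endian encoding of the period count follows RFC 4226, on which RFC 6238 builds.

```python
import hashlib
import hmac
import struct


def hmac_for_step(secret: bytes, counter: int) -> bytes:
    """HMAC-SHA1 of the period count, keyed with the shared secret."""
    # The message is the counter encoded as 8 big-endian bytes (RFC 4226).
    message = struct.pack(">Q", counter)
    return hmac.new(secret, message, hashlib.sha1).digest()


# Example with the ASCII test secret used in the RFCs.
digest = hmac_for_step(b"12345678901234567890", 0)
print(len(digest))  # 20
```

Without the secret, knowing the time is no longer enough to reproduce the 20 bytes.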
Getting the 6-digit code
Google Authenticator follows a “simple” scheme to transform the calculated 20-byte SHA1 sum to a 6-digit code: it takes the last 4 bits of the sum and interprets them as a natural number. That number, from 0 to 15, gives an offset in bytes in the sum itself from which it takes a subsequence of 4 bytes. It clears (sets to zero) the most significant bit of those 4 bytes and interprets the rest as a natural number, most significant byte first. It calculates the remainder of that number when divided by a million and pads the result on the left with zeros until it has six digits.
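The truncation steps above translate almost line by line into Python. This is only a sketch with a hypothetical function name; the sample sum is the worked example from RFC 4226, section 5.4.

```python
def truncate(digest: bytes, digits: int = 6) -> str:
    """Turn a 20-byte HMAC-SHA1 result into a zero-padded decimal code."""
    # The last 4 bits of the sum give an offset from 0 to 15.
    offset = digest[-1] & 0x0F
    # Take 4 bytes starting at that offset.
    chunk = digest[offset:offset + 4]
    # Read them as a big-endian number and clear the most significant bit.
    number = int.from_bytes(chunk, "big") & 0x7FFFFFFF
    # Keep the remainder modulo 10**digits and pad on the left with zeros.
    return str(number % 10**digits).zfill(digits)


sample = bytes.fromhex("1f8698690e02ca16618550ef7f19da8e945b555a")
print(truncate(sample))  # 872921
```

In the sample, the last byte is 0x5a, so the offset is 10 and the 4 bytes taken are 50 ef 7f 19.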
There’s your six-digit code from 000000 to 999999 with a validity of 30 seconds. In practice, many systems accept the current, previous and next code as valid, just in case the clocks are not perfectly in sync and to account for the delays introduced by typing the code and by the network.
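Putting the pieces together, a complete generator and a tolerant verifier fit in a few lines. Treat this as a sketch rather than a reference implementation: the function names and the `window` parameter are my own, and a real service would also want rate limiting and protection against reusing a code.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-SHA1 of the counter followed by the truncation described above."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    number = int.from_bytes(digest[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(number % 10**digits).zfill(digits)


def totp(secret: bytes, unix_time=None, period: int = 30) -> str:
    """Code for the 30-second period that contains `unix_time`."""
    now = time.time() if unix_time is None else unix_time
    return hotp(secret, int(now) // period)


def verify(secret: bytes, candidate: str, unix_time=None,
           period: int = 30, window: int = 1) -> bool:
    """Accept the codes of the current period and `window` periods around it."""
    now = time.time() if unix_time is None else unix_time
    step = int(now) // period
    for delta in range(-window, window + 1):
        if step + delta < 0:
            continue  # no periods exist before the epoch
        # compare_digest avoids leaking information through timing.
        if hmac.compare_digest(hotp(secret, step + delta), candidate):
            return True
    return False


secret = b"12345678901234567890"  # ASCII test secret from RFC 6238
print(totp(secret, unix_time=59))  # 287082
```

With `window=1`, `verify(secret, "287082", unix_time=89)` still succeeds because the code belongs to the previous period, while at `unix_time=119` it is rejected.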
How to agree on the shared secret?
This is the only missing piece, and it is simple too. The secret is generated by the service provider when setting up two-factor authentication, and displayed on your computer screen as a QR code. This code contains a suggested service name and the secret key encoded as a base32 string. The code can be scanned with your phone camera and passed to the Google Authenticator app, or the key can be typed in base32 by hand. Either way, the app saves the secret under a meaningful name in your phone. The app can store secrets for any number of services.
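On the service side, the setup step could look roughly like the sketch below. It assumes the QR code carries an `otpauth://totp/...` URI, which is the key URI format commonly used with Google Authenticator; the function names, the account and the issuer are invented for the example, and rendering the URI as a QR image is left out.

```python
import base64
import secrets
from urllib.parse import parse_qs, quote, urlencode, urlparse


def new_secret(length: int = 10) -> bytes:
    """Random shared secret; 10 bytes as in the example above."""
    return secrets.token_bytes(length)


def provisioning_uri(secret: bytes, account: str, issuer: str) -> str:
    """Text to encode in the QR code (assumed otpauth key URI format)."""
    b32 = base64.b32encode(secret).decode("ascii").rstrip("=")
    label = quote(f"{issuer}:{account}")
    query = urlencode({"secret": b32, "issuer": issuer})
    return f"otpauth://totp/{label}?{query}"


def secret_from_uri(uri: str) -> bytes:
    """What the app does after scanning: recover the raw secret bytes."""
    b32 = parse_qs(urlparse(uri).query)["secret"][0]
    # Restore the '=' padding that b32decode expects.
    padded = b32 + "=" * (-len(b32) % 8)
    return base64.b32decode(padded, casefold=True)


uri = provisioning_uri(b"1234567890", "alice@example.com", "Example")
print(secret_from_uri(uri) == b"1234567890")  # True
```

A 10-byte secret is exactly 16 base32 characters, which keeps it short enough to type by hand when scanning is not an option.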
I think that covers it all. In my humble opinion, it’s quite clever but at the same time any computer-literate user can understand what’s going on behind the scenes. There’s also a PAM module that implements the Google Authenticator algorithm and allows you to use two-factor auth to log in to your Unix box.