This article is in the same spirit as the one about socialist millionaires: it aims to popularize the concepts, not to be an in-depth technical explanation. Other people have already done that very well, and the implementation is publicly available here. You should go read it if you know some C.
Also, this article only applies to the latest Android version, 6, also known as Marshmallow, which comes with encryption enabled by default.
For this article, we need to define what key stretching is, and for this, we need hash functions. A hash function is a function that maps data of arbitrary size to data of fixed size. For example, the modulo operator can be seen as a hash function:
1 % 42 = 1
1337 % 42 = 35
1234 % 42 = 16
This one will map any non-negative integer to the interval [0, 41].
For our purposes, we need a hash function with at least these four properties:
- It's easy to compute the hash of any given message
- It's infeasible in practice to retrieve the message given its hash
- It's infeasible in practice to modify a message without changing its hash
- It's infeasible in practice to find two different messages with the same hash
Our toy modulo function doesn't satisfy the last one: 1 and 43, for instance, both hash to 1.
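To make the difference concrete, here is a small Python sketch comparing the toy modulo "hash" with SHA-256 from the standard library. SHA-256 is only my pick as an example of a cryptographic hash function, and the names toy_hash and real_hash are made up for illustration.

```python
import hashlib

def toy_hash(n: int) -> int:
    """Toy hash from the text: maps a non-negative integer into 0..41."""
    return n % 42

def real_hash(message: bytes) -> str:
    """SHA-256, used here as an example of a cryptographic hash function."""
    return hashlib.sha256(message).hexdigest()

# Collisions are trivial with the toy function: 1 and 43 share a hash.
print(toy_hash(1), toy_hash(43))

# SHA-256 always returns 64 hex characters, whatever the input size,
# and a one-character change gives an unrelated-looking output.
print(real_hash(b"password"))
print(real_hash(b"passw0rd"))
```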
Back to key stretching: as a user, you don't care that much if your password verification takes 5 seconds longer, but as an attacker, having to spend 5 extra seconds per guess is very frustrating. This is exactly the goal of key stretching: make password verification expensive, both in computation time and in required memory. The one used by Android is scrypt: it (roughly) hashes the password iteratively a great number of times, and does some magic shuffling between every iteration. There is no way to parallelize this operation on a big cluster, thanks to our 4 properties, since you need the current value to compute the next one. The only thing you can do is try different password guesses in parallel; a single guess cannot be sped up by splitting it.
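The "iteratively hash" idea can be sketched in a few lines of Python. This is a deliberately naive version and not scrypt: it only shows the sequential dependency between rounds, without the memory-hungry shuffling. The function name stretch and the round count are arbitrary choices of mine.

```python
import hashlib
import time

def stretch(password: bytes, rounds: int) -> bytes:
    """Naive key stretching: hash the previous output over and over."""
    state = password
    for _ in range(rounds):
        # Each round needs the output of the previous one, so a single
        # guess cannot be split across several machines.
        state = hashlib.sha256(state).digest()
    return state

start = time.perf_counter()
key = stretch(b"correct horse", 200_000)
elapsed = time.perf_counter() - start
print(f"one guess took {elapsed:.3f}s")
```

Raising the number of rounds makes every guess slower, for the legitimate user and the attacker alike.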
It's worth mentioning that scrypt takes not only your password as an argument,
but also a salt,
which is a non-secret, per-hash-unique, long-enough™ value,
appended to your password before stretching,
aimed at defeating rainbow tables.
Without salts, two people with the same password would have the same hash.
This would also allow attackers to precompute massive tables of common passwords and their hashes,
then do a simple lookup to find the password corresponding to a given hash.
But with salts, the attacker would have to build a rainbow table per salt!
hash(admin) = "4015bc9ee91e437d90df83fb64fbbe312d9c9f05"
hash(password) = "c8fed00eb2e87f1cee8e90ebbe870c190ac3848c"
hash(passw0rd) = "e343f0ffdbc90c692ed1a4b0962fd02e52f25cf0"
hash(admin) = "4015bc9ee91e437d90df83fb64fbbe312d9c9f05"

hash(admin + iKuviej9ea7rooqu) = "63f42e39c44bf4736adeba618413a3b25d9e2a79"
hash(password + ahxohveiw9GohQuu) = "d0ceb9acdfd26158810f9949a9164cc96c6d270d"
hash(passw0rd + Nee7zah2uPh6yaer) = "79c0f381e9ca67308ec771fab5d82b9473656328"
hash(admin + ohNgur3xiok5veeb) = "d5ea26f9ab8c921f322b5c6c7e5efda42d5d6470"
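The same experiment can be reproduced with a short Python sketch. It assumes SHA-256 as the hash and 16 random bytes as the salt, which are my choices for illustration and not necessarily what produced the values above; salted_hash is a made-up helper name.

```python
import hashlib
import os

def salted_hash(password: bytes, salt: bytes) -> str:
    """Hash password + salt, mirroring the hash(password + salt) notation."""
    return hashlib.sha256(password + salt).hexdigest()

# Two users pick the same password but get different random salts.
salt_alice = os.urandom(16)
salt_bob = os.urandom(16)

hash_alice = salted_hash(b"admin", salt_alice)
hash_bob = salted_hash(b"admin", salt_bob)
print(hash_alice)
print(hash_bob)

# The salt is not secret: it is stored next to the hash, so checking a
# guess just means recomputing with the stored salt.
print(salted_hash(b"admin", salt_alice) == hash_alice)
```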
Feel free to check the scrypt webpage for a more in-depth explanation.
So, to prevent a bruteforce attack, you can either have a complex password that will take time to guess, or increase the stretching. You can of course do both, but I'm quite sure that you'd rather increase your boot delay a bit than type a 140-character password on your cellphone's keyboard ;)
The default encryption scheme is the well-known AES algorithm, with a key of at least 128 bits. This key is generated on first boot, meaning that your cellphone is encrypted by default, but with a default password, namely "default_password". The crux here is that the key used to encrypt/decrypt your data is not your password! Your password is used to encrypt/decrypt this key. This brings several advantages:
- Thanks to key stretching, you can have a password with less than 128 bits of complexity, without significantly lowering the time needed to successfully compromise your data.
- You can change your password without having to completely re-encrypt your device.
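Here is a minimal sketch of this "password encrypts the key" idea in Python. Two loud assumptions: the standard library has no AES, so a plain XOR stands in for the cipher that wraps the key, and the scrypt parameters are small arbitrary values, not the ones Android uses. The names derive_kek, xor and master_key are hypothetical.

```python
import hashlib
import os

def derive_kek(password: bytes, salt: bytes) -> bytes:
    """Stretch the password into a 16-byte key-encryption key with scrypt."""
    return hashlib.scrypt(password, salt=salt, n=2**12, r=8, p=1, dklen=16)

def xor(a: bytes, b: bytes) -> bytes:
    """Toy stand-in for AES: XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Generated once, on first boot: this is what actually encrypts the data.
master_key = os.urandom(16)
salt = os.urandom(16)

# Wrap the master key with the factory default password.
wrapped = xor(master_key, derive_kek(b"default_password", salt))

# Changing the password: unwrap with the old one, re-wrap with the new one.
# The master key, and therefore the encrypted data, is left untouched.
unwrapped = xor(wrapped, derive_kek(b"default_password", salt))
new_salt = os.urandom(16)
wrapped = xor(unwrapped, derive_kek(b"hunter2", new_salt))

print(xor(wrapped, derive_kek(b"hunter2", new_salt)) == master_key)
```

Only 16 bytes get rewritten on a password change, which is why no full re-encryption is needed.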
But even with key stretching, a 6-digit PIN could still be bruteforced in a couple of hours on a laptop. This is why your phone is (likely) using a TEE, for Trusted Execution Environment. It's like a safe room, from which cryptographic material is (supposedly) hard to extract without fancy equipment. You can ask the TEE to perform cryptographic operations, like signing things. A cryptographic signature is pretty much like a regular every-day pen-powered signature: it's a proof of provenance (the document comes from the signer), and a proof of intention (the signer agrees with the document).
This is exactly what is used to defeat off-line attacks: stretching a password on a smartphone is considerably slower than doing the same thing on a cluster of CPUs. To force an attacker to run the stretching on your device, Android first stretches your password with scrypt, asks the TEE to sign the result, then scrypts it again. Since an attacker is unable to extract from the TEE the secret needed to perform signatures, the only way to bruteforce your device is to do it on your (slow) device!
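The scrypt, sign, scrypt chain can be sketched as follows. This is a simplification under stated assumptions: an HMAC keyed with a made-up device secret stands in for the TEE signature, and the scrypt parameters are small arbitrary values. None of the names below come from the Android source.

```python
import hashlib
import hmac
import os

# Hypothetical secret that never leaves the TEE: an attacker who only has
# a copy of the storage does not know it.
_TEE_SECRET = os.urandom(32)

def tee_sign(data: bytes) -> bytes:
    """Stand-in for the TEE signature: a MAC keyed with the device secret."""
    return hmac.new(_TEE_SECRET, data, hashlib.sha256).digest()

def stretch(data: bytes, salt: bytes) -> bytes:
    """One scrypt pass with small, arbitrary parameters."""
    return hashlib.scrypt(data, salt=salt, n=2**12, r=8, p=1, dklen=32)

def derive_key(password: bytes, salt: bytes) -> bytes:
    """scrypt, then TEE signature, then scrypt again, as described above."""
    first = stretch(password, salt)
    signed = tee_sign(first)  # only possible on the device itself
    return stretch(signed, salt)

salt = os.urandom(16)
print(derive_key(b"123456", salt).hex())
```

Without the secret held by the TEE, the middle step cannot be reproduced on a fast cluster, so every guess has to go through the phone.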
Here is a recap drawing, with operations in circles and data in squares.
Things are made even more complicated for the attacker by the fact that your device will throttle guesses: enter wrong passwords too many times in a row, and your smartphone will enforce longer delays, reboot, and ultimately, if you insist too much, wipe itself.
It's worth noting that currently, the passphrase used to unlock your phone is the same as the one used to encrypt/decrypt it. It's possible to use a different one, but this feature is not (yet?) exposed to the end user.
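To get a feel for how much throttling hurts an attacker, here is a tiny Python sketch of a backoff policy. All the numbers (free attempts, base delay, cap, wipe threshold) are invented for illustration and are not Android's actual values.

```python
WIPE_AFTER = 30  # hypothetical threshold, not Android's real value

def delay_after(failures: int) -> int:
    """Seconds to wait before the next attempt (made-up policy)."""
    if failures < 5:
        return 0
    # Double the wait for every failure from the fifth on, capped at one hour.
    return min(30 * 2 ** (failures - 5), 3600)

def total_wait(max_failures: int) -> int:
    """Total time spent waiting after max_failures wrong guesses in a row."""
    return sum(delay_after(n) for n in range(1, max_failures + 1))

hours = total_wait(WIPE_AFTER) / 3600
print(f"{WIPE_AFTER} wrong guesses cost about {hours:.1f} hours of waiting")
```

Even with such a mild policy, only a handful of guesses are possible before the wipe, which is very far from the million needed to cover every 6-digit PIN.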
Of course, the whole process is a bit more complex; some parts were omitted to make this article more intelligible ;)