Title: Android encryption's resistance against bruteforce, explain it like I'm five
Date: 2015-12-22 15:10

After my [previous]( {filename}/crypto/cyanogen_crypto.md ) article,
a friend of mine asked me how a 4-digit code could be enough to protect a phone,
while a complex password is [not enough]( https://xkcd.com/936/ ) for her email account.

This article is in the same spirit as the [one about socialist
millionaires]( {filename}/crypto/millionaire_socialist.md),
aiming to popularize the concepts, not to be an in-depth technical explanation:
other people already did this [very]( http://www.nostarch.com/androidsecurity )
[well]( http://nelenkov.blogspot.fr/2014/10/revisiting-android-disk-encryption.html ),
and the implementation is publicly available
[here]( https://android.googlesource.com/platform/system/vold/+/master/cryptfs.c ).
You should go read it if you know some C.

Also, this article only applies to the latest Android version,
6.0, also known as [Marshmallow]( https://en.wikipedia.org/wiki/Android_Marshmallow ),
which comes with encryption enabled by default.

## Key stretching

For this article, we need to define what **key stretching** is, and for this,
we need the notion of a **hash function**.
A [hash function]( https://en.wikipedia.org/wiki/Cryptographic_hash_function )
is a function that maps data of arbitrary size
to data of fixed size. For example, the *modulo* operator can be seen as a hash function:

```python
1 % 42     # 1
1337 % 42  # 35
1234 % 42  # 16
```

This one will map any number to the interval `[0, 41]`.
For our purpose, we need a hash function with at least these four properties:

1. It's easy to compute the hash for any given message
2. It's practically impossible to retrieve the message given its hash
3. It's practically impossible to modify a message without changing its hash
4. It's practically impossible to find two different messages with the same hash

Our toy function *modulo* clearly doesn't satisfy the last one: `1` and `43` both map to `1`.
For the same reason, it fails the third one too.
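
To make this concrete, here is a small sketch comparing the toy function with a real cryptographic hash. The helper name `toy_hash` is made up for this illustration, and SHA-256 from Python's standard `hashlib` is only one possible choice of real hash function.

```python
import hashlib

def toy_hash(n: int) -> int:
    """The toy 'hash' from above: maps any integer to the interval [0, 41]."""
    return n % 42

# Finding two different messages with the same toy hash is trivial:
# adding 42 to a number never changes its remainder.
assert toy_hash(1) == toy_hash(43) == 1

# With a real cryptographic hash, these two inputs give unrelated digests.
print(hashlib.sha256(b"1").hexdigest())
print(hashlib.sha256(b"43").hexdigest())
```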

Back to key stretching: as a user, you don't care that much if your password
verification time is extended by 5 seconds, but as an attacker,
it frustrates you if you need to add 5 seconds per guess. This is exactly the goal of
key stretching: make the password verification expensive,
both in computation time and in required memory.
The one used by Android is [scrypt]( https://en.wikipedia.org/wiki/Scrypt ):
it (roughly) iteratively hashes the password a great number of times, and does some
magic shuffling between every iteration. There is no way to parallelize this
operation on a big cluster, thanks to our 4 properties,
since you need to get the current value to
compute the next one. The only thing that you can do is to try
different password guesses in parallel; a single guess can't be sped up this way.
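
The idea can be sketched with a toy stretching function. To be clear, this is **not** scrypt (there is no memory-hard shuffling here), only a chain of SHA-256 calls showing why one guess can't be split across machines. The names `toy_stretch`, `crack_pin` and the value of `ROUNDS` are assumptions made for the example.

```python
import hashlib

ROUNDS = 100  # hypothetical and far too low for real use; kept small so the demo runs fast

def toy_stretch(password: bytes, rounds: int = ROUNDS) -> bytes:
    """Hash the password, then hash the result, and so on.

    Round i needs the output of round i - 1, so the rounds of a single
    guess cannot be computed in parallel.
    """
    value = password
    for _ in range(rounds):
        value = hashlib.sha256(value).digest()
    return value

def crack_pin(target: bytes, rounds: int = ROUNDS):
    """Try every 4-digit PIN in order; each guess costs `rounds` hashes."""
    for guess in range(10_000):
        pin = f"{guess:04d}".encode()
        if toy_stretch(pin, rounds) == target:
            return pin.decode()
    return None

# The legitimate user pays the stretching cost once...
target = toy_stretch(b"1234")

# ...while the attacker pays it again for every single guess.
print(crack_pin(target))  # prints 1234
```

Doubling `ROUNDS` doubles the cost of a single login, but it also doubles the cost of each of the attacker's guesses.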

It's worth mentioning that *scrypt* doesn't only take your password as an argument,
but also a [salt]( https://en.wikipedia.org/wiki/Salt_(cryptography) ),
which is a non-secret per-hash-unique long-enough™ value,
appended to your password before stretching,
aimed at defeating [rainbow tables]( https://en.wikipedia.org/wiki/Rainbow_table ):
without salts, two people with the same password would have the same hash.
This would also allow attackers to precompute massive tables with common
passwords like `admin`, `password`, `passw0rd`, …
to do a simple lookup to find the value corresponding to a given hash.
But with salts, the attacker would have to build a rainbow table **per salt**!

Without salt:

```python
hash(admin)    = "4015bc9ee91e437d90df83fb64fbbe312d9c9f05"
hash(password) = "c8fed00eb2e87f1cee8e90ebbe870c190ac3848c"
hash(passw0rd) = "e343f0ffdbc90c692ed1a4b0962fd02e52f25cf0"
hash(admin)    = "4015bc9ee91e437d90df83fb64fbbe312d9c9f05"
```

With salt:

```python
hash(admin + iKuviej9ea7rooqu)    = "63f42e39c44bf4736adeba618413a3b25d9e2a79"
hash(password + ahxohveiw9GohQuu) = "d0ceb9acdfd26158810f9949a9164cc96c6d270d"
hash(passw0rd + Nee7zah2uPh6yaer) = "79c0f381e9ca67308ec771fab5d82b9473656328"
hash(admin + ohNgur3xiok5veeb)    = "d5ea26f9ab8c921f322b5c6c7e5efda42d5d6470"
```
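
If you want to play with this yourself, here is a minimal sketch using Python's standard library. The helper name `salted_hash` is made up, and SHA-256 is an arbitrary choice of hash function for the illustration, so the digests won't match the ones listed above.

```python
import hashlib
import os

def salted_hash(password: str, salt: bytes) -> str:
    """Hash the password with the salt appended to it."""
    return hashlib.sha256(password.encode() + salt).hexdigest()

# Two users who both picked "admin", each with a fresh random salt.
salt_alice = os.urandom(16)
salt_bob = os.urandom(16)

print(salted_hash("admin", salt_alice))
print(salted_hash("admin", salt_bob))  # differs from the line above

# The salt is not secret and is stored next to the hash,
# so checking a password stays easy for the legitimate system.
assert salted_hash("admin", salt_alice) == salted_hash("admin", salt_alice)
assert salted_hash("admin", salt_alice) != salted_hash("admin", salt_bob)
```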

Feel free to check the [scrypt webpage]( https://www.tarsnap.com/scrypt.html )
to read a more in-depth explanation.

## Encryption with an encrypted key

So, to prevent a bruteforce attack, you can either have a complex password that
will take time to guess, or increase the stretching. You can of course do both,
but I'm quite sure that you'd rather increase your boot delay a bit than
type a 140-character password on your cellphone's keyboard ;)

The default encryption scheme is the well-known
[AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) algorithm,
with at least a 128-bit key. This key is generated on first boot, meaning that your
cellphone is encrypted by default, but with a default password, namely
"default_password". The crux here is that the key that is used to
encrypt/decrypt your data is **not** your password! Your password is used to
encrypt/decrypt this key. This brings several advantages:

- Thanks to key stretching, you can have a password less complex than 128 bits,
  without lowering the time needed to successfully compromise your data.
- You can change your password without having to completely re-encrypt your
  device.
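
The second point can be sketched in a few lines. This is a toy under loud assumptions: Python's standard library has no AES, so a plain XOR stands in for the real cipher, and the scrypt parameters and the helper names `stretch` and `xor` are made up for the example. It assumes a Python build whose `hashlib` provides `scrypt`.

```python
import hashlib
import os

def stretch(password: str, salt: bytes) -> bytes:
    """Derive a 16-byte key from the password with scrypt (parameters are assumptions)."""
    return hashlib.scrypt(password.encode(), salt=salt, n=2**12, r=8, p=1, dklen=16)

def xor(a: bytes, b: bytes) -> bytes:
    """Toy 'cipher' standing in for AES: XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Generated once, on first boot: this is what actually encrypts the data.
master_key = os.urandom(16)

# Protect the master key with the default password.
salt = os.urandom(16)
wrapped = xor(master_key, stretch("default_password", salt))

# Changing the password: unwrap with the old one, wrap again with the new one.
unwrapped = xor(wrapped, stretch("default_password", salt))
new_salt = os.urandom(16)
new_wrapped = xor(unwrapped, stretch("1234", new_salt))

# The master key never changed, so the data doesn't need to be re-encrypted.
assert xor(new_wrapped, stretch("1234", new_salt)) == master_key
```

Only the small wrapped key is rewritten on a password change, never the gigabytes of user data.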

## Protection against off-line attacks

But even with key stretching, a 6-digit PIN could still be bruteforced in a
couple of hours on a laptop. This is why your phone is (likely) using a [TEE](
https://en.wikipedia.org/wiki/Trusted_execution_environment ), for
Trusted Execution Environment. It's like a safe room, from which
cryptographic material is (supposedly) hard to extract without fancy
equipment. You can ask the TEE to perform cryptographic operations, like
[signing]( https://en.wikipedia.org/wiki/Digital_signature ) things.
A cryptographic signature is pretty much like a regular every-day pen-powered
signature: it's a *proof* of provenance (the document comes from the signer),
and a proof of intention (the signer agrees with the document).
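
To put rough numbers on the "couple of hours" figure above, here is a back-of-the-envelope sketch. The per-guess costs are assumptions picked for the illustration, not measurements, and the function name `worst_case_hours` is made up.

```python
def worst_case_hours(digits: int, seconds_per_guess: float) -> float:
    """Time needed to try every PIN of the given length, one guess after another."""
    guesses = 10 ** digits
    return guesses * seconds_per_guess / 3600

# Assumed cost: 10 ms per stretched guess on the attacker's laptop.
print(worst_case_hours(6, 0.01))  # about 2.8 hours

# Same PIN, if each guess cost a full second instead.
print(worst_case_hours(6, 1.0))   # about 278 hours, so more than 11 days
```

The number of possible PINs is fixed, so the only lever left is the cost of each guess.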

This is exactly what is used to defeat off-line attacks:
stretching a password on a smartphone is considerably slower than doing the
same thing on a [cluster of CPUs]( http://openwall.info/wiki/HPC/Village ).
To force an attacker to run the stretching on your device,
Android first stretches your password with *scrypt*, asks the
*TEE* to sign the result, then *scrypts* it again.
Since an attacker is unable to extract the signing key from the TEE to perform signatures themselves,
the only way to bruteforce your device is to do it **on** your (slow) device!
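
The scrypt, sign, scrypt sequence can be sketched as follows. Everything here is a simplification: the `ToyTEE` class is a made-up stand-in for real hardware, an HMAC replaces the actual digital signature, the scrypt parameters are arbitrary, and reusing one salt for both scrypt calls is an assumption of the sketch. It assumes a Python build whose `hashlib` provides `scrypt`.

```python
import hashlib
import hmac
import os

class ToyTEE:
    """Stand-in for the TEE: holds a secret that is never handed out."""

    def __init__(self) -> None:
        self._device_secret = os.urandom(32)  # hypothetical hardware-bound key

    def sign(self, data: bytes) -> bytes:
        # The real TEE produces a digital signature; HMAC is a simplification.
        return hmac.new(self._device_secret, data, hashlib.sha256).digest()

def scrypt(data: bytes, salt: bytes) -> bytes:
    """scrypt with arbitrary demo parameters."""
    return hashlib.scrypt(data, salt=salt, n=2**12, r=8, p=1, dklen=32)

def derive_key(password: str, salt: bytes, tee: ToyTEE) -> bytes:
    """Stretch the password, sign the result inside the TEE, stretch again."""
    stretched = scrypt(password.encode(), salt)
    signed = tee.sign(stretched)
    return scrypt(signed, salt)

salt = os.urandom(16)
phone = ToyTEE()
attacker_cluster = ToyTEE()  # a different machine, without the phone's secret

key = derive_key("1234", salt, phone)
assert derive_key("1234", salt, phone) == key             # same device: same key
assert derive_key("1234", salt, attacker_cluster) != key  # right PIN, wrong device
```

Even with the right PIN and the salt, the attacker's cluster ends up with a different key, because the middle step depends on a secret it can't copy.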

Here is a recap drawing, with operations in circles and data in squares.

![Recap scheme of android crypto]({static}/images/android_crypto.svg)

Things are made even harder for the attacker by the fact that your device will throttle guesses:
enter wrong passwords too many times in a row, and your smartphone will enforce
longer delays, reboot, and ultimately, if you insist too much, wipe itself.
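
Here is a toy model of such a throttling policy. All the thresholds and the doubling delay are hypothetical numbers chosen for the example, not Android's actual values, and the class name `ThrottledLock` is made up.

```python
class ThrottledLock:
    """Toy model of guess throttling; every threshold here is hypothetical."""

    FREE_ATTEMPTS = 5  # failures allowed before any delay kicks in
    WIPE_AFTER = 30    # failures after which the device wipes itself

    def __init__(self, pin: str) -> None:
        self._pin = pin
        self.failures = 0
        self.wiped = False

    def delay_before_next_try(self) -> int:
        """Seconds to wait: none at first, then doubling with each extra failure."""
        extra = self.failures - self.FREE_ATTEMPTS
        return 0 if extra < 0 else 30 * 2 ** extra

    def try_unlock(self, guess: str) -> bool:
        if self.wiped:
            return False  # nothing left to unlock
        if guess == self._pin:
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.WIPE_AFTER:
            self.wiped = True
        return False

lock = ThrottledLock("1234")
for guess in ("0000", "0001", "0002", "0003", "0004"):
    lock.try_unlock(guess)
print(lock.delay_before_next_try())  # prints 30
```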

It's worth noting that currently, your
passphrase to unlock your phone is the same as the one used to
encrypt/decrypt it. It's possible to use a different one, but this feature is
not (yet?) exposed to the end user.

Of course, the whole process is a bit more complex: some parts were omitted to make this article
more intelligible ;)
