Storing Passwords in a Highly Parallelized World

Why “Use bcrypt.” is not the best recommendation (anymore).

Preamble: if you’re hashing your passwords with bcrypt / scrypt / PBKDF2 today, there’s nothing to worry about in the immediate future. This article is for you if you’re choosing a password hash today and want a future-proof solution. It is not a call to action.

The Past

Five years ago, the world was a simpler place. You could shame other people for using cryptographic hashes like SHA-1 just by yelling at them to use bcrypt and you were mostly right.

bcrypt is a password hash. Unlike general-purpose cryptographic hashes like SHA-1, it adds a computational cost to password hashing. In other words: it’s intentionally slow. The reasoning is that if someone steals the password hashes of your customers, it’s going to be much more expensive to recover the passwords (which are probably also the passwords to their email accounts) from those hashes.
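
To get a feeling for what “intentionally slow” means, here is a minimal sketch that times a single SHA-1 against a deliberately expensive derivation. bcrypt is not part of Python’s standard library, so PBKDF2 from hashlib stands in for it here; the iteration count and the sample password are arbitrary assumptions for illustration, not recommendations.

```python
import hashlib
import os
import time

PASSWORD = b"s3kr3tp4ssw0rd"  # sample input, made up for illustration
SALT = os.urandom(16)
ITERATIONS = 200_000  # assumed work factor, picked only to make the gap visible


def seconds_for(func):
    """Return how long a single call to func takes, in seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


# One fast, general-purpose hash versus one deliberately slow derivation.
fast = seconds_for(lambda: hashlib.sha1(SALT + PASSWORD).digest())
slow = seconds_for(
    lambda: hashlib.pbkdf2_hmac("sha256", PASSWORD, SALT, ITERATIONS)
)

print(f"SHA-1:  {fast * 1000:.4f} ms per guess")
print(f"PBKDF2: {slow * 1000:.4f} ms per guess")
```

The exact numbers depend on your machine, but the slow variant should cost an attacker several orders of magnitude more time per guess.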

The Present

Fast-forward to 2016.

The attackers have caught up big time. Making a password hash computationally expensive is not enough anymore because people started using GPUs and building ASICs¹: highly parallel hardware made specifically for cracking as many passwords in parallel as possible.

Thus, the next step in this arms race was to introduce an additional memory cost to hashing passwords. If every single guess needs its own sizable chunk of memory, an attacker can run far fewer guesses in parallel on the same hardware, which makes highly parallelized cracking prohibitively expensive.
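
A bit of back-of-the-envelope arithmetic shows why a memory cost hurts parallel cracking. All numbers below are assumptions chosen for illustration: a hypothetical cracking device with 8 GiB of memory, a hash with a working set of about 4 KiB, and a memory-hard hash configured for 64 MiB per guess.

```python
KIB = 1024
MIB = 1024**2
GIB = 1024**3


def parallel_guesses(device_memory: int, memory_per_hash: int) -> int:
    """Upper bound on how many hashes fit into device memory at once."""
    return device_memory // memory_per_hash


device = 8 * GIB  # hypothetical cracking device

# Small working set: memory is no obstacle to massive parallelism.
print(parallel_guesses(device, 4 * KIB))   # 2097152

# 64 MiB per guess: memory caps the number of simultaneous guesses.
print(parallel_guesses(device, 64 * MIB))  # 128
```

This ignores compute and memory bandwidth entirely; it only illustrates that memory becomes the bottleneck long before the number of cores does.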

Currently, the most popular memory-hard implementation is scrypt by the former FreeBSD Security Officer Dr. Colin Percival. Sadly, scrypt never got the attention it deserved, mostly due to the popularity of bcrypt and the NIST approval of PBKDF2 (which unfortunately is not memory-hard either).
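
For reference, scrypt is exposed by Python’s hashlib when the underlying OpenSSL supports it. The following is a minimal sketch, not a vetted configuration: the cost parameters and the helper names scrypt_hash and scrypt_verify are assumptions made for this example.

```python
import hashlib
import hmac
import os


def scrypt_hash(password: bytes, salt: bytes) -> bytes:
    # n is the CPU/memory cost, r the block size, p the parallelism.
    # The working set is roughly 128 * n * r bytes, about 16 MiB here.
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)


def scrypt_verify(password: bytes, salt: bytes, expected: bytes) -> bool:
    # Recompute with the stored salt and compare in constant time.
    return hmac.compare_digest(scrypt_hash(password, salt), expected)


salt = os.urandom(16)
stored = scrypt_hash(b"s3kr3tp4ssw0rd", salt)

print(scrypt_verify(b"s3kr3tp4ssw0rd", salt, stored))  # True
print(scrypt_verify(b"t0t411ywr0ng", salt, stored))    # False
```

Note that the salt and the parameters have to be stored next to the hash yourself; hashlib.scrypt only returns the raw derived bytes.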

The Future

In the fall of 2012, Jean-Philippe Aumasson gathered an impressive group of cryptographers and security researchers and initiated the Password Hashing Competition (PHC)². In the words of the PHC website:

“Password hashing is everywhere, from web services’ credentials storage to mobile and desktop authentication or disk encryption systems. Yet there wasn’t an established standard to fulfill the needs of modern applications and to best protect against attackers. We started the Password Hashing Competition (PHC) to solve this problem.”

In 2015, they announced the winner: Argon2.

Argon2 is a secure, memory-hard password hash. It comes in three variants (Argon2d, Argon2i, and Argon2id), but for password hashing only the side-channel hardened Argon2id is relevant. In September 2021, Argon2 was published as RFC 9106, an Informational RFC from the IRTF.

Practice

The Argon2 authors released a reference implementation in portable C with an optimized version for SSE2-enabled CPUs under the permissive CC0 license (~ Public Domain). This implementation is packaged by most operating systems and due to its license, it’s already possible to build bindings against it by vendoring it.

If you use Python you’re in luck: I’ve released CFFI bindings for the official Argon2 implementation: argon2-cffi with wheel files for all relevant Python versions and platforms.

After installing it from PyPI, all you need to do is:

>>> from argon2 import PasswordHasher
>>> ph = PasswordHasher()
>>> hash = ph.hash("s3kr3tp4ssw0rd")
>>> hash
'$argon2id$v=19$m=65536,t=3,p=4$kg9p7GpfPUDuF2+CLtTzGA$Hd0F9Ku7ib3AkNCxEz7PQqiEcLHlvL+VRXSXalgF2zE'
>>> ph.verify(hash, "s3kr3tp4ssw0rd")
True
>>> ph.verify(hash, "t0t411ywr0ng")
Traceback (most recent call last):
  ...
argon2.exceptions.VerifyMismatchError: The password does not match the supplied hash

As you can see, hash() returns a self-contained hash with all parameters (including a random salt) that can be fed into verify()³. All parameters can be set using keyword arguments when instantiating PasswordHasher.
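
To make “self-contained” concrete, here is a small sketch that takes the encoded string from the session above apart using only the standard library. The helper parse_encoded_hash is hypothetical and written for this example; it is not part of argon2-cffi, and it assumes the field layout visible in that string.

```python
import base64

ENCODED = (
    "$argon2id$v=19$m=65536,t=3,p=4"
    "$kg9p7GpfPUDuF2+CLtTzGA"
    "$Hd0F9Ku7ib3AkNCxEz7PQqiEcLHlvL+VRXSXalgF2zE"
)


def b64decode_unpadded(text: str) -> bytes:
    """Decode base64 that was stored without trailing '=' padding."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


def parse_encoded_hash(encoded: str) -> dict:
    """Split an encoded Argon2 hash into variant, costs, salt, and digest."""
    # The string starts with '$', so the first field is empty.
    _, variant, version, params, salt, digest = encoded.split("$")
    costs = dict(item.split("=") for item in params.split(","))
    return {
        "variant": variant,
        "version": int(version.split("=")[1]),
        "memory_cost": int(costs["m"]),  # in KiB
        "time_cost": int(costs["t"]),
        "parallelism": int(costs["p"]),
        "salt": b64decode_unpadded(salt),
        "digest": b64decode_unpadded(digest),
    }


parsed = parse_encoded_hash(ENCODED)
print(parsed["variant"], parsed["memory_cost"], parsed["time_cost"])
print(len(parsed["salt"]), len(parsed["digest"]))
```

Because everything needed for verification travels inside that one string, you can change your parameters later without breaking hashes that are already stored.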

If you want to build your own high-level abstractions, the argon2.low_level module is for you. It offers direct bindings to all relevant APIs:

>>> import argon2
>>> argon2.low_level.hash_secret(
...     b"secret", b"somesalt",
...     time_cost=1, memory_cost=8, parallelism=1, hash_len=64,
...     type=argon2.Type.ID
... )
(b'$argon2id$v=19$m=8,t=1,p=1$c29tZXNhbHQ$eg9zX0w3gjPikvNcKvp5eQg3kmo1oTUqYNTJT'
 b'a31PsyxQZj0oyoXjaaiy6qAT6a3AJp2QxXuGllhfGIe1/CRPw')

Finally, it also comes with a CLI that allows you to benchmark its defaults and play with the parameters:

$ python -m argon2
Running Argon2i 100 times with:
hash_len: 16
memory_cost: 512
parallelism: 2
time_cost: 2

Measuring...

0.618ms per password verification
$ python -m argon2 -t 4 -m 1024 -p 5
Running Argon2i 100 times with:
hash_len: 16
memory_cost: 1024
parallelism: 5
time_cost: 4

Measuring...

1.7ms per password verification

Please have a look at the relevant documentation on how to determine the optimal parameters for your use case.
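
One common way to approach such tuning is to fix a time budget per verification and raise the cost until a single hash reaches it. The following is a rough sketch of that idea under stated assumptions, not the procedure from the documentation: calibrate and pbkdf2_ms are hypothetical helpers, the 20 ms budget is arbitrary, and PBKDF2 from the standard library stands in for Argon2 so the example runs without extra packages. With argon2-cffi you would presumably pass a function that times a PasswordHasher built with the candidate time_cost instead.

```python
import hashlib
import time


def calibrate(measure_ms, target_ms, start=1, limit=2**20):
    """Double the cost until one hash takes at least target_ms milliseconds.

    measure_ms(cost) must return the duration of one hash in milliseconds.
    The search stops at limit so it cannot run away on fast hardware.
    """
    cost = start
    while cost < limit and measure_ms(cost) < target_ms:
        cost *= 2
    return cost


def pbkdf2_ms(iterations: int) -> float:
    """Time a single PBKDF2 run (stand-in for the hash you actually use)."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"password", b"0123456789abcdef", iterations)
    return (time.perf_counter() - start) * 1000


iterations = calibrate(pbkdf2_ms, target_ms=20, start=10_000, limit=10_000_000)
print(f"{iterations} iterations reach the 20 ms budget on this machine")
```

A single measurement is noisy; a more careful version would average several runs and measure on the hardware that will actually verify logins.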

A brief web search shows there are bindings for most other platforms too. Official Django integration has been available since the 1.10 release.

If your programming language or framework of choice is lacking an implementation, I encourage you to help out. I found the Argon2 authors most helpful while I was fighting the ancient Visual Studio 2008 so that I could offer Python 2.7 wheel files for Windows⁴. Of course, I’ll happily help you out with Python-related woes.

I’d like to close with another quote from the PHC website:

We recommend that you use Argon2 rather than legacy algorithms.

So don’t panic, but consider Argon2 and argon2-cffi the next time you choose a password hash.

¹ Also popular for mining Bitcoin.
² This used to be NIST’s job; however, their opinions have lost some of their value.
³ In case you wonder why verify() raises an exception instead of just returning False: when I released the first version of the bindings, the Argon2 library had no concept of a “wrong password” error. Therefore, an exception with the full error was raised so you could inspect what went wrong if needed. Also, IMHO a wrong password should raise an exception so that it can’t pass unnoticed by accident. Either way, it would be very dangerous to change that behavior now, even though it would be possible.
⁴ It took me five betas and the help of five people.