Cryptography in Layman's Terms
This is a post I’ve been wanting to write for a while now. I want to put down in layman’s terms what various cryptographic operations represent, and common use cases for them. When designing large systems we don’t need to get bogged down in the details of how a Diffie-Hellman handshake works. What we need are abstractions, patterns and black boxes that we can combine effectively and reason about.
With this post I hope to present terms you may or may not have heard before, each with a couple of examples: one that anyone can follow and at least one aimed at developers.
Before we go any further it's worth mentioning the old adage - never roll your own crypto. Always build on proven, audited libraries and preferably get the implementation looked at by someone who knows what they're doing. Cryptographic operations can fail (often silently) in unpredictable ways, leaving you stuck with the assumption that what you've built is secure, when it's really not.
Hashing has many, many use cases in the digital world. For our purposes it can be easiest to explain hashing as a deep check for equality. A hash function turns any input into a short, fixed-size fingerprint, and the same input always produces the same fingerprint. Given two things (files, sentences, words, images), how do we ascertain that they are identical? We compare their fingerprints. This can be hard to reason about outside of computing because two physical items are almost never perfectly equal.
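For developers, here is a minimal sketch of that equality check using SHA-256 from Python's standard library. The `fingerprint` helper and the sample strings are just illustrative names, not part of any particular system.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"The quick brown fox jumps over the lazy dog"
exact_copy = b"The quick brown fox jumps over the lazy dog"
altered = b"The quick brown fox jumps over the lazy cog"  # one letter changed

print(fingerprint(original) == fingerprint(exact_copy))  # True
print(fingerprint(original) == fingerprint(altered))     # False
```

Comparing two 64-character strings is enough to tell whether two inputs of any size are identical, which is what makes hashes so handy as a black box.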
In a system
The most common use case almost all of us will be familiar with is password hashing. When users enter their password we don’t want to store it in plaintext, because a leaked database of raw user passwords would be catastrophic. What we do instead is hash the password and store the hash. If the database is ever leaked, the attacker only gets hashes, which can’t be reversed directly. Using a salted, deliberately slow password-hashing function (bcrypt, scrypt or Argon2, for example) also makes guessing the passwords expensive.
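As a rough sketch of what that looks like in code, the example below uses PBKDF2 from Python's standard library with a random salt per user. The function names and the iteration count are my own assumptions for illustration; in a real system you would reach for an audited password-hashing library and its recommended settings.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor for this sketch; tune for your hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) to store; the password itself is never stored."""
    salt = secrets.token_bytes(16)  # unique random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Re-derive the key from the login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, stored_key)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

The salt means two users with the same password end up with different stored values, and the high iteration count slows down anyone trying to guess passwords from a leaked database.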
I don’t need to provide another example for hashing because it’s so common, however let’s briefly look at how we might leverage hashing to build a custom image caching system. When we browse user profiles on a mobile application, the application (hopefully) doesn’t download each image again every time it’s displayed. What it could do, to save bandwidth, is just send the hash of its cached image to the server. The server checks whether this hash matches the hash of the latest upload, and only sends the new image if it doesn’t match, meaning the client’s version is outdated.
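A minimal sketch of the server side of that idea might look like the following. The in-memory `LATEST_IMAGES` store and the `fetch_if_stale` function are hypothetical stand-ins for a real database and API endpoint. HTTP's ETag and If-None-Match headers work along similar lines.

```python
import hashlib
from typing import Optional

# Hypothetical in-memory store: user ID -> bytes of the latest uploaded image.
LATEST_IMAGES: dict[str, bytes] = {"alice": b"<bytes of alice's newest avatar>"}


def image_hash(image: bytes) -> str:
    """Fingerprint an image so client and server can compare versions cheaply."""
    return hashlib.sha256(image).hexdigest()


def fetch_if_stale(user_id: str, client_hash: Optional[str]) -> Optional[bytes]:
    """Return the latest image only if the client's cached copy is out of date."""
    latest = LATEST_IMAGES[user_id]
    if client_hash == image_hash(latest):
        return None  # client already has the latest image, send nothing
    return latest


cached = b"<bytes of an older avatar>"
print(fetch_if_stale("alice", image_hash(cached)) is not None)  # True, cache is stale
```

The client only uploads a short hash, and the server only sends image bytes when something has actually changed.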
Symmetric encryption is something most of us are probably familiar with, however if not, it’s simply the act of encrypting something with a key and decrypting it with that same key. If you don’t have the key, you can’t decrypt the data.
A good analogy is to liken symmetric encryption to a safe. It’s big, heavy and you know whatever’s in there is going to be private. If you’ve got the combination - awesome - you’ve got full access to it. If not, you’re out of luck and you’re gonna have to try to break it open.
In a system
As a practical example of symmetric encryption in a wider digital system, imagine you want to save data in the cloud. You’ve run out of space on your laptop and want to use Dropbox to store your data. Just sending your data to Dropbox as-is is a recipe for disaster, because anyone at Dropbox would be able to view your private information. However, if you generate a symmetric key and encrypt your data with it before sending it to Dropbox, no one without that key is going to be able to read it.
Asymmetric encryption is a really powerful concept we can model entire systems around. It all boils down to there being a different key for encryption and decryption.
As a simple example, let’s liken asymmetric encryption to a letterbox. With a letterbox anyone can put mail in it, however only people with the key can read said mail. This is at odds with symmetric encryption from our previous example, where you need the key to put anything into the box.
In a system
I’ve struggled here to just pick one example to explore, but perhaps the most relevant is encrypted communication. With asymmetric encryption you have a public key (this is what others see, like the name on your letterbox) and you have a private key (like the name suggests, this one should be kept secret - otherwise anyone could open your letterbox). When two users want to communicate securely they simply encrypt each message to the recipient’s public key. Now only the recipient, who holds the matching private key, can read it. To spell this out:
- User B (recipient) publishes their public key
- User A (sender) encrypts a message to B’s public key
- User A sends the encrypted message to B
- User B receives the message and decrypts it with their private key.
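To make the four steps above concrete, here is a toy sketch using textbook RSA with tiny primes and nothing but Python's built-in `pow`. It is deliberately insecure (tiny key, no padding) and only exists to show that the key used to encrypt differs from the key used to decrypt. The numbers and function names are assumptions chosen for illustration, and real code should use an audited library.

```python
# Toy key pair built from tiny primes. Insecure by design, illustration only.
p, q = 61, 53
n = p * q                            # 3233, shared by both keys
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent, derived from the secret primes

public_key = (n, e)    # User B publishes this
private_key = (n, d)   # User B keeps this secret


def encrypt(message: int, public_key: tuple[int, int]) -> int:
    """Anyone holding the public key can encrypt."""
    modulus, exponent = public_key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, private_key: tuple[int, int]) -> int:
    """Only the holder of the private key can decrypt."""
    modulus, exponent = private_key
    return pow(ciphertext, exponent, modulus)


message = 65                                   # must be smaller than n in this toy
ciphertext = encrypt(message, public_key)      # User A encrypts to B's public key
recovered = decrypt(ciphertext, private_key)   # User B decrypts with the private key
print(ciphertext, recovered)                   # 2790 65
```

Notice that User A never needs the private key: the letterbox slot (public key) is enough to put mail in, and only the owner's key gets it back out.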
Another great aspect of asymmetric cryptography is the ability to provide proof that a message came from someone in particular. To use the mail analogy again, this can be thought of as a stamp on the letter. You know that only Steve has the stamp “Steve’s stamp”, so you can trust any letter with this to be from him.
In a system
Digital signatures are used almost everywhere on the internet today. In the context of cryptocurrency and blockchains, every transaction you send is accompanied by a signature over the payload you’re sending - asserting that it came from you.
Let’s also take a moment to think about internal or external communication between various services you’ve written. How do you know when a request comes from your payments service that it’s really your payments service sending this message? When a user of your website does an action (buying an item, for example), how do you tie that action to the right user?
A common solution is signed payloads, JSON Web Tokens being a popular specification to follow. A session token is in many respects just a signature over some payload you’ve provided, an assertion that at some point this user authenticated with you. A basic user authentication flow might go:
- User sends you their username and password
- You verify that the password hash matches the saved hash for this username
- You generate a signed payload that includes their unique ID and potentially anything else you don’t want to have to read from the database for every request.
Now the user sends this signed payload with every request, and your server just verifies the signature and allows them access to the system based on that.
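Here is a minimal sketch of issuing and verifying such a signed payload using only Python's standard library. Note the swap: it uses HMAC-SHA256 with a server-side secret (the approach behind JWT's HS256), which is a keyed hash and not an asymmetric signature, so only the server can both create and check tokens. The token format, `SERVER_SECRET`, and the function names are assumptions for illustration, and a real token would also carry an expiry.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SERVER_SECRET = b"replace-with-a-long-random-secret"  # hypothetical; load from config


def issue_token(payload: dict) -> str:
    """Serialise the payload and append a signature computed with the server secret."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SERVER_SECRET, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + signature


def verify_token(token: str) -> Optional[dict]:
    """Return the payload if the signature checks out, otherwise None."""
    body, _, signature = token.partition(".")
    expected = hmac.new(SERVER_SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None  # payload was tampered with or signed by someone else
    return json.loads(base64.urlsafe_b64decode(body))


token = issue_token({"user_id": 42})
print(verify_token(token))  # {'user_id': 42}
```

Because the user ID lives inside the signed payload, the server can trust it on every request without a database lookup, as long as the signature verifies.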
A blind signature is one where the signer doesn’t need to know the contents of the message. This can be hard to reason about, but one simple example is an election. Alice comes in to vote and proves her identity to the officials at the location. After voting (her choice remains a secret) she hands the sealed envelope to the official, who signs and submits it. In this case the signer has no idea what choice Alice has made, however later at the vote counting station we can verify that this envelope is valid because it was signed by an official.
In a system
We could improve upon the above voting example and think about payments. Imagine we want to make a payment somewhere without the merchant knowing our identity. This is where we could leverage blind signatures to make transactions privately.
- We go to the bank and ask for a cheque for $100, which the bank signs blindly so it can’t later link that cheque back to us
- We take this cheque and send our order to a shop (anonymously), asking for $100 worth of goods
- The shop can verify this blind signature with the bank and send our goods to our specified address
- We’ve successfully completed a purchase without either the bank knowing what we’re buying, or the merchant knowing who we are.
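For the curious, the flow above can be sketched with a toy RSA blind signature in plain Python. The key is tiny and insecure on purpose, the cheque is represented by a made-up serial number, and every name here is an assumption for illustration only. Real schemes sign a hash of the message and use audited implementations.

```python
import math
import secrets

# The bank's toy RSA key (tiny primes, insecure, illustration only).
p, q = 61, 53
n, e = p * q, 17                     # public key, known to everyone
d = pow(e, -1, (p - 1) * (q - 1))    # private key, known only to the bank


def blind(message: int) -> tuple[int, int]:
    """Customer hides the message behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(n - 2) + 2
        if math.gcd(r, n) == 1:      # r must be invertible modulo n
            return (message * pow(r, e, n)) % n, r


def sign(blinded: int) -> int:
    """Bank signs the blinded value without learning the underlying message."""
    return pow(blinded, d, n)


def unblind(blinded_signature: int, r: int) -> int:
    """Customer strips the blinding factor, leaving a signature on the message."""
    return (blinded_signature * pow(r, -1, n)) % n


def verify(message: int, signature: int) -> bool:
    """Anyone (the shop, say) can check the signature with the bank's public key."""
    return pow(signature, e, n) == message % n


cheque = 1234                                    # hypothetical cheque serial number
blinded, r = blind(cheque)                       # customer, before visiting the bank
signature = unblind(sign(blinded), r)            # bank signs, customer unblinds
print(verify(cheque, signature))                 # True
```

The bank only ever sees `blinded`, yet the shop can verify the final signature against the real cheque, which is the "sealed envelope" trick from the voting example in code form.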
I’ve wracked my brain for hours trying to think of a good analogy for secret sharing, but I wasn’t able to come up with one. Essentially it comes down to (what seems like) magic: you and another party are able to agree on a secret without ever sending that secret to each other. It’s pretty cool.
What we’re talking about here are Diffie-Hellman key exchanges. They underpin almost all modern secure communication. Most of your browser traffic these days is secured via TLS (which implies some kind of key exchange between your browser and the server). However, I’d like to explore another cool use case in the context of privacy preserving applications.
Before we dive into the example, at the highest level a Diffie-Hellman key exchange works as follows:
DH(Alice's public key, Bob's private key) = shared_secret
DH(Bob's public key, Alice's private key) = shared_secret
Only Alice and Bob can compute shared_secret. Anyone who sees just the two public keys would not be able to figure it out.
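The two DH(...) lines above can be sketched in a few lines of Python using classic modular-exponentiation Diffie-Hellman. The modulus and base below are toy parameters I picked for readability and are far too weak for real use. Production systems use vetted groups or curves (X25519, for example) through an audited library.

```python
import secrets

# Toy public parameters: a Mersenne prime modulus and a small base.
# Assumed for illustration only; not safe for real key exchange.
P = 2**127 - 1
G = 3


def generate_keypair() -> tuple[int, int]:
    """Return (private_key, public_key)."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def dh(their_public: int, my_private: int) -> int:
    """Combine the other party's public key with our own private key."""
    return pow(their_public, my_private, P)


alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()

# Each side only ever sees the other's public key, yet both get the same value.
print(dh(alice_public, bob_private) == dh(bob_public, alice_private))  # True
```

Only the public keys cross the wire. The shared value itself is never transmitted, which is what makes this feel like magic.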
A use case
Let’s explore how we could build a privacy preserving messaging platform using a centralised database. At its most basic (and totally open) a message between two parties would look like this:
| Bob | Alice | Hey! What’s up? |
This is how most applications have been built for the last twenty years. Maybe data is encrypted at rest, but in my experience working at software companies encryption at rest means nothing. Every employee still has access to this data on demand, and in the worst case it’s a Jira ticket away.
Step one in our privacy preserving journey is to encrypt the data swapped between parties: Alice encrypts messages to Bob’s public key and vice-versa. The stored row still names Bob and Alice, but the message itself is now ciphertext, so we as the providers of this service can’t see the data being swapped.
However here we can unfortunately still tell that Alice and Bob are communicating. What could we do to hide the communication between these two?
One solution I’ll propose - which also has the benefit of sender privacy, so we’re killing two birds with one stone here - would be to generate a shared secret and hash it with the recipient’s identifier. Now messages just have a payload and a to property. The to property is HASH(shared_secret, recipient), so a message from Alice to Bob is stored under HASH(alice_bob_shared_secret, Bob).
Remember only Alice and Bob can generate this shared secret, and the HASH function is some irreversible function that hides its content. Now whenever Bob wants to find messages from Alice he computes HASH(alice_bob_shared_secret, Bob) and queries all messages that match that value.
As you can see, we now only know that these two users are using our platform, but not who they’re communicating with. Their privacy has been preserved!
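To round this off, here is a small sketch of the HASH(alice_bob_shared_secret, Bob) lookup. I've assumed HMAC-SHA256 as the HASH function, a list of dicts as the "database", and placeholder byte strings for the shared secrets and ciphertexts. All of those are illustrative choices, not a prescribed design.

```python
import hashlib
import hmac


def routing_tag(shared_secret: bytes, recipient_id: str) -> str:
    """HASH(shared_secret, recipient): the opaque value stored in a message's `to` field."""
    return hmac.new(shared_secret, recipient_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholders standing in for the outputs of two separate key exchanges.
alice_bob_shared_secret = b"\x01" * 32
carol_bob_shared_secret = b"\x02" * 32

# What the central database sees: opaque tags and encrypted payloads, no names.
messages = [
    {"to": routing_tag(alice_bob_shared_secret, "Bob"), "payload": b"<ciphertext from Alice>"},
    {"to": routing_tag(carol_bob_shared_secret, "Bob"), "payload": b"<ciphertext from Carol>"},
    {"to": routing_tag(alice_bob_shared_secret, "Alice"), "payload": b"<ciphertext from Bob>"},
]

# Bob recomputes the tag for his conversation with Alice and queries for it.
wanted = routing_tag(alice_bob_shared_secret, "Bob")
from_alice = [m for m in messages if m["to"] == wanted]
print(len(from_alice))  # 1
```

Without the shared secret, the service operator can't tell which tags belong to the same pair of users, or even that two tags are addressed to the same recipient.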