What is a birthday attack, and how to evade it?
A birthday attack is a cryptographic attack that uses the mathematics behind the birthday problem to crack ciphertext. Cybercriminals use birthday attacks to trick systems by cracking digital authentication methods.
Below, you will find the birthday problem description, also known as the birthday paradox. Understanding how this paradox fosters cryptographic attacks will reveal ways to dodge them.
What is the birthday problem?
The birthday problem is a problem in probability theory. It asks for the probability that at least two will share a birthday in a random group of people.
It is also widely known as the birthday paradox because the results it gets are counterintuitive for most people. For example, let’s say we have a randomly selected group of 23 people. What are the odds that at least two of them share a birthday?
As a clickbait headline would say - “the answer will shock you!”. It is over 50%.
And with 70 people in the group, the probability of at least one shared birthday is over 99%.
Although, at first glance, these odds seem weird, there is nothing paradoxical about them. They come from plain probability calculations.
The result surprises us because we lose focus on what the problem actually asks about a group of 23 people.
We might think about the odds of two people in the group being born on a particular day, like October 6th. Or, we might imagine ourselves in the group and think, what are the odds that someone else has our birthday? Very low.
But that is different from what the problem asks. The problem asks what are the odds of a birthday collision of any kind. Which people collide or on what date does not matter.
With the growing number of people in the group, the chances of avoiding any collisions decrease exponentially.
What is the birthday attack in cybersecurity?
A birthday attack is a brute force attack that exploits the exponentially growing probability of collision. In cybersecurity, collision attacks aim to find a clash of the hash function outcomes.
The goal of the birthday attack is often to gain system access by forging security certificates or cracking passwords.
Cryptographic hash function
Cryptographic hash function (CHF), or simply cryptographic hash, is a mathematical function that typically converts variable-size input to fixed-size output.
In other words, cryptographic hash performs operations on the provided plaintext to transform it into ciphertext. This output is also known as the hash value.
Ideally, the hash function should have three main properties:
- It should be easy to use the hash function on any given input.
- It should be impossible to invert it. One should not be able to use the hash function on the output to trace back to the input.
- It should be collision-free, meaning two slightly different inputs should not produce the same output.
These are the ideal properties of the hash function. But, as we shall see, reality does not always conform to the ideals.
Birthday-type collision
Having a theoretical chance to avoid hash value collisions would require infinite possible outputs. However, there are two important limits to what hash functions can produce.
- Fixed length. Cipher text cannot go on forever. It has to have a finite length. Most hash functions have one fixed length for all outputs. Even having variable length would only enlarge the set of possible outcomes but not make it infinite.
- A limited set of characters. Some encryptions use numbers from 0 to 9, while others - the letters of the alphabet. Some might combine the two and add more symbols. Whatever the set of characters is, it will always have limits to be usable.
Thus, any hash function can produce only a limited number of different ciphertexts. Just like every person will have a birthday from the set of possible birthdays.
Unlike outputs, inputs can be of any length. In other words, one might want to encrypt very short or very long messages or anything in between. This makes the set of possible inputs larger than the set of possible outputs. Thus, some inputs will inevitably produce the same hash values.
Analogically, due to there being more people in the world than days in a year, some necessarily share a birthday.
Finding the collision
Generating all possible outputs to find out how each input is coded would take years. Hackers have learned from the birthday problem that finding two colliding results of different inputs is much easier.
In order to find the collision, hackers will still have to play the numbers game. But rather than concentrating on generating all the outputs, they will concentrate on finding two identical ones:
- The hacker will use a computer program to repeatedly run the hash function on randomly or pseudorandomly chosen inputs.
- The program stores every input-output pair in some database.
- The application checks every output in the database until it finds a collision of the same outputs with different inputs.
- The hacker can potentially use the hash collision to trick a system into treating two different messages as the same.
What can hackers hack with birthday attacks?
Hackers often target digital signatures with birthday attacks. After finding the collision in how digitally signed documents are encrypted, hackers can tamper with them.
Judging by the encryption, the system would find the contract legit even if its details were changed after signing it.
Additionally, hackers can potentially crack all block ciphers that use small block sizes. Small block size means a small set of possible hash values, thus more colliding values.
A 64-bit block size makes even otherwise robust Blowfish encryption vulnerable to birthday attacks. The same goes for all algorithms that use 64-bit or smaller cipher blocks.
How to protect yourself from birthday attacks?
In order to protect yourself and your data from birthday attacks, use the following best practices:
- Use encryption methods with more potential hash values. For example, the AES-256 algorithm uses 256-bit-size blocks. It gives it a broad set of different hash values. Finding a collision in such a set would require either more lifetimes or more computing power than hackers have.
- Use a two-factor authentication for all digital signatures.
- Encrypt the message with multiple hash functions. It highly reduces the probability of cracking the message, as the hacker would need to find multiple unrelated collisions.
Taking one of these steps can make the odds of a successful birthday attack too small to be worth trying.