# The Birthday Attack – From Probability to Cryptography

Instead, we are going to discuss a cryptographic attack that exploits the math behind the birthday problem, named the ‘Birthday Attack’ which sounds just as much fun.

### The Birthday Problem

The birthday problem is concerned with the likelihood that some pair of people in a group of N randomly picked people would share the same birthdate.

The birthday problem is often called ‘The birthday paradox’ since it produces a surprising result — A group of 23 people has a more than 50% chance of having a common birthdate, whereas a group of 70 has a 99.9% chance of having a shared birthday.

Let’s go over the math quickly.
Say we randomly take one person out of the group of N people. this person obviously has a birthdate which we will notate with A.
Now, let’s pick another person, with birthdate B. what is the probability the A is different from B?

To find the probability that these two people share a birthday we need to calculate 1-P, which is 0.0027.

Let’s take another step and try to calculate the probability of choosing randomly 23 people so that no one of them shares a birthdate.

Similarly, to find the probability of a shared birthdate we need to calculate 1-P which is 50.7.

We can generalize this with some combinatorics trickery, namely, the pigeonhole principle

### The Birthday Attack

A birthday attack is a type of cryptographic attack which exploits the mathematics underlying the birthday problem in probability theory.

As explained in the birthday problem, the attack is based on a fixed number of permutations (pigeonholes) and the greater chance of collisions detected between random attack attempts.

To better understand what all of this means, let’s get familiar with hash functions.

A hash function is any function that can be used to map data of arbitrary size to fixed-size values.
For example, given a hash function H, the mapping can look like
H(“hello”) = 7a
H(“pasta”) = b9
H(“goodbye”) = 7a

As we can see, the inputs “hello” and “goodbye” got the same hash — which is how a hash collision is defined.

The birthday attack finds two different messages m₁, m₂, such that H(m₁) = H(m₂), namely a hash collision between two messages.

This is where the birthday problem discussed earlier comes in handy — we can think of a hash collision as of two people that are having the same birthdate.
Applying the results we had with the birthday problem, we can get that given 23 messages, there is a 50% that 2 of them share a hash.

#### Why do we care if two messages share a hash?

Whenever you make a password for a website, this password is encrypted into a hash that stores and identifies the combination of characters you input.

For example, given the password “12345678” — which is my bank account password (joking), MD5(message-digest algorithm 5) encryption algorithm would output a 32 hexadecimal characters hash: 25d55ad283aa400af464c76d713c07ad (you can play around with MD5 with this online free tool)

The attack’s goal was to find and force a hash collision for any two values.

If there was indeed a collision, we could change information only in one of the messages and affect both of them.

It is worth mentioning that hash algorithms nowadays are not vulnerable to this kind of attack since it is not feasible computationally.

### Shameless Self-Promotion

• My girlfriend and I recently opened a greeting card shop together.
In case a loved one has a birthday soon or you need some Christmas cards in advance, we got you covered.
• Don’t forget to subscribe with your email to get the latest posts to your inbox