Birthday Paradox (lab)

Homework review

The Birthday Paradox

The Birthday Paradox is a classic of counting and probability, because it's so darn surprising. It's a paradox not because it's logically contradictory, but because the true answer is so different from the "intuitive" answer.

The Birthday Problem: How many people do you need to gather together to have a 50-50 chance that two of them will share a birthday?

The intuitive answer: 365/2 or about 180.

The actual answer: 23!!! Those exclamation points are for surprise, not factorial.

Why? Let's first compute the probability that no one shares a birthday, as a function of the number of people.

N      prob
2      (365 x 364)/365^2
3      (365 x 364 x 363)/365^3
4      (365 x 364 x 363 x 362)/365^4
...    ...
N      365!/(365-N)!/365^N
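
As a cross-check on the table, the general formula can be evaluated directly in Python, whose integers do not overflow. This is only a sketch and not part of the original lab: the names prob_no_shared and smallest_group are made up for illustration, and it assumes 365 equally likely birthdays (no leap years).

```python
import math

DAYS = 365  # assumption: 365 equally likely birthdays, no leap years

def prob_no_shared(n):
    """Probability that n people all have different birthdays: 365!/(365-n)!/365^n."""
    # math.perm(365, n) is 365 x 364 x ... x (365-n+1), i.e. 365!/(365-n)!
    return math.perm(DAYS, n) / DAYS ** n

def smallest_group(target):
    """Smallest n for which the chance of at least one shared birthday reaches target."""
    n = 1
    while 1 - prob_no_shared(n) < target:
        n += 1
    return n

print(smallest_group(0.5))  # 23
```

The chance of a shared birthday is one minus the "no one shares" probability, which is why the loop tests 1 - prob_no_shared(n).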

Big Numbers

It's tempting to just type that kind of formula into Excel and see what it does. Alas, it gives you a mysterious error:

#NUM!

Q: By trial and error, find the largest value that Excel can compute the factorial of. If you're clever, you can find this value in fewer than 15 guesses; fewer still, if you're lucky.

What's happening? There is a limit to how big the numbers that Excel uses can get. Unfortunately, this means that if the number we are computing is the quotient of two big numbers, we often can't compute the big numbers, even if the quotient is quite reasonable.
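
The same limit appears in any program that stores numbers as double-precision floats. Here is a small Python sketch of the problem and one way around it, assuming Excel's numbers behave like Python floats. The function names naive_quotient and stepwise_quotient are hypothetical, chosen just for this illustration.

```python
import math

def naive_quotient(n, k):
    """n!/(n-k)! computed by forming both factorials as floats first."""
    # float() raises OverflowError once a factorial exceeds the float range
    return float(math.factorial(n)) / float(math.factorial(n - k))

def stepwise_quotient(n, k):
    """The same quotient, built one factor at a time so it never gets huge."""
    result = 1.0
    for i in range(k):
        result *= n - i
    return result

try:
    naive_quotient(365, 3)
except OverflowError as err:
    print("naive approach failed:", err)

print(stepwise_quotient(365, 3))  # 365 x 364 x 363
```

The stepwise version never forms a number bigger than the answer itself, so it works whenever the quotient is of a reasonable size.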

Q: Give two ways to compute 365 x 364 x 363 x ... x 355, one of which works and the other of which doesn't.

Computing the Probabilities in the Birthday Problem

Set up a table like this:

A    B      C
1    365    =product($b$1:b1)/power(365,A1)
2    364    =product($b$1:b2)/power(365,A2)
3    363    =product($b$1:b3)/power(365,A3)
4    362    =product($b$1:b4)/power(365,A4)

Take a minute to see how that formula works. The idea is that the product function computes the product of all the cells in a range. This range is defined to be from B1 (absolute, so it doesn't change when we copy/paste it) to the current row. Since the first row is row 1, it's a product from B1 to B1, which is just 365. For the second row, it's the product from B1 to B2, which is 365*364. And so on.
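
If you want a second opinion on the spreadsheet's numbers, the same running-product idea can be sketched in Python. This is an illustration only: birthday_table is a made-up name, and the tuples it returns simply mirror columns A, B and C.

```python
from itertools import accumulate
from operator import mul

def birthday_table(rows):
    """Return (A, B, C) tuples mirroring the spreadsheet columns."""
    col_a = range(1, rows + 1)           # number of people
    col_b = [366 - a for a in col_a]     # 365, 364, 363, ...
    running = accumulate(col_b, mul)     # product of B1 through the current row
    return [(a, b, prod / 365 ** a) for a, b, prod in zip(col_a, col_b, running)]

for row in birthday_table(4):
    print(row)
```

Here accumulate plays the role of product($b$1:bN): each step multiplies in one more cell of column B.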

Type in by hand some formulas for the birthday probabilities and check these values. Make sure you can trust the numbers you're getting.

Using copy/paste, increase this table so that it's at least 50 rows long.

Q: How many people do you need for the probability of no repeats to be less than 75 percent?

Q: How many people do you need for the probability of no repeats to be less than 50 percent?

Q: How many people do you need for the probability of no repeats to be less than 30 percent?

Q: How many people do you need for the probability of no repeats to be less than 10 percent?

Q: With 50 people gathered together, what is the chance that none of them will share a birthday?

Amazing, isn't it?

StarLogo Birthday Simulation

Here's a StarLogo simulation of the Birthday Paradox.

birthday.slogo

Download it to your desktop, start StarLogo, and open the simulation. Try it.

Play around with it. Does it look random to you?
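
If you don't have StarLogo handy, the same experiment can be sketched in a few lines of Python. This is not the code from birthday.slogo; it is a separate, minimal Monte Carlo sketch that assumes 365 equally likely birthdays, and the names has_shared_birthday and estimate_shared are invented here.

```python
import random

def has_shared_birthday(n, rng):
    """Draw n random birthdays and report whether any two coincide."""
    birthdays = [rng.randrange(365) for _ in range(n)]
    return len(set(birthdays)) < n

def estimate_shared(n, trials=20000, seed=1):
    """Fraction of simulated groups of n people with at least one shared birthday."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = sum(has_shared_birthday(n, rng) for _ in range(trials))
    return hits / trials

print(estimate_shared(23))
```

With 23 people the estimate should land close to one half, though the exact figure depends on the seed and the number of trials.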

Poker

Many of you already know how to play poker. If not, here's a crash course that omits most of the game:

So, what we're really interested in is the value of different hands. There are a number of web sites that discuss how poker hands are valued. This is a good web site about poker. Essentially, the more likely a hand is, the less valuable; rare hands are more valuable. Therefore, we want to compute the probability of different poker hands.

The denominator of all of our probability calculations is the number of poker hands. That number is combin(52,5). Why?

Now, let's count hands:

  1. Straight Flush: consecutive cards, all of one suit. To count these, realize that once you choose the suit and the top card of the straight flush, everything else is determined. There are 4 ways to choose the suit and 10 ways to choose the top card, and these are independent choices, so there are 4*10 or 40 ways total.

    =4*10

  2. Four of a kind: four cards of the same rank, and one other card. There are 13 possible ranks, and then there are 48 choices for the other card. Therefore, the total is 13*48.

    =13*48

  3. Full house: three cards of one rank and two of another. Okay, this is hard, so take a deep breath. You have to choose two ranks. There are 13 ways to choose the triple and 12 ways to choose the pair, for a total of 13*12 ways to make those choices. For the triple, there are combin(4,3) ways to choose the suits that they have. For the pair, you have combin(4,2) ways to choose the suits they have. Therefore, the total number of ways is

    =13*12*combin(4,3)*combin(4,2)

  4. Flush: all cards of the same suit. There are four ways to choose the suit. Once you've chosen the suit, you have combin(13,5) ways to choose the 5 cards. However, 10 of those choices are straights, and we don't want to count straight flushes here, so subtract them. Therefore, the answer is:

    =4*(combin(13,5)-10)

  5. Straight: five cards of consecutive rank, not all of one suit. There are 10 ways to choose the high card of your straight; all the other ranks are forced. There are power(4,5) ways to choose the suits for the 5 cards in your straight. However, four of those put all five cards in one suit, which makes a straight flush, so subtract those off. Thus, the total number of ways is:

    =10*(power(4,5)-4)

  6. Three of a kind: one triple and two different cards. There are 13 choices for the rank of the triple, and combin(4,3) ways to choose the suits for those cards. There are then combin(12,2) ways to choose the remaining two ranks, with four possibilities for the suit of each. Thus:

    =13*combin(4,3)*combin(12,2)*4*4

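  7. Two pair: two cards of one rank, two of another, and a fifth card of a third rank. There are combin(13,2) ways to choose the two ranks, and combin(4,2) ways to choose the suits for each. The fifth card can be any of the 44 cards that match neither pair's rank (52 minus the 8 cards of those two ranks).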

    =combin(13,2)*combin(4,2)*combin(4,2)*44

  8. One pair. There are 13 ways to choose the rank of the pair, and combin(4,2) ways to choose the suits, and then there are combin(12,3) ways to choose the other three ranks and power(4,3) ways to choose the suits for the remaining cards:

    =13*combin(4,2)*combin(12,3)*power(4,3)

  9. Nothing: five different ranks that form neither a straight nor a flush. There are combin(13,5) ways to choose the five different ranks, but 10 of those are straights, so subtract them. There are power(4,5) ways to choose the suits for the five cards, but four of those are flushes, so subtract those. Multiply these two quantities.

    = (combin(13,5)-10)*(power(4,5)-4)

Build a spreadsheet to compute these possibilities. Total them. Compare that with the combin(52,5) possible poker hands. Compute the probabilities of each hand.
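
To check your spreadsheet, the nine formulas above can also be evaluated in Python, where math.comb stands in for Excel's combin and ** for power. This is a sketch for cross-checking only; the dictionary name counts and the layout of the printout are arbitrary choices.

```python
from math import comb

counts = {
    "straight flush":  4 * 10,
    "four of a kind":  13 * 48,
    "full house":      13 * 12 * comb(4, 3) * comb(4, 2),
    "flush":           4 * (comb(13, 5) - 10),
    "straight":        10 * (4 ** 5 - 4),
    "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4 * 4,
    "two pair":        comb(13, 2) * comb(4, 2) * comb(4, 2) * 44,
    "one pair":        13 * comb(4, 2) * comb(12, 3) * 4 ** 3,
    "nothing":         (comb(13, 5) - 10) * (4 ** 5 - 4),
}

total_hands = comb(52, 5)
# The nine categories are disjoint and cover every hand, so they must add up.
assert sum(counts.values()) == total_hands

for name, count in counts.items():
    print(f"{name:16s}{count:>9d}  {count / total_hands:.6f}")
```

If the assertion ever fails after you change a formula, some hands are being counted twice or not at all.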

Using Simulation

The previous development is difficult and error-prone. A somewhat better way is simulation. Look at the following model and see if you can determine how it works. The code in the equation block is a doozy, so take your time and skim.

cards-sort.mox
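
For comparison, here is a rough Python sketch of the same idea: deal many random hands, classify each one, and tally the results. It is not a translation of cards-sort.mox. The card encoding, the function names classify and simulate, and the default trial count are all assumptions made for this example.

```python
import random
from collections import Counter

def classify(hand):
    """Name the poker hand for five distinct cards numbered 0..51.

    Encoding (an assumption of this sketch): rank = card % 13 with 0 = two
    and 12 = ace; suit = card // 13.
    """
    ranks = sorted(card % 13 for card in hand)
    flush = len({card // 13 for card in hand}) == 1
    groups = sorted(Counter(ranks).values(), reverse=True)  # [3, 2] is a full house
    # Five distinct consecutive ranks, or the ace-low straight A-2-3-4-5
    straight = len(groups) == 5 and (
        ranks[4] - ranks[0] == 4 or ranks == [0, 1, 2, 3, 12]
    )
    if straight and flush:
        return "straight flush"
    if groups[0] == 4:
        return "four of a kind"
    if groups == [3, 2]:
        return "full house"
    if flush:
        return "flush"
    if straight:
        return "straight"
    if groups[0] == 3:
        return "three of a kind"
    if groups == [2, 2, 1]:
        return "two pair"
    if groups[0] == 2:
        return "one pair"
    return "nothing"

def simulate(trials=20000, seed=1):
    """Deal random five-card hands and return the observed frequency of each type."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    tally = Counter(classify(rng.sample(range(52), 5)) for _ in range(trials))
    return {name: n / trials for name, n in tally.items()}

for name, freq in sorted(simulate().items(), key=lambda item: -item[1]):
    print(f"{name:16s}{freq:.4f}")
```

Notice that the rarest hands may not show up at all in a run of this size, which is worth keeping in mind for the question below.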

Q: What are some disadvantages of using simulation instead of calculations to determine the probability of various hands?

We'll talk about this together.

Generating Hands

Hand Distribution


Answers

This work is licensed under a Creative Commons License.