If you've never played blackjack, well, this is not the time to learn. The initial deal is two cards and the goal is to have a total close to 21, but not more. Each number card counts the number on it, face cards count for 10, and an Ace counts as either 1 or 11. So an Ace and a card worth 10 is an instant winner, often called a "natural."
In a recent homework, you computed the probability of different pairs of cards from a "well-shuffled deck." The calculations are difficult, and a conceptual error can get you way off. Let's look at using simulation to approach it differently.
Download the following model and examine it.
blackjack-pairs.mox
Here's an overview of the idea: We generate pairs of numbers from 0-51, each representing a card. We send them to an equation block that outputs a 0 if the cards are the same (since that can't happen in real blackjack). The equation also checks to see if they are a pair of spades (in which case it outputs a 1), or a pair of aces (in which case it outputs a 2). On any other hand, it outputs a 5.
A key aspect of this model is representation. Somehow, numbers from 0-51 have to mean particular cards. We could just sit down with a deck of cards and number them, and that's essentially what we're doing. But, there are a couple of tricks that make that numbering easier.
The first trick is that, in this language, if we divide a number X by 13, we get the integer quotient. In other words, 13/13 is 1, and so is 14/13 and 15/13, all the way up to 25/13, but 26/13 is 2.
So, if we take all the numbers from 0-51 and divide each by 13, we will get 13 zeros, 13 ones, 13 twos, and 13 threes. Therefore, if the card is number X, we take X/13 to mean (represent) its suit. For convenience, let's assume that:
The second trick is that, in this language, we can also find out the integer remainder when we divide. To do that, we use the percent symbol. That is, X%13 gives the remainder when we divide X by 13. For example, 13%13 is 0, 14%13 is 1, 15%13 is 2, all the way up to 25%13 which is 12, and 26%13 is 0 again.
So, if we take all the numbers from 0-51 and divide each by 13, we will get remainders of 0, 1, 2 ... 12. Therefore, if the card is number X, we take X%13 to mean (represent) its rank. Let's assume that:
Okay, now we're ready to look at the code in the Equation block. The Extend people call their programming language "ModL" (for modeling, ha!). This is not a course in programming, but since a little programming is useful in doing simulation and modeling, let's talk just a little about some of the constructs you see: The Extend language is based on C, so if you know any C, it's a lot easier.
if( some condition ) {
stuff to do when the condition is true
} else {
stuff to do when the condition is false
}
Q: Thoroughly examine the model. Look at how the histogram is defined, including the number of bins and their range. Configure the simulation to do just a few dozen simulation steps. Add an Plotter I/O so that you can look at the two card values and the output of the Equation block. Once you understand that, get rid of the Plotter I/O and configure the simulation to do 1000 or 10,000 steps. What is the probability that a pair of cards is rejected? What is the theoretical probability? How well do they agree?
Q: What is the theoretical probability of a pair of spades? What does the simulation show?
Q: What is the theoretical probability of a pair of aces? What does the simulation show?
Q: What are some disadvantages of using simulation instead of calculations to determine the probability of various hands?
There's a problem with estimating the probability of a pair of spades in blackjack by counting the number of pairs of spades by simulation. The problem is those 1.9 percent of hands that are rejected.
Suppose we run the simulation for 1000 steps and get exactly 20 rejections. That means we dealt 980 legal blackjack hands. Suppose our histogram shows that we got 57 pairs of spades. The proper estimate of the probability should then be 57/980, not 57/1000. That is, the denominator of our estimate should be the number of legal hands, not the number of steps.
Q: Go re-compute your estimate of the probability of a pair of spades and pair of aces. Is it closer to the theoretical value?
There's actually a way to understand this using conditional probability:
Pr( pair | legal ) = Pr( pair AND legal ) / Pr( legal )
In our example,
Pr( pair AND legal ) = 57/1000
Pr( legal ) = 980/1000
Here's another variation on the theme. The probability of a legal hand is just 1 minus the probability of the rejected hands. So, we have:
Pr( pair | legal ) = Pr( pair AND legal ) / (1-Pr( reject ))
Pr( pair | legal ) = (57/1000) / (1-(20/1000))
Let's generalize. Let N be the number of trials, C be the count of the number of hands we're looking at (pairs of spades or whatever), and R be the number of rejected hands. The estimated probability is
P = (C/N) / (1-R/N)
= (C/N)/((N-R)/N)
= C/(N-R)
Q: Build a simulation to determine the probability of a hand consisting of a King and a Queen. Compute the theoretical probability and compare.
Q: Build a simulation that uses the A%13+2 representation of a card's rank and use it to determine the probability of a hand consisting of a King and a Queen. Do you like this code better or worse?
Q: Build a simulation to determine the probability of a "natural." (A hand consisting of an Ace and a card worth 10.) Compute the theoretical probability and compare.
Poker is more complicated than blackjack. The purpose of this section, then, is twofold. First, the theoretical calculations are more complicated, so that's good practice. If you don't understand them, ask! Secondly, the ModL code is more complicated, so you can see a more complex example of what we did with the blackjack example. Also, the code introduces a few new concepts. If we don't have time to get to this in lab today, it's not a tragedy. Still, if you're interested and looking for more challenge, I encourage you to look at this stuff.
Many of you already know how to play poker. If not, here's a crash course that omits most of the game:
So, what we're really interested in is the value of different hands. There are a number of web sites that discuss how poker hands are valued. This is a good web site about poker. Essentially, the more likely a hand is, the less valuable it is; rare hands are more valuable. Therefore, we want to compute the probability of different poker hands.
Q: The denominator of all of our probability calculations is the number of poker hands. That number is combin(52,5). Why?
Now, let's count hands:
=4*10
=13*48
=13*12*combin(4,3)*combin(4,2)
=4*(combin(13,5)-10)
=10*(power(4,5)-4)
=13*combin(4,3)*combin(12,2)*4*4
=combin(13,2)*combin(4,2)*combin(4,2)*44
=13*combin(4,2)*combin(12,3)*power(4,3)
= (combin(13,5)-10)*(power(4,5)-4)
Q: Build a spreadsheet to compute these possibilities. Total them. Compare that with the combin(52,5) possible poker hands. Compute the probabilities of each hand.
Here's a simulation model that estimates poker probabilities. It's like the blackjack models, but just more complicated. Don't worry if you don't understand every detail. Just try for a high-level understanding.
poker.mox
Generally, what the code does is to first compute the rank and suit of each card. Then it stores these into a 2 by 5 array. The array is then sorted by rank. The reason we sort is because sorting makes it much easier to check to see if there are ties (pairs, four of a kind), straights, and so forth. Next, we compute whether it's a flush and whether it's a straight and the number of matches, since those are building blocks below. Finally, we go through the different kinds of hands, outputing a code whenever a kind is recognized.
The code in the equation block is a doozy, so take your time. First, just guess what things mean and try to get the overall sense of the code. I'm happy to look in more detail at it with you.
There are some new things in the Poker code. Let's look at them:
The theoretical probability is the probability that the first card matches the second card, which is 1/52. That works out to 0.01923077 or about 1.9 percent. When I ran the simulation for 1000 steps, I got 12 rejections, which isn't a very good agreement. Then I ran it again and got 19 rejections, which is spot on. YMMV.
The theoretical probability is (13*12)/(52*51) = 0.05882353. When I ran the simulation for 1000 steps, I got 56 spade pairs, so that seems like a probability of 5.6 percent. (But we'll amend this in the next section.)
The theoretical probability is (4*3)/(52*51) = 0.00452489, so in 1,000 legal hands, you should get 4 or 5 of them. When I ran the simulation for 1000 steps, I got 5. (But, again, we'll amend this probability in the next section.)
There are basically two disadvantages. The first is that, just as it's hard to be sure your theoretical calculations are right, it can be hard to determine that your program (the equation in the equation block) is correct. That's why it's good to use both methods; they can confirm one another.
The second disadvantage is that if something is very unlikely (such as a pair of aces or a straight-flush in poker), you may have to generate millions and millions of hands before your probability estimate is reliable.
For example, the probability of a pair of aces is 0.00452489. Adjusted for the simulation, the probability is 0.00443787. That means that in 1 million trials, we expect to get 4438 pairs of aces. (Move the decimal point 6 places.) What if we get 4446? That's only 8 extra. (How likely is that? We'll compute that later in the course.) Now there's an error in the third digit of our probability estimate. Maybe not a big deal, but in some applications it might be. So we might need to do 10 million or 100 million trials to really nail it down.
For a more extreme example, consider the probability of a straight flush in poker, which is theoretically 0.0000154. Adjusting for simulation, that's 0.0000126248. That means out of a million hands, you expect to get about 12 straight flushes. If, by chance, we get 13 or 14 or 11, our probability estimate becomes 0.000013 or 0.000014 or 0.000011 percent, which is significantly different from the theoretical probability. In general, estimating the probability of very rare events requires generating zillions of samples, so that the law of large numbers can work for us.
Q: Go re-compute your estimate of the probability of a pair of spades and pair of aces. Is it closer to the theoretical value?
The theoretical probability are unchanged: (13*12)/(52*51)=0.05882353 and (4*3)/(52*51) = 0.00452489, so in 100,000 legal hands, you should get about 5,882 and 4,525 of them, respectively. But we're not generating legal hands! Instead, we're generating pairs of numbers, 1.9 percent of which are rejected.
Instead, in 100,000 trials, you should get about 5,764 and 4,434 of them. How did I get those numbers? The probability of a legal hand is about 98 percent. 5764 is about 98 percent of 5,882, and 4,434 is about 98 percent of 4,525. That's because:
Pr(aces and legal) = Pr(aces | legal)*Pr(legal)
Here's a simulation model that does that.
blackjack-kq.moxThe theoretical probability is 2*(13*12)/(52*51). The factor of two out front is because we have to count both KQ (first card K, second card Q) and QK (the other order).
Here's one that does:
blackjack-kq+2.moxThe code is almost exactly the same. Maybe it's a little clearer, but I don't know if it's worth the effort.
Here's such a model:
blackjack-natural.moxThe theoretical probability is 2*(4*16)/(52*51). The two out front is because we don't care about order. There are 4 aces and 16 cards worth 10 (4 kings, 4 queens, 4 jacks and 4 tens).
That's because you're choosing 5 cards from a deck of 52, and you don't care about the order in which you get the cards.
Here's a spreadsheet that has a bunch of poker probabilities:
poker-probs.xls