CS199 Probability Rules

Mathematical Expectation

(See section 4.5)

The notion of expected value comes from gambling games: how much do you expect to win, given the probabilities of the different outcomes and their values in dollars? Your book rightly emphasizes that this is very different from the outcome you expect; the expected value need not even be one of the possible outcomes.

Definition: The expected value is the sum of the values of the outcomes, each multiplied by its probability. Using our previous terminology, it's a weighted sum in which the probabilities are the weights:

EV = v1*p1 + v2*p2 + ... + vN*pN
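Here is a minimal Python sketch of that weighted sum. The function name expected_value and the three-outcome game are made up for illustration; they don't come from the book.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Return the sum of value * probability over (value, probability) pairs."""
    total_probability = sum(p for _, p in outcomes)
    if total_probability != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in outcomes)

# Hypothetical game: win $10 with probability 1/10, win $2 with
# probability 3/10, and lose $1 otherwise (probability 6/10).
game = [(10, Fraction(1, 10)), (2, Fraction(3, 10)), (-1, Fraction(6, 10))]
print(expected_value(game))  # 1
```

Notice that the expected value here is $1, even though no single play of this game ever pays exactly $1.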

Q: If a coin turns up heads, I'll pay you a dollar. If it comes up tails, you pay me a quarter. What's the expected value of the game? Equivalently, how much would you be willing to pay to play this game with me?

Q: The house will pay you as many dollars as there are dots showing on a six-sided die. How much should they charge to play the game? If they charge $4 to play, how much profit do they make?

Q: What's the expected value of playing black (or red) in roulette, if there are 36 red/black numbers, plus green zero and green double-zero? Assume you're betting one dollar and the house is paying even odds.

Q: A friend working at Microsoft is offered 1-year options on the stock for a price of $28.00. One analyst suggests that there's a 90 percent chance that Microsoft will be trading at $30 in a year, against a 10 percent chance it'll be down. What is the value of the options?

Decision Problems

(See section 4.6)

If we have to decide between two (or more) choices, one way is to compute the expected value of each choice and pick the one with the largest expected value.
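As a sketch of this criterion, the code below computes the expected value of each choice and returns the largest. The function best_choice and the umbrella numbers are hypothetical, chosen only to show the mechanics.

```python
def best_choice(payoffs, probabilities):
    """Return (choice, expected value) for the choice with the largest EV.

    payoffs maps each choice to {state: dollar value};
    probabilities maps each state to its probability.
    """
    expected = {
        choice: sum(probabilities[state] * value
                    for state, value in by_state.items())
        for choice, by_state in payoffs.items()
    }
    winner = max(expected, key=expected.get)
    return winner, expected[winner]

# Hypothetical numbers: should you carry an umbrella today?
probabilities = {"rain": 0.25, "dry": 0.75}
payoffs = {
    "umbrella": {"rain": -1, "dry": -2},
    "no umbrella": {"rain": -20, "dry": 0},
}
print(best_choice(payoffs, probabilities))  # ('umbrella', -1.75)
```

Both expected values are negative here, so "largest" means "least bad": -1.75 beats -5.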

Q: You estimate that the probability of rain on your wedding day is 20 percent. You have to decide whether to hold the ceremony indoors or outdoors. What with ruined gowns and unhappy guests, costs of renting a hall versus a tent, great sunshiny photos and all that, you decide to put dollar figures on the outcome as follows:

weather    indoors    outdoors
rain         -1000       -3000
sun          -1500        -200

Given these numbers, which should you choose?

What other decision criteria can you think of?

Sample Space

(See section 5.1)

Mathematics often seems to be about numbers, but the mathematics of probability is about events: what's the probability of various outcomes given a description of the set of all possible outcomes. The sample space is this set of all possible outcomes. Examples:

These are mostly intuitive, but sometimes things are tricky. For example, the days this month on which you have homework due might seem to be sampled from the set of numbers {1, ..., 28}. Actually, the sample space is the set of all subsets of {1, ..., 28}. That's because the possible outcomes are:

and so on.

Events

(see section 5.2)

Another thing that isn't so intuitive is that events are defined as

any subset of the sample space, including the whole space and the empty set

Let's take (American) roulette as an example. The sample space is

{Green 0, Green 00, Red 1, Black 2, ... Red 36 }

Clearly, the ball lands in exactly one of these. But those aren't the events. The events include all of those plus:

And those are just the bets you can make in roulette. In general, you could bet on "prime" or "my favorite numbers" or any set of numbers at all. These are all events.

Since events are defined using sets, you can specify an event using the set operations of union, intersection, complement and so forth.
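Python's built-in sets make this concrete. The sketch below assumes the standard American wheel layout for the red pockets and assumes all 38 pockets are equally likely; the helper name probability is ours.

```python
from fractions import Fraction

# American roulette pockets; "0" and "00" are the two green pockets.
numbers = set(range(1, 37))
sample_space = numbers | {"0", "00"}

# Red pockets on a standard wheel (an assumption about the layout;
# the notes only list Red 1, Black 2, ..., Red 36).
red = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}
black = numbers - red
even = {n for n in numbers if n % 2 == 0}

def probability(event):
    """Probability of an event, assuming all 38 pockets are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

red_or_even = red | even        # union: at least one of the two bets wins
red_and_even = red & even       # intersection: both bets win
not_red = sample_space - red    # complement: black or green

print(probability(red))           # 9/19
print(probability(red_or_even))   # 14/19
print(probability(red_and_even))  # 4/19
```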

Venn diagrams can be helpful sometimes.

Q: If someone bets on both black and odd, what are the chances that he'll lose all his money?

Probability Rules

We can assign a number to each event. If the numbers satisfy the following rules, they can serve as the probabilities of the events. The rules only say which assignments are permitted; they don't tell us which permitted assignment is the correct one.

  1. Real numbers between 0 and 1, inclusive. 0 <= P(A) <= 1
  2. The event that certainly occurs has probability 1; an event that cannot occur has probability 0.
  3. If two events are mutually exclusive (their Venn diagrams don't overlap), then the probability that either occurs is the sum of their individual probabilities.
  4. The probability of an event and its complement sum to one: P(A)+P(A')=1
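One way to get numbers that obey these rules is to give each outcome in the sample space a weight and define the probability of an event as the sum of the weights of its outcomes. The sketch below checks such an assignment; the names is_valid_assignment and event_probability, and the three example assignments, are illustrative assumptions.

```python
from fractions import Fraction

def is_valid_assignment(weights):
    """Each outcome weight lies in [0, 1] and the weights sum to 1."""
    in_range = all(0 <= p <= 1 for p in weights.values())
    return in_range and sum(weights.values()) == 1

def event_probability(event, weights):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum(weights[outcome] for outcome in event)

fair = {face: Fraction(1, 6) for face in range(1, 7)}
loaded = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
          4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}
broken = {"heads": Fraction(3, 5), "tails": Fraction(3, 5)}

print(is_valid_assignment(fair))    # True
print(is_valid_assignment(loaded))  # True: permitted, though not a fair die
print(is_valid_assignment(broken))  # False: the weights sum to 6/5

# Rule 4 on the fair die: an event and its complement sum to one.
low = {1, 2}
print(event_probability(low, fair) + event_probability({3, 4, 5, 6}, fair))  # 1
```

Both the fair and the loaded die pass the check. The rules permit either one; only evidence about the actual die can tell us which is right.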

Let's take some examples:

Odds

(See section 5.4)

You'll sometimes hear people quote odds rather than probabilities. Suppose the odds are 3:2. That means for every 3 chances to win, there are 2 chances to lose. Even odds are 1:1.

If the odds are a:b, the probability of winning is a/(a+b) and the probability of losing is b/(a+b).
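The conversion runs in both directions. Here is a small sketch; the function names are ours, and the odds are taken to be odds in favor of winning, as in the 3:2 example.

```python
from fractions import Fraction

def odds_to_probability(a, b):
    """Odds of a:b in favor -> probability of winning, a / (a + b)."""
    return Fraction(a, a + b)

def probability_to_odds(p):
    """Probability of winning -> odds (a, b) in lowest terms."""
    p = Fraction(p)
    return p.numerator, p.denominator - p.numerator

print(odds_to_probability(3, 2))            # 3/5
print(probability_to_odds(Fraction(3, 5)))  # (3, 2)
```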

Q: If the probability of winning is 18/36, what are the odds?

Addition Rule

(See section 5.5)

The addition rule is fairly intuitive if you keep the Venn diagrams and the definitions of probability in your mind:

If k events are mutually exclusive, the probability that at least one of them occurs is just the sum of their individual probabilities:
Pr(A1 U A2 U ... U Ak) = Pr(A1)+Pr(A2)+...+Pr(Ak)

Q: If the probability that someone buys a Ford is 10 percent and the probability that someone buys a GM is 15 percent, what's the probability that someone buys either a Ford or a GM car?

Q: If the probability that someone takes CS110 is 5 percent and the probability that someone takes CS111 is 2 percent, what's the probability that someone takes either of those two courses?

The General Addition Rule

We specify mutually exclusive events because we don't want to double-count the outcomes that lie in the intersection of the two events. But if we know the probability of that intersection, we can just subtract the extra count:

Pr(A union B) = Pr(A) + Pr(B) - Pr(A intersect B)

There are even fancier rules for when you have three or more events, but this will do fine for us.
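We can check the general addition rule by brute force on a small sample space. This sketch assumes one roll of a fair six-sided die; the two events are made up for the purpose.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair six-sided die

def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

even = {2, 4, 6}
at_least_five = {5, 6}

direct = pr(even | at_least_five)
by_rule = pr(even) + pr(at_least_five) - pr(even & at_least_five)
naive = pr(even) + pr(at_least_five)  # double-counts the outcome 6

print(direct, by_rule, naive)  # 2/3 2/3 5/6
```

The naive sum is too big by exactly Pr(A intersect B) = 1/6, which is the double-counted outcome.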

Q: If the probability that someone takes CS110 is 5 percent and the probability that someone takes CS111 is 2 percent, and the probability of someone taking both is 0.1 percent, what's the probability that someone takes either of those two courses?

Conditional Probability

(see section 5.6)

Conditional probability is not intuitive, at least at first, and maybe never. Still, it has some important applications, so it's worth laboring over. Let's start with an example:

Q: Suppose 51 percent of Americans are women (and 49 percent are men). Suppose that 90 percent of American women want to have kids and 80 percent of American men want to have kids:

And so on. (I thought of this example because of a Boston Globe Magazine article, 2/22/2004, about people who don't want kids. It's pretty interesting, particularly the parts about anti-kid zoning restrictions and building codes. I recommend the article! I made up all the stats in this example, though.)

Let's try to understand that with the following table:

Americans    Women    Men    Margins
Kids
No Kids
Margins         51     49        100

Let's figure out the Venn diagram here and fill in all the empty cells.

Once we understand the Venn diagram, we can compute some conditional probabilities. Let's define the following events:

Our original information is the following formulas:

This is based on the following definition of conditional probability (for brevity, let's notate A intersection B as AB):

Pr(A|B) = Pr(AB) / Pr(B)

This is read as "The probability of A, given B, is ..." What it means is the probability of event A if we know (or stipulate) that event B occurs.
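To see the definition at work on something small, here is a sketch using one roll of a fair die instead of the survey example. The events and the helper names pr and pr_given are our own.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # one roll of a fair six-sided die

def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def pr_given(a, b):
    """Pr(A|B) = Pr(AB) / Pr(B); undefined when Pr(B) is 0."""
    if pr(b) == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return pr(a & b) / pr(b)

even = {2, 4, 6}
more_than_three = {4, 5, 6}

print(pr(even))                         # 1/2
print(pr_given(even, more_than_three))  # 2/3
```

Knowing that the roll is more than three shrinks the sample space to {4, 5, 6}, and two of those three outcomes are even.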

Some other things we can compute are:

and so on.

Independent Events

(see section 5.7)

If two events, A and B, are independent, then

Pr(AB) = Pr(A)*Pr(B)
Pr(A|B)Pr(B) = Pr(A)*Pr(B)
Pr(A|B) = Pr(A)

The concept of independence is a very important one. Essentially, it means:

If A and B are independent, the probability of A is not affected by the occurrence or non-occurrence of B.
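Statistical independence can be tested directly from the product formula. The sketch below assumes two fair dice; the events and the helper independent are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
sample_space = set(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

def independent(a, b):
    """True when Pr(AB) equals Pr(A) * Pr(B)."""
    return pr(a & b) == pr(a) * pr(b)

first_even = {roll for roll in sample_space if roll[0] % 2 == 0}
second_six = {roll for roll in sample_space if roll[1] == 6}
total_is_twelve = {roll for roll in sample_space if sum(roll) == 12}

print(independent(first_even, second_six))       # True
print(independent(second_six, total_is_twelve))  # False
```

The first pair involves different dice, so neither event tells us anything about the other. The second pair fails the test: a total of twelve is only possible when the second die shows a six.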

Examples:

However, we have to distinguish between causal independence and statistical independence. The latter is just about the probability statements, particularly the one that says the conditional probability is the same as the "prior" probability. The former is a harder statement to make because it says a lot about the world and the way it works. However, we can say that if two events are causally independent, they will certainly be statistically independent. The converse isn't necessarily true: two events might be statistically independent just "by coincidence."

This work is licensed under a Creative Commons License.