CS199 Fallacies

Hot Hands

Many sports fans believe in the phenomenon of "streaks." In basketball, this is called "hot hands": a player gets "hot" and can make all kinds of great shots until the streak ends. Suddenly, for a while, almost everything goes in. There are similar beliefs about other sports, too. Commentators in tennis, my favorite sport, are constantly talking about someone "lifting their game."

Why does this happen? A sudden burst of confidence? Confidence is indeed important in sport, and there's no reason to think that the probability of making a shot in basketball or tennis is always the same, in the way that a coin is always 50-50. Nevertheless, it might be that a streak is just a phenomenon of probability: if you're tossing a coin, occasional streaks are a normal and expected thing.

So, which is it? Is the streak because the person really is hot, or is it a probability thing?

This belief matters, too, because if the athletes and coaches believe in hot hands (and they do), they'll be more likely to feed the ball to the "hot shooter." The "hot shooter" may also be more likely to take risky or low-percentage shots.

Here are some web sites well worth reading:

The Gambler's Fallacy

Many gamblers rely on a theory that is a variation of the law of large numbers. Here's the theory:

If an outcome (number, color, or whatever) hasn't come up in a while, it's more likely than usual. That's because it has to come up in order to even things out for the long-run behavior.

This makes sense, because we know that if we are going to toss a coin 10 times, it's likely to come up heads about 5 times. If the coin comes up heads in all of the first 5 tosses, it has to be less likely to come up heads in the next 5, just to even things out. Right?

The Law of Large Numbers and Pattern Perception

These theories are contradictory, of course. Hot hands says that an outcome becomes more likely to repeat during a run, and the gambler's fallacy says that it becomes less likely. Which is it?

For gambling games, like dice or roulette, we know that the probabilities are independent, so the probability is neither higher nor lower; it's just the same as always.

But what about the law of large numbers? Take our example of the coins: after 5 heads in a row, doesn't the law say that tails has to come up more often in the next 5 tosses? First of all, 10 isn't big enough for the law to apply. If you've gotten 5 heads in a row, consider the next 100 tosses, not the next 5. Suppose you get 50 heads in the next 100, so you actually got 55 out of 105. The frequency is then 52.4 percent, which is much closer to 50 percent than the 100 percent you started with. If we considered the next 1000, we'd be even closer.

In short, the law of large numbers isn't trying to "compensate" for earlier imbalances. Instead, it's just going to swamp those imbalances with lots of balanced outcomes.

The solution is dilution.
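The dilution arithmetic is easy to check by machine. Here is a minimal Python sketch, assuming (as in the example above) a starting streak of 5 heads followed by further tosses of which exactly half are heads. The function name heads_frequency is made up for this illustration.

```python
from fractions import Fraction

def heads_frequency(extra_tosses: int, initial_heads: int = 5) -> Fraction:
    """Overall heads frequency after a streak of initial_heads heads,
    assuming exactly half of the extra tosses come up heads."""
    heads = initial_heads + extra_tosses // 2
    total = initial_heads + extra_tosses
    return Fraction(heads, total)

# The early streak is never "paid back"; it is just outnumbered.
for n in (0, 10, 100, 1000, 10000):
    freq = heads_frequency(n)
    print(f"{n:>6} more tosses: heads frequency {float(freq):.4f}")
```

Notice that the surplus of 5 heads never goes away. It simply becomes a smaller and smaller fraction of the total.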

What about the "hot hands" idea? We're on thinner ice here. Since we're dealing with athletes, most of whom are human beings, there might be something to this. Confidence is a real factor, and a bit of good or bad luck can swing things from one competitor to another. So, maybe there's something to it.

Or maybe not. Human beings are remarkably good at seeing patterns. It's a wonderful ability. Unfortunately, we can see patterns that aren't there.

Perceptions of Randomness

For this part of the lab, I want us to focus on sequences that are random and exercise our intuitions. In particular, I want us to investigate the "hot hands" idea. Maybe it's just a case of a sequence of random events looking non-random.

Q: Assuming a fair coin and 10 tosses, which sequence is more likely:

HTHTHTHTHT
HHHHHTTTTT

Q: Assuming a fair coin and 10 tosses, which sequence is more likely:

HTHTHTHTHT
HHHHHHHHHH

Q: What would you guess is the probability of getting a run of 5 heads in a row if you toss a coin 100 times? Let's make this multiple choice:

  1. pretty unlikely (less than 10 percent)
  2. unlikely, but not strange (less than 50 percent)
  3. likely (more than 50 percent)
  4. quite likely (more than 90 percent)

A Model

A negative binomial waiting for one success can give you some insight into run probabilities, since the outcome is by definition the length of the run of failures preceding the success.
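A negative binomial waiting for one success is the same thing as a geometric distribution: the chance of exactly k failures before the first success is (1-p)^k * p. The Python sketch below is only an illustration of that idea, not the Extend model. The function names and the seed are my own choices, and it assumes a fair coin (p = 0.5).

```python
import random
from collections import Counter

def failures_before_success(rng: random.Random, p: float = 0.5) -> int:
    """Count the failures that occur before the first success."""
    count = 0
    while rng.random() >= p:  # a draw below p counts as a success
        count += 1
    return count

def run_length_frequencies(trials: int = 100_000, p: float = 0.5, seed: int = 0) -> dict:
    """Simulated relative frequency of each run length of failures."""
    rng = random.Random(seed)
    counts = Counter(failures_before_success(rng, p) for _ in range(trials))
    return {k: counts[k] / trials for k in sorted(counts)}

freqs = run_length_frequencies()
for k in range(6):
    exact = (1 - 0.5) ** k * 0.5
    print(f"run of {k} failures: simulated {freqs.get(k, 0.0):.4f}, exact {exact:.4f}")
```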

Q: First, build a model that just shows sequences of random 0/1 numbers, using Plotter I/O. Look at a few sequences of 100. (I suggest changing the plotter so that the plot is like a cityscape and not a mountainscape.) What's the longest streak you see? How likely does a streak of 5 seem?

Q: Think about how to configure a negative binomial to get insight into this question. Or, better yet, talk about it with someone else in the class. Use your ideas in an Extend model to estimate the probability above. We'll talk about this as a class, too.

A Simulation to Count Runs

Conceptually, building a simulation to answer this question is fairly easy. If we were doing this by hand, here's what we would do:

  1. Initialize a counter of the length of a run to zero
  2. Start a loop over the following steps:
    1. Compute a random 0/1 integer, where 0=tails and 1=heads
    2. If the integer is the same as the previous one from the last time around this loop, increment the counter.
    3. If it's different, add the run counter to a histogram, exit the loop and go back to step 1.

If we were all programmers, we could turn that algorithm into a program. There are a few technical issues, such as what to do on the first time through the inner loop (when there is no previous number), but otherwise it's straightforward.
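For the programmers among us, here is one way the algorithm might look in Python. Treat it as a sketch, not a translation of the Extend model: the function name run_lengths and the seed are my own choices. It handles the "first time through" issue by starting with no previous toss, and it starts each run's counter at 1 so that the histogram records the length of the run itself.

```python
import random
from collections import Counter

def run_lengths(tosses):
    """Return the lengths of the maximal runs of identical values."""
    lengths = []
    previous = None  # no previous toss on the first pass
    count = 0
    for toss in tosses:
        if toss == previous:
            count += 1  # same as last time: the run continues
        else:
            if previous is not None:
                lengths.append(count)  # the run just ended: record it
            previous = toss
            count = 1  # the new run starts with this toss
    if previous is not None:
        lengths.append(count)  # record the final run
    return lengths

rng = random.Random(199)  # arbitrary seed, so the output is repeatable
tosses = [rng.randint(0, 1) for _ in range(100)]  # 0=tails, 1=heads
histogram = Counter(run_lengths(tosses))
for length in sorted(histogram):
    print(f"runs of length {length}: {histogram[length]}")
```

Try a few different seeds and look at the longest run each time.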

However, building this model in Extend isn't so easy. This is a fairly common problem with commercial software: it's just not as flexible as we'd like. Here are some issues and solutions for modeling this in Extend:

Q: Build a draft model that follows this algorithm. Look at the values with a Plotter I/O block. Look at a lot of information, so that it starts to make sense. When you feel like you have a good draft model, take a look at mine and investigate it. We'll talk collectively about the various weirdnesses in mine. Note that I left in some of the "scaffolding" that I put in while building the model, because you can learn more about how I build things by seeing the scaffolding as well as the model.


Solutions

Q: Assuming a fair coin and 10 tosses, which sequence is more likely:

HTHTHTHTHT
HHHHHTTTTT

They are equally likely! Each has a probability of 2^(-10), or 1/1024. The first looks "more random," doesn't it?

Q: Assuming a fair coin and 10 tosses, which sequence is more likely:

HTHTHTHTHT
HHHHHHHHHH

Again, they are equally likely! Each has a probability of 2^(-10), or 1/1024. We "know" that it's more likely that you'll get 5 heads than 10, but that's because there are lots of different sequences that have 5 heads (10 choose 5 of them), while there is only one sequence with 10.
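The counting argument can be checked in a few lines of Python. This sketch assumes a fair coin and uses the standard library's math.comb; the variable names are just for illustration.

```python
from math import comb

n = 10
p_specific = 0.5 ** n          # probability of any one particular sequence
ways_five_heads = comb(n, 5)   # number of sequences with exactly 5 heads
ways_ten_heads = comb(n, 10)   # number of sequences with exactly 10 heads

p_five_heads = ways_five_heads * p_specific
p_ten_heads = ways_ten_heads * p_specific

print(f"any single sequence: {p_specific}")
print(f"exactly 5 heads: {ways_five_heads} sequences, probability {p_five_heads}")
print(f"exactly 10 heads: {ways_ten_heads} sequence, probability {p_ten_heads}")
```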

Q: What would you guess is the probability of getting a run of 5 heads in a row if you toss a coin 100 times?

Here are two possible models:

run-length1c.mox
run-length2.mox

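I wasn't sure myself, but it turns out to happen more than 90 percent of the time when a streak of either heads or tails counts, which is what the run-counting algorithm above measures. For heads alone, the probability is closer to 80 percent.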
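If you'd like to check the answer without Extend, the probability can also be computed exactly by tracking the length of the current run, toss by toss. The Python sketch below is my own illustration, not a description of the .mox models. The function name prob_streak and its parameters are hypothetical, and it assumes a fair coin.

```python
def prob_streak(n_tosses: int = 100, k: int = 5, either_face: bool = False) -> float:
    """Exact probability that n_tosses fair tosses contain a streak of at least k.

    either_face=False counts only streaks of heads;
    either_face=True counts streaks of heads or of tails.
    """
    if k < 2 or n_tosses < 1:
        raise ValueError("this sketch assumes k >= 2 and n_tosses >= 1")
    # state[r] = probability of no streak so far, with a current run of length r
    state = [0.0] * k
    if either_face:
        state[1] = 1.0            # the first toss always starts a run of length 1
        remaining = n_tosses - 1
        reset = 1                 # breaking a run starts a new run of length 1
    else:
        state[0] = 1.0            # no heads yet
        remaining = n_tosses
        reset = 0                 # tails sets the run of heads back to 0
    for _ in range(remaining):
        new_state = [0.0] * k
        for r, p in enumerate(state):
            new_state[reset] += p / 2          # the run is broken
            if r + 1 < k:
                new_state[r + 1] += p / 2      # the run grows but is still short
            # otherwise the run reaches k and leaves the "no streak" states
        state = new_state
    return 1.0 - sum(state)

print(f"5 heads in a row, 100 tosses: {prob_streak(100, 5):.4f}")
print(f"5 of either face in a row, 100 tosses: {prob_streak(100, 5, either_face=True):.4f}")
```

Comparing the two printed numbers shows how much the answer depends on whether tails streaks count, too.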

This work is licensed under a Creative Commons License.