See section 7.2 of Freund
Karl Friedrich Gauss invented this distribution when he was analyzing the errors he made observing star positions. He reasoned that he was more likely to make small errors than large ones, that he was just as likely to be on one side as the other, and that the larger the error, the less likely. What he came up with is often called the normal distribution, but sometimes called the Gaussian in his honor. I like to call it the Gaussian not only to honor him but because I don't like the implication that every other distribution is abnormal.
Nevertheless, the Gaussian is incredibly important, for reasons that will become clearer when we discuss the Central Limit Theorem (in a later lecture). For now, suffice it to say that many things are distributed in a Gaussian way.
Here's a nice visualization of the Gaussian. Note that there are a couple of typos on that site.
The PDF for the Gaussian is very similar to the exponential, the major differences being that the exponent is squared (which gets us symmetry around zero) and some constants:
pdf(x) = exp(-power(x,2)/2)/sqrt(2*π)
The preceding is called the standard normal and has a mean of 0 and a standard deviation of 1. A more general form allows a mean of μ (mu) and a standard deviation of σ (sigma):
pdf(x) = exp(-power(x-μ,2)/(2*power(σ,2)))/(σ*sqrt(2*π))
Don't let this intimidate you! It's not that bad.
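If it helps to see the formula as code, here is a minimal Python sketch of the general PDF. The function name gaussian_pdf and the example values (mean 5, standard deviation 3) are my own choices for illustration, not anything from Freund or Excel. With the default arguments it reduces to the standard normal.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density at x of a Gaussian with mean mu and standard deviation sigma.

    gaussian_pdf is a hypothetical helper name used only for this sketch.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Standard normal (mu=0, sigma=1): the peak is at x = 0.
print(gaussian_pdf(0.0))

# Squaring the exponent makes the curve symmetric around the mean.
print(gaussian_pdf(1.0), gaussian_pdf(-1.0))

# General form: the peak moves to mu, and a larger sigma makes it lower and wider.
print(gaussian_pdf(5.0, mu=5.0, sigma=3.0))
```

Note that the peak height of the general form is the standard normal's peak divided by σ, which is what the extra σ in the denominator does.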
Some convenient properties
Q: Sketch some Gaussians:
Now, here's a kick in the head: even though the Gaussian is one of the most important distributions, it has no closed-form integral! So we don't have a formula for the CDF!
Most probability and statistics courses make do with a table in the book. (These tables come from complex and tedious numerical methods that are of no interest to us.) That's what Freund has, just inside the back cover. However, we have Excel, which implements those complex, tedious numerical methods as:
normdist(x, mean, standard_deviation, cumulative)
normsdist(x)
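For anyone working outside Excel, here is a rough Python analog of those two functions. It is a sketch that leans on the standard library's math.erf (the error function) and the identity CDF(z) = (1 + erf(z/√2))/2. I am not claiming this is how Excel computes its values internally, and the Python function names simply mirror the Excel ones for convenience.

```python
import math

def normdist(x, mean, standard_deviation, cumulative):
    """Rough analog of Excel's normdist, written for illustration only.

    cumulative=True  -> Pr(X <= x), the CDF
    cumulative=False -> the height of the PDF at x
    """
    z = (x - mean) / standard_deviation
    if cumulative:
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return math.exp(-z * z / 2.0) / (standard_deviation * math.sqrt(2.0 * math.pi))

def normsdist(x):
    """Rough analog of Excel's normsdist: the standard normal CDF."""
    return normdist(x, 0.0, 1.0, True)

# Half of the area lies below the mean.
print(normsdist(0.0))

# Area within one standard deviation of the mean (roughly 0.68).
print(normsdist(1.0) - normsdist(-1.0))

# The same kind of question for a Gaussian with mean 10 and standard deviation 2.
print(normdist(12.0, 10.0, 2.0, True) - normdist(8.0, 10.0, 2.0, True))
```

The last two lines print the same number, because 8 and 12 are exactly one standard deviation on either side of the mean.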
Q: Use one of the functions above to compute the Gaussian from -3 sd to +3 sd.
Q: Plot the Gaussian for μ=10 and σ=2.
Q: Verify one of the properties of the Gaussian, above.
There you are, on a desert island, without Excel, and you need to compute a probability using Gaussian tables. Fortunately, a statistics textbook washed up on shore with you. How do you use it?
In Freund, the table gives the area from 0 to z. Some statistics texts work a little differently (say, giving the area from negative infinity to z), but that's a minor difference.
Say that you need to find the probability of a value less than 1.23. Looking up z = 1.23 in the table gives the area from 0 to 1.23:
Pr(0<X<1.23) = 0.3907
Our original question was the probability of a value less than 1.23. By symmetry, half of the area (0.5) lies below zero, so we add that:
Pr(X<1.23) = 0.5+0.3907 = 0.8907
What's the probability of a value greater than 1.23?
Pr(X>1.23) = 0.5-0.3907 = 0.1093
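If you are not on the desert island, you can check the table arithmetic above with a few lines of Python. This is only a sketch: area_zero_to_z is a name I made up, and it uses math.erf to stand in for the printed table, assuming the table lists the area from 0 to z as Freund's does.

```python
import math

def area_zero_to_z(z):
    """Area under the standard Gaussian between 0 and z.

    Stands in for a table lookup; area_zero_to_z is a hypothetical name.
    """
    return 0.5 * math.erf(z / math.sqrt(2.0))

z = 1.23
table_value = area_zero_to_z(z)   # what the table gives
below = 0.5 + table_value         # add the half of the area below zero
above = 0.5 - table_value         # the upper half minus the part from 0 to z

print(f"Pr(0<X<{z}) = {table_value:.4f}")  # Pr(0<X<1.23) = 0.3907
print(f"Pr(X<{z}) = {below:.4f}")          # Pr(X<1.23) = 0.8907
print(f"Pr(X>{z}) = {above:.4f}")          # Pr(X>1.23) = 0.1093
```

The two tail probabilities add up to 1, which is a handy sanity check whenever you work from a table.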
We commonly transform the usual Gaussian (mean=0, variance=1) to have a different mean and variance. But we can go the other way and transform our scores to the standard Gaussian, in which case everything is described with respect to how many standard deviations it is away from the mean. These are often called z-scores:
z = (x-μ)/σ
Q: Generate some random numbers using Excel, using a formula such as:
randbetween(0,25)+randbetween(0,25)+randbetween(0,25)+randbetween(0,25)
and compute the z-scores for them.
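The same exercise can be sketched in Python if you want to compare notes. Here random.randint(0, 25) plays the role of randbetween(0,25) (both include the endpoints). The sample size of 1000, the seed, and the helper names are arbitrary choices of mine, and I am assuming we use the sample's own mean and standard deviation for μ and σ.

```python
import random
import statistics

def four_randbetween(rng):
    """Python analog of adding four randbetween(0,25) values (hypothetical helper)."""
    return sum(rng.randint(0, 25) for _ in range(4))

def z_scores(values, mu, sigma):
    """Apply z = (x - mu) / sigma to every value."""
    return [(x - mu) / sigma for x in values]

rng = random.Random(42)  # fixed seed (arbitrary) so repeated runs match
samples = [four_randbetween(rng) for _ in range(1000)]

# Assumption: use the sample's own mean and standard deviation as mu and sigma.
mu = statistics.mean(samples)
sigma = statistics.pstdev(samples)
zs = z_scores(samples, mu, sigma)

print("mean:", mu, "standard deviation:", sigma)
for x, z in zip(samples[:5], zs[:5]):
    print(x, round(z, 2))

# After the transformation the scores have mean 0 and standard deviation 1
# (up to floating-point rounding).
print(statistics.mean(zs), statistics.pstdev(zs))
```

Each z-score reads directly as "this many standard deviations above (or below) the mean," whatever the original scale was.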
There are two main ways we can compute the probability of some event (remember, an event is a subset of the number line).
We'll do the latter now. These often involve using some facts about the distribution, such as the fact that it is symmetrical around zero.
This work is licensed under a Creative Commons License.