The Exponential Distribution

An extremely important distribution in modeling is the exponential distribution. Mathematically, it's defined as:

pdf(x) = (1/μ)exp(-x/μ) for x ≥ 0 (and 0 for x < 0)

The use of the Greek letter μ here is on purpose: the mean of this distribution is μ. The mean is also the only parameter of the distribution.
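If you'd like to convince yourself that μ really is the mean, here is a minimal Python sketch (not part of the Excel exercises). It evaluates the pdf and approximates the integral of x·pdf(x) with a simple midpoint rule. The function names, the cutoff at 50μ, and the step count are illustrative assumptions, not anything prescribed by these notes.

```python
import math

def exponential_pdf(x, mu):
    """Density of the exponential distribution with mean mu (zero for x < 0)."""
    if x < 0:
        return 0.0
    return math.exp(-x / mu) / mu

def numerical_mean(mu, upper_multiple=50, steps=200_000):
    """Approximate the integral of x * pdf(x) over [0, upper_multiple * mu].

    Assumption: the tail beyond 50 means is negligible (it is about exp(-50)).
    """
    upper = upper_multiple * mu
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx          # midpoint of the i-th slice
        total += x * exponential_pdf(x, mu) * dx
    return total

for mu in (0.5, 1, 2):
    # The height at x = 0 is 1/mu, and the numerical mean should be close to mu.
    print(mu, exponential_pdf(0, mu), numerical_mean(mu))
```

Each printed row should show a numerical mean very close to the μ that went in.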

We've actually seen this function before: it's the exponential decay function!

Q: Using Excel, graph this for three different values of μ, say 0.5, 1, and 2. Put the three functions on a single chart so that we can easily compare them.

In simulation, the exponential is often used for things like waiting times: in general, the amount of time you wait for something to happen.

Another reason that the exponential is so important is that it is memoryless. This term means that, if you've already been waiting for a few minutes, the amount of time you expect to keep waiting is the same as it was when you first started waiting.

Example: You're playing craps and your point is a 4. You've tossed the dice three more times and not gotten a 4. Someone else comes up and bets the Pass line (bets on you to roll a 4 before rolling a 7). The expected number of rolls before you roll a 4 is the same for each of you, even though you've been waiting for a while.

Example: The buses in a town have no schedule: they just wander around on their routes. Suppose that the distribution of interarrival times is exponential. You come to a bus stop and wait. Two minutes later, someone else arrives. You both have the same expected time to wait for the bus, even though you have been waiting longer.

This memoryless property can be defined mathematically:

Pr(X > s+t | X > t) = Pr(X > s)

which we read as "the probability that you'll wait more than s+t minutes given that you've already been waiting more than t minutes is equal to the probability that you'll wait more than s minutes." Translating yet again, this means that "the probability that you'll wait more than s additional minutes given that you've already been waiting t minutes is the same as the probability that you'd wait for more than s minutes from the beginning."

Although this seems weird, it turns out to be surprisingly common.
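The memoryless property can also be checked by simulation. The following is a rough Python sketch, assuming a mean of 2 minutes, s = 1.5, t = 1, and a fixed random seed (all arbitrary choices for illustration). It draws exponential waiting times with the standard library's random.expovariate, which takes the rate 1/μ rather than the mean, and compares the two sides of the equation.

```python
import random

def survival(samples, s):
    """Estimate Pr(X > s) as the fraction of samples larger than s."""
    return sum(1 for x in samples if x > s) / len(samples)

def conditional_survival(samples, s, t):
    """Estimate Pr(X > s + t | X > t): keep only waits longer than t,
    then ask what fraction of those also exceed s + t."""
    still_waiting = [x for x in samples if x > t]
    return sum(1 for x in still_waiting if x > s + t) / len(still_waiting)

rng = random.Random(12345)          # fixed seed so the run is repeatable
mu = 2.0                            # assumed mean waiting time
samples = [rng.expovariate(1 / mu) for _ in range(200_000)]

s, t = 1.5, 1.0
lhs = conditional_survival(samples, s, t)   # Pr(X > s+t | X > t)
rhs = survival(samples, s)                  # Pr(X > s)
print(lhs, rhs)
```

The two printed numbers should agree up to sampling noise; changing t should not move the first number much.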

Here's an interesting connection:

The interarrival times of events in a Poisson process are exponentially distributed

More specifically:

The interarrival times of events in a Poisson process with rate r are exponentially distributed with mean 1/r
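One way to see this connection is to simulate it. The sketch below is only an approximation under stated assumptions: it fakes a Poisson process with rate 2 by scattering a fixed number of events (rate × horizon) uniformly at random over a long time window, instead of drawing the count from a Poisson distribution. It then looks at the gaps between consecutive events. For an exponential distribution, the fraction of values larger than the mean is exp(-1), about 0.37, so that is what the gaps are compared against. The function name, rate, horizon, and seed are all made up for illustration.

```python
import math
import random

def simulate_interarrival_times(rate, horizon, rng):
    """Scatter events uniformly over [0, horizon] and return the gaps between them.

    Assumption: the number of events is fixed at rate * horizon rather than
    being drawn from a Poisson distribution; for a long horizon the
    difference is small.
    """
    n_events = int(rate * horizon)
    arrivals = sorted(rng.uniform(0, horizon) for _ in range(n_events))
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]

rng = random.Random(2024)           # fixed seed so the run is repeatable
rate = 2.0                          # events per unit time
gaps = simulate_interarrival_times(rate, horizon=10_000, rng=rng)

mean_gap = sum(gaps) / len(gaps)
frac_longer_than_mean = sum(1 for g in gaps if g > 1 / rate) / len(gaps)

print(mean_gap, 1 / rate)                       # mean gap vs. 1/r
print(frac_longer_than_mean, math.exp(-1))      # tail fraction vs. exp(-1)
```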

Q: What does that bizarre sentence mean in ordinary English?

The Cumulative Distribution Function (CDF)

As we know, the probability of an event in a continuous distribution is equal to the area under the curve of the PDF for the given event. So far, we've been finding the area by geometry calculations. What about the exponential? No geometry will help us here. However, if we had the integral of the PDF, we could find areas pretty easily. Here's the basic idea:

The CDF is the definite integral from negative infinity to x of the PDF.

Q: What the heck does that mean?

Q: Draw a picture of this and spend some time understanding it.

Q: What's the CDF of the standard uniform distribution?

Q: What's the CDF of the triangular distribution from 0 to 1 with a mode at 0.5?

Okay, given a CDF, what good is it? The answer is that we can use it to compute the probability of an event with a simple subtraction:

Pr(a < x < b) = cdf(b) - cdf(a)
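Here is a small Python sketch of the subtraction rule. To avoid giving away the exercises, it uses a made-up distribution that does not appear elsewhere in these notes: pdf(x) = 2x on the interval from 0 to 1, whose CDF is x^2. It computes Pr(0.2 < x < 0.5) two ways, once by subtracting CDF values and once by adding up thin slices of area under the PDF.

```python
def pdf(x):
    """Illustrative density (an assumption, not from the notes): 2x on [0, 1]."""
    return 2 * x if 0 <= x <= 1 else 0.0

def cdf(x):
    """Integral of the density above from negative infinity to x."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return x * x

def area_under_pdf(a, b, steps=10_000):
    """Midpoint-rule approximation of the area under pdf between a and b."""
    dx = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * dx) for i in range(steps)) * dx

a, b = 0.2, 0.5
by_subtraction = cdf(b) - cdf(a)
by_area = area_under_pdf(a, b)
print(by_subtraction, by_area)
```

Both numbers should come out at about 0.21, which is 0.25 minus 0.04.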

Q: Draw a picture of this.

Q: What's the probability of a number between 0.4 and 0.6 in the triangular distribution we did above?

Integrating the Exponential Distribution

Okay, then, what's the cdf of the exponential? It's:

cdf(x) = 1 - exp(-x/μ) for x ≥ 0 (and 0 for x < 0)

which is surprisingly simple as these things go.
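If you don't want to take the formula on faith, it can be compared against a brute-force numerical integral of the PDF. The Python sketch below does that for an assumed mean of 3 (chosen arbitrarily) at a few values of x. Since the PDF is zero for negative x, the integral only needs to start at 0. The function names and step count are illustrative.

```python
import math

def exponential_pdf(x, mu):
    """Density of the exponential distribution with mean mu (zero for x < 0)."""
    return math.exp(-x / mu) / mu if x >= 0 else 0.0

def exponential_cdf(x, mu):
    """Closed-form CDF: 1 - exp(-x/mu) for x >= 0, and 0 otherwise."""
    return 1 - math.exp(-x / mu) if x >= 0 else 0.0

def integrate_pdf(x, mu, steps=100_000):
    """Midpoint-rule approximation of the area under the PDF from 0 to x."""
    dx = x / steps
    return sum(exponential_pdf((i + 0.5) * dx, mu) for i in range(steps)) * dx

mu = 3.0                            # assumed mean, for illustration only
for x in (0.5, 3.0, 10.0):
    # The closed form and the numerical area should match closely.
    print(x, exponential_cdf(x, mu), integrate_pdf(x, mu))
```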

Q: Graph this using Excel. Make sure this makes sense to you.

Q: With an exponential distribution with mean 2, what's the probability of a number less than 1? Between 1 and 2?

Q: What's the median of this distribution? You can find this by trial and error if you don't remember how to algebraically manipulate exponentials.

Solutions

Excel file for the conception problem: conception.xls

Excel file for the triangular distribution: triangular-cdf.xls

Excel file for the exponential distribution: exponential-cdf.xls

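pdf of given triangular distribution has two cases:

f(x) = 4x for 0 ≤ x ≤ 0.5

f(x) = 4 - 4x for 0.5 < x ≤ 1

integral of f(x) = 4x is F(x) = 2x^2 + C

integral of f(x) = 4 - 4x is F(x) = 4x - 2x^2 + C

cdf of given triangular distribution has two cases:

F(x) = 2x^2 for 0 ≤ x ≤ 0.5

F(x) = 4x - 2x^2 - 1 for 0.5 < x ≤ 1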

Why is that latter one correct?

This work is licensed under a Creative Commons License.