Reading: Section 2.4 on Graphical Presentations
Charts and Graphs
Presenting data is important, and graphs, charts, histograms and other
visual presentations are very important. Being able to visualize data
helps to focus on what is important: outliers, peaks and valleys, trends
of growth or decline, and so forth.
Let's start with a spreadsheet of data about the application of the death penalty in the US.
Please download the Excel file and open it in Excel.
On January 1, 2006, The Boston Globe Magazine published a cover
story on the occasion of the 1000th execution in the U.S. since the
Supreme Court reinstated the death penalty in 1976. The article had a
breakdown of that data by state, which is interesting as a dataset.
Thanks to Irene Laursen in the Wellesley's Science Library, we have that
data, plus additional detail. Take a few minutes to look over the data.
Here are some additional links:
Line graphs are useful when the x-axis is ordered in some way.
The most common way for it to be ordered is time. The graphs
we've seen in Extend for the amount of money or the number of rabbits or
whatever are interesting because of how they change over time,
and a Line Graph is a good way of doing this.
Let's graph the number of executions in the U.S. since the
reinstatement of the death penalty in 1976. To get started, choose
"Insert / Chart" or click on the icon of a bar chart. This starts the
"chart wizard," which helps you create a chart in four steps.
- Chart Type:
- We'll take a minute to discuss the different types, but for this
example, choose "Line"
- We'll also take a minute to discuss the different sub-types, but for
now, choose the one in the upper left (also "Line")
- click "next"
- Chart Source Data:
- Click in the "Data Range" box and then select the row with the data
you want. By data, it means the y-axis numbers. The data
can be in a column if you want; it doesn't matter.
- For this example, choose the row with "U.S. Total" in it. When you're
done, your data range formula should look like "Sheet1!$c$13:$AE$13"
- click "next"
- Chart Options. These are mostly labels and other non-data stuff
- The Chart Title can only be a fixed string. Make up an appropriate
one, such as "U.S. Executions since 1976"
- The X axis can only be a fixed string. Type in "year"
- The Y axis can only be a fixed string. Type in "executions"
- We'll leave the other options at their defaults for now.
- click "next"
- Chart Location. The chart can either be placed in an existing sheet or a new sheet.
The default is reasonable.
Look at the result. Feel free to compare it with the chart under
"Chart1" in the tabs below the spreadsheet. Notice:
- The X axis is 1-29, which is not nearly as usful as knowing the
particular year. How can we change this?
- The Y axis goes from 0 to 120, but none of the data are over 100, so
maybe that would be a better choice. How can we change this?
- The legend calls this line "Series1," which isn't very informative.
How can we change this?
- The line is light blue. What if we wanted a different color?
You should see a little window floating in front of your plot. Some of
the things we are interested in changing can be changed there. For
example, to change the line color, pull down the menu and choose "Series
1". The little icon next to the pull-down menu brings up a window called
"Format Data Series." Alternatively, you can try double-clicking on the
line itself. From the Format Data Series window, you can choose different
colors and such.
If you double-click the "120" on the y-axis, you can bring up a "Format
Axis" window, which has a tab for "scale." Change it to 100.
We'll fix the others by doing them right the first time. (We could
also do it after the fact, but I think it's better and easier to see how
to get it right the first time.) So we'll re-do our chart. Delete the
first one and then:
- Same choices for the chart type
- For the Chart Data
- click on "series" at the top of the window. This is used if there is
more than one dependent variable (y-axis value) and if there is a
non-trivial x axis variable.
- By default, Excel has grabbed a bunch of rows, with each a
series. This is actually pretty good. Let's keep it for now.
- Click below that in the box next to "Category (X) axis labels:" and
select the C11:AE11 range.
- Click "finish," since we're just curious about what this will do.
Pretty cool, hunh? Notice that the X axis is now correct. However,
the Y data is more than we want, at least for this graph. Let's fix that.
- Click on "Chart/Source Data" and you'll get a window much like step 2
of the chart wizard.
- Click on "series" to switch to that tab
- Delete the series you don't want
- Add the series you do want. You might want only the total, in which
case the "Name" box will be "=Sheet!$A$42" and the "Values" box will be
- Click on OK when you are done. Feel free to compare it with the chart
under "Chart2" in the tabs below.
Examine the result. Looking at the breakdown by the three major
regions is worthwhile. This is a good time to consider the other chart
sub-types. You can change chart sub-types with "Chart/Chart Type."
Another way to display data is via a bar chart. A bar chart is more
appropriate when there isn't necessarily any order to the X axis, or,
even if there is, "trends" are not the point. For example, we could look
at the row totals, broken down by region or even by state.
Executions by Region
Use the chart wizard again.
- Chart Type: Choose "Column," and choose the sub-type in the upper left ("clustered
- Chart Source Data. You can see that Excel has already made some
"guesses." These will typically be useless, unless you have a simple
spreadsheet. Here, delete it's notion of the data, and replace it with
four cell addresses, separated by commas:
- B13 (the U.S. total)
- B17 (the northeast)
- B28 (the midwest)
- B42 (the south)
Then, switch to the "series" tab and choose a "name" cell for each
series, namely the data in column A.
- Chart Options: fill in the title and axis labels as you see fit.
- Chart Location: put it anywhere you like. Compare with "Chart3"
Notice that the data from the Department of Justice doesn't include the
2005 data. I added a column to compute the totals through 2005; it's
Executions by State
Redo the chart, building a bar chart by state. Unfortunately, Excel's
desire to guess something useful will work against us. To see, let's try it:
- Same choice for the first step
- For the second step, try to get the data ranges correct. It's a pain,
especially since there are 50+ of them.
Start over. This time, copy the first two columns (starting at US
total) to a blank sheet.
- Same choice for the first step
- For the second step, choose "series in columns" and switch to the
series tab. Then, amend the formula for values for Series1 to be the B
column (B1:B62) and the Category (X) axis labels to be the A column
- Choose an appropriate chart title and axis labels
- Finish and compare with Chart4
Having the state labels vertical is hard to read. This is a good case
for the horizontal orientation.
- Switch the chart type to "Bar"
- Notice that not all of the states are listed. Double-click on the
y-axis labels and you'll see a "Format Axis" window. Under the "Scale"
tab, make sure "number of categories between tick mark labels" is 1.
- Reduce the font so that you can read all the text.
- Compare to "Chart5"
Consult Flick Coleman's histogram
tutorial in Flash.
Note that it's actually not necessary to sort the data; the histogram
tool will sort a copy of the data.
Do you agree with summarizing by region?
How would you summarize the data about the 1000 total executions since 1976?
- What is the mean?
- What is the median?
- What is the mode?
Sheet3 has the data with just the basic 52 regions (50 states, plus DC
You could sort the data by state name or by magnitude. Let's try both.
- On Sheet 3, notice that I added a header row with column labels
(region name and executions). This can be helpful with sorting, but it's
- Select the whole sheet (command-A or the little diamond) then choose
"Data/Sort." Notice the option where you can say whether the data has a
header row (in case Excel guesses wrong).
- Choose to sort by region name, ascending.
- Make a chart of the data. What sticks out?
- Sort by number of executions and make another chart.
Assume you can report several numbers, not just one. What do you
- What would you report if you could report only one number?
- What would you report if you could report only two numbers?
- What would you report if you could report only three numbers?
- What would you report if you could report only four numbers?
Look at the chart Executions
by Year 1608-2000? (It's at the bottom of the page.) What do you
think of it? Why?
More Chart Analysis
I'll hand out a copy of the January 30, 2006 Newsweek cover story on
boys and girls, which has two charts on the bottom of page 52 that I think
are worth trying to re-do. Please do them.
If you want to compare, here is my charts for the boys and girls reading and
What is the effect of zooming in on the data?
This work is licensed under a Creative Commons