Assignment 3

Due: Thursday, February 27 by 5:00pm

You can turn in your assignment up until 5:00pm on 2/27/20. You should hand in both a hardcopy and electronic copy of your solutions. Your hardcopy submission should include printouts of three code files: grades.m, energy.m, and recognize.m. To save paper, you can cut and paste all of your code files into one script, but your electronic submission should contain the separate files. Your electronic submission is described in the section How to turn in this assignment. If you need an extension on this assignment, please see the Late Assignment Policy on the Course Information Page. Keep in mind that the first exam will take place in class on Monday, March 2, and will include topics explored in this assignment.

This assignment contains one programming exercise and two extended problems. You will start working on the exercise with a partner in class and should complete the assignment with that partner. We ask that your partner for this assignment be different from the partners you worked with on the first two assignments. (Starting with Assignment 4, you'll be able to choose any partner(s), including those you worked with previously.)

Reading

The following material from the fifth or sixth edition of the text is especially useful to review for this assignment: pages 39-53, 63-68, 72-78. You should also review notes and examples from Lectures #6-8.

Getting Started: Download assign3 folders

Use Cyberduck to download the assign3_exercises folder. This folder contains one image file for an exercise that you'll complete in class, and one code file for the exercise in this assignment. Later, to begin your work on the programming problems, download the assign3_problems folder.

Uploading your completed work

When you have completed all of the work for this assignment, your assign3_exercises folder should include the code file for the exercise, grades.m. Your assign3_problems folder should contain two code files named energy.m and recognize.m. Use Cyberduck to upload your assign3_exercises and assign3_problems folders into the cs112/drop/assign03 folder. More details about this process can be found on the webpage on Managing Assignment Work.

Exercise: Working with gradesheets

The following table provides the semester grades for six CS112 students:

   
Name     Homework  Participation  Exam 1  Exam 2  Final Project
Camila   83        90             76      80      88
Barb     79        80             80      81      70
Shana    90        90             92      98      94
Diana    72        75             63      75      73
Nour     88        90             78      89      82
Alisha   67        65             69      72      80

Open the file called grades.m in the assign3_exercises folder. This file creates a matrix named scores that contains the information in the above table, and two cell arrays called names and measures that store the names of the students and the components of the grade. Add code to the grades.m code file to perform the tasks listed below. Try to write code that is as compact as possible but still readable and understandable.

  • Plot the Exam 1 and Exam 2 scores for the six students on a single graph, with the y-axis ranging from 50 to 100.
  • Add a new column to the scores matrix with each student's final course score. The course score is calculated in CS112 as follows: Homework counts 40%, Participation 5%, Exams are each 20% and the Final Project is 15%.
  • Create a new vector avgScores with the average (across students) of each component of the final grade, as well as the average final course grade.
  • Print the names of the students whose final course score is higher than the class average.
  • Print the names of the grade components that received the highest and lowest average scores.
  • Plot the final course grades for the six students on the same graph as the Exam 1 and 2 scores, and add a horizontal line across the graph whose height is the average final course grade (for reference).
  • Add a title, axis labels and legend to the graph.

Add comments to your grades.m file describing each of the above tasks, and also add comments with both your name and your partner's name.
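The two MATLAB techniques that several of these tasks rely on, a weighted sum computed with a matrix-vector product and logical indexing into a cell array, can be sketched on toy data. Everything below (toyScores, toyNames, weights and their values) is hypothetical and deliberately not the data or weights from this exercise; it is a minimal illustration, not the expected solution.

```matlab
% Toy data (hypothetical): 3 students x 2 components, NOT the table above
toyScores = [80 90; 60 70; 100 100];
toyNames = {'Ann', 'Bea', 'Cy'};
weights = [0.5; 0.5];              % hypothetical weights that sum to 1

% Matrix-vector product gives one weighted total per row (student)
totals = toyScores * weights;      % 3 x 1 column

% Append the totals as a new last column
toyScores = [toyScores totals];

% mean works down each column, giving one average per column
avgs = mean(toyScores);

% Logical indexing into the cell array selects the above-average names
aboveAvg = toyNames(totals > avgs(end));
disp(aboveAvg)
```

With these toy values the totals are 85, 65 and 100, so the names selected are Ann and Cy.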

Problem 1: Energy production and consumption

In this problem, you will explore data on U.S. energy production and consumption using energy statistics available from the Energy Information Administration (EIA) within the U.S. Department of Energy. The assign3_problems folder in the course download directory contains two Excel spreadsheets, production.xls and consumption.xls, that were downloaded from the Energy Overview page at the EIA website. These spreadsheets contain both numerical and textual data related to the production and consumption of various energy sources over the years 1949-2006, which you can view by opening these files in Excel (the current government site includes data through most of 2019, but we will just use an earlier subset of the data for this problem). Unfortunately, a separate MATLAB toolbox is needed to load data from complex spreadsheets such as these, and the public computers at Wellesley do not currently have this toolbox. We can, however, read Excel spreadsheets that contain only numerical data into MATLAB. Two additional files, produce.xls and consume.xls, contain most of the numerical data from the original EIA spreadsheets. These files can be read into MATLAB using the xlsread function, for example:

productionData = xlsread('produce.xls');

The contents of both produce.xls and consume.xls will be loaded into matrices with 58 rows corresponding to the years 1949-2006. The matrix created from produce.xls has 14 columns representing (1) year, (2) coal, (3) natural gas, (4) oil, (5) NGPL (natural gas plant liquids), (6) total fossil fuels, (7) nuclear, (8) hydroelectric, (9) geothermal, (10) solar, (11) wind, (12) biomass, (13) total renewable energy and (14) total energy produced. The matrix created from consume.xls contains 13 columns representing (1) year, (2) coal, (3) natural gas, (4) oil, (5) total fossil fuels, (6) nuclear, (7) hydroelectric, (8) geothermal, (9) solar, (10) wind, (11) biomass, (12) total renewable energy and (13) total energy consumed. The file population.mat in the assign3_problems folder contains a column vector of the U.S. population over the years 1949-2006.
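As a rough illustration of how a column layout like this is used, the sketch below pulls individual columns out of a small stand-in matrix. The matrix toyProduction, its made-up values, and the constants YEAR and COAL are assumptions for illustration only; the real matrix comes from xlsread and has 58 rows.

```matlab
% Stand-in for the matrix returned by xlsread('produce.xls'):
% made-up values, only 3 rows and the first 4 of the 14 columns
toyProduction = [1949 10 5 11;
                 1950 12 6 12;
                 1951 13 7 13];

YEAR = 1; COAL = 2;   % hypothetical named constants for column numbers

years = toyProduction(:, YEAR);   % all rows, column 1
coal  = toyProduction(:, COAL);   % all rows, column 2

% size reports the number of rows and columns
[numRows, numCols] = size(toyProduction);
fprintf('%d rows, %d columns, years %d-%d\n', ...
        numRows, numCols, years(1), years(end));
```

Checking size on the real matrices is a quick way to confirm that each file loaded with the expected 58 rows and 14 or 13 columns.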

Create a script file named energy.m in the assign3_problems folder that reads in the contents of the produce.xls, consume.xls and population.mat files and performs the following tasks:

Task 1: Plot the raw data

In a single figure window, plot the following data in one graph: amount of coal, gas, oil, NGPL, nuclear and total renewable energy produced, and the amount of coal, gas and oil consumed (note that the U.S. consumes all of the renewable energy sources that it produces). Plot this data as a function of the year. There should be 9 line plots drawn in a single plotting area. Complete this code without creating any additional variables - each call to the plot function should refer directly to the two matrices storing the data. Use one line style for all of the production data and a different line style for all of the consumption data, and use different colors for the plots. (You should specify the colors explicitly in your code, rather than allowing MATLAB to choose the colors automatically.) Add a title, axis labels and legend for the graph. Note that you can drag the corners of the figure window to expand its size, and drag the legend to a new location if desired.
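The plotting conventions described in this task (several lines in one plotting area, explicit colors, one line style per group, columns indexed directly in each plot call) might look like the following on toy data. The matrices toyProduced and toyConsumed and all of their values are made up for illustration, and the choice of solid versus dashed lines is just one possibility.

```matlab
% Made-up stand-ins for the two data matrices (column 1 = year)
toyProduced = [2000 5 9; 2001 6 8; 2002 7 7];
toyConsumed = [2000 4 10; 2001 5 11; 2002 6 12];

figure
hold on   % keep earlier lines when adding later ones
% Solid lines for production, dashed for consumption; colors set explicitly
plot(toyProduced(:,1), toyProduced(:,2), 'b-',  'LineWidth', 2)
plot(toyProduced(:,1), toyProduced(:,3), 'r-',  'LineWidth', 2)
plot(toyConsumed(:,1), toyConsumed(:,2), 'b--', 'LineWidth', 2)
plot(toyConsumed(:,1), toyConsumed(:,3), 'r--', 'LineWidth', 2)
hold off

title('Toy production and consumption')
xlabel('year')
ylabel('amount (arbitrary units)')
legend('A produced', 'B produced', 'A consumed', 'B consumed')
```

The legend entries are listed in the same order as the plot calls, which is how MATLAB pairs each label with its line.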

Task 2: Graphical analyses of the data

Open a second figure window and create four graphs that display the following information for each year. In each case, the year can be plotted on the x-axis. Use subplot to define a 2 x 2 grid of plotting areas for drawing the four graphs.

  1. the fraction of total energy consumed that is produced in the U.S.
  2. the amount of foreign oil that was needed to support the demand (i.e. the difference between oil consumed and produced)
  3. the percentage of energy consumed that comes from renewable sources
  4. the per capita total consumption of all energy sources combined

These observations are unfortunately a bit disturbing...
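For readers new to subplot and element-by-element arithmetic, here is a small sketch of both on toy vectors. The vectors years, produced, consumed and people, and all of their values, are hypothetical; the point is only the ./ operator and the subplot numbering, not the actual analysis.

```matlab
% Made-up column vectors standing in for columns of the real data
years    = (2000:2003)';
produced = [8; 9; 9; 10];
consumed = [10; 12; 12; 16];
people   = [100; 100; 120; 160];         % hypothetical population counts

figure
subplot(2, 2, 1)                         % top-left cell of a 2 x 2 grid
plot(years, produced ./ consumed)        % ./ divides element by element
title('ratio')

subplot(2, 2, 2)                         % top-right cell
plot(years, consumed - produced)         % element-by-element difference
title('difference')

subplot(2, 2, 3)                         % bottom-left cell
plot(years, 100 * produced ./ consumed)  % ratio scaled to a percentage
title('percentage')

subplot(2, 2, 4)                         % bottom-right cell
plot(years, consumed ./ people)          % amount per person
title('per person')
```

Subplot cells are numbered across each row first, so cells 1 and 2 form the top row and cells 3 and 4 the bottom row.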

Add comments to your energy.m code file to document your code, and include both your name and your partner's name.

Problem 2: Recognizing famous Wellesley alums

Wellesley College is proud to have some very distinguished alumnae! For this problem, you'll write a program to recognize the faces of four of our special graduates: Madeleine Albright '59, Jane Bolin '28, Pamela Melroy '83, and Soong Mei-Ling '17. The assign3_problems folder contains four face images whose identities are assumed to be known (albright.jpg, bolin.jpg, melroy.jpg, soong.jpg) and four face images to be recognized by your program (face1.jpg, face2.jpg, face3.jpg, face4.jpg). The file recognize.m contains initial code that loads the 8 face images into variables in the MATLAB workspace and displays the "known" face images using subplot and imshow:

[Figure: figure window with 4 face images]

The initial code also creates a cell array that stores the four names of our Wellesley alumnae. An individual name can be accessed in a cell array using an index, as shown in the following code snippet (note the curly braces around the index!):

>> names = {'Ellen', 'Stella', 'Hannah', 'Nicole'};
>> disp(['Our lab instructor is ' names{2}])
Our lab instructor is Stella

Add code to recognize.m that displays the four "unknown" face images in a 2 x 2 arrangement, similar to that shown above, with a title above each image indicating its variable name (face1, face2, etc.). Use the figure command to display these four images in a new figure window.
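A 2 x 2 arrangement of titled images can be sketched with toy matrices in place of the face images. The variables imgA through imgD are hypothetical stand-ins, and the layout below is only one way to combine figure, subplot, imshow and title.

```matlab
% Toy grayscale "images" (hypothetical): values from 0 (black) to 1 (white)
imgA = zeros(8);
imgB = ones(8);
imgC = repmat(linspace(0, 1, 8), 8, 1);   % left-to-right brightness ramp
imgD = imgC';                              % top-to-bottom brightness ramp

figure                                     % opens a new figure window
subplot(2, 2, 1), imshow(imgA), title('imgA')
subplot(2, 2, 2), imshow(imgB), title('imgB')
subplot(2, 2, 3), imshow(imgC), title('imgC')
subplot(2, 2, 4), imshow(imgD), title('imgD')
```

Calling figure before the first subplot keeps these images from overwriting whatever is shown in an existing figure window.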

One strategy we can use to recognize an unknown image is to measure the difference between the pattern of brightness values in the unknown image and the patterns of brightness in each of a set of known images. We can then select the known image that represents the closest match. Consider a very simple example where we have two known image patterns corresponding to a vertical or horizontal edge, as shown below:

[Figure: edge images]

Suppose we are given a "mystery" image and want to determine whether it has a vertical or horizontal edge pattern:

[Figure: mystery image]

We can first calculate the element-by-element difference between each known image and our new mystery image:

[Figure: difference images]

We are really only interested in the amount of difference between the two patterns, so we can take the absolute value of the differences (this can be done with the abs function in MATLAB):

[Figure: absolute difference images]

On average, the brightness values in the mystery image differ from the brightness values in the vertical image by only 0.2, while the brightness values differ from those in the horizontal image by 0.5 (on average). Thus there is a closer match between our mystery image and the vertical image, so we recognize it as a vertical edge.
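The edge example can be reproduced in a few lines of MATLAB. The 4 x 4 pixel values below are assumptions chosen so that the averages come out to the 0.2 and 0.5 quoted above; the actual images pictured in this handout may differ.

```matlab
% Assumed 4 x 4 patterns: 0 = dark, 1 = bright
vertical   = repmat([0 0 1 1], 4, 1);      % dark left half, bright right half
horizontal = repmat([0; 0; 1; 1], 1, 4);   % dark top half, bright bottom half
mystery    = repmat([0.2 0.2 0.8 0.8], 4, 1);

% Element-by-element absolute differences
diffVertical   = abs(mystery - vertical);
diffHorizontal = abs(mystery - horizontal);

% (:) flattens a matrix into one column, so mean returns a single number
avgVertical   = mean(diffVertical(:));     % about 0.2
avgHorizontal = mean(diffHorizontal(:));   % about 0.5

if avgVertical < avgHorizontal
    disp('mystery image matches the vertical edge')
else
    disp('mystery image matches the horizontal edge')
end
```

The smaller average wins, so with these assumed values the mystery image is classified as a vertical edge.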

Recognizing the unknown faces

We can use the above strategy to calculate how well the patterns of brightness match between two face images. We can then determine which of the known faces is the "best match" to each unknown face, to discover its identity. First add code to recognize.m to implement the following steps to recognize the face1 image:

  • Calculate the average absolute difference (as described above) between the face1 image and each of the four known face images stored in the variables albright, bolin, melroy, and soong. Store these four values in a vector, in the order that these names are listed here.
  • Determine which location of the vector contains the smallest value (the least difference, or best match). Hint: the min command can return two values, the minimum value in a vector and the index of the first occurrence of this minimum value, as shown in the following example:
    >> nums = [4 1 8 2];
    >> [minValue, minIndex] = min(nums)
    minValue =
         1
    minIndex =
         2
  • Use the index of the minimum value together with the names cell array, to get the name associated with this face, and print a message that contains this name.
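One detail worth checking when implementing these steps: if the images are stored as uint8 (the usual type for images read from JPEG files, though the starter code may already convert them), subtraction saturates at 0 and negative differences are silently lost. The sketch below shows the issue and a possible workaround on toy data. The arrays unknown, knownA and knownB and the cell array labels are hypothetical.

```matlab
% Toy uint8 "images" (hypothetical pixel values)
unknown = uint8([10 200; 30 40]);
knownA  = uint8([200 10; 30 40]);
knownB  = uint8([12 198; 33 41]);
labels  = {'A', 'B'};        % hypothetical names, same order as the vector

% uint8 subtraction saturates at 0, so the negative difference is lost
lossy = unknown - knownA;    % 10 - 200 becomes 0 rather than -190

% Converting to double first keeps the sign, so abs works as intended
diffs = [mean(mean(abs(double(unknown) - double(knownA)))), ...
         mean(mean(abs(double(unknown) - double(knownB))))];

[~, best] = min(diffs);      % index of the smallest average difference
disp(['Best match: ' labels{best}])
```

Here the average differences are 95 and 2, so the second known image is the best match and the message names B.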

Repeat these steps to recognize the person portrayed in the other three images, face2, face3, and face4. (Cutting and pasting, followed by small modifications to the copied code, will save a lot of time!) Your program should be able to recognize each of the four faces correctly.

Add comments to your recognize.m code file, describing the code. Also add comments at the top of the file with your name and your partner's name.

How to turn in this assignment

Step 1. Complete this online form.
The form asks you to estimate your time spent on the problems. We use this information to help us design assignments for future versions of CS112. Completing the form is a requirement of submitting the assignment.

Step 2. Upload your final programs to the CS server. When you have completed all of the work for this assignment, your assign3_exercises folder should contain the code file named grades.m. Your assign3_problems folder should contain the two code files named energy.m and recognize.m. (You can keep the original image and data files in these folders.) Use Cyberduck to connect to your personal account on the server and navigate to your cs112/drop/assign03 folder. Drag your assign3_exercises and assign3_problems folders to this drop folder. More details about this process can be found on the webpage on Managing Assignment Work.

Step 3. Hardcopy submission.
Your hardcopy submission should include printouts of three code files: grades.m, energy.m, and recognize.m. To save paper, you can cut and paste your three code files into one script, and you and your partner only need to submit one hardcopy between you. If you cannot submit your hardcopy in class on the due date, please slide it under Ellen's office door.