CS 112
Assignment 2
You can turn in your assignment up until 5:00pm on 2/16/11 without penalty, but
it is best to hand in the assignment at the beginning of class. Your hardcopy
submission should include a cover sheet and printouts of five code files: lab3.m, redsox.m, colonTest.m,
smartPhones.m and analyzeData.m. Your electronic submission is described in the
section Uploading your saved work. (If you'd like to save paper, you can
cut and paste all of your code files into one script, but your electronic submission should contain
the five separate files.)
Copy the assign2_programs folder from the cs112d
directory onto your Desktop. Rename the folder to be yours, e.g. sohie_assign2_programs.
In MATLAB, set the Current Directory to be your renamed assign2_programs folder
on your Desktop.
To upload your saved work, copy the assign2_programs folder (that you renamed) into your
drop/assign02 folder. Then delete the assign2_programs folder from the Desktop by dragging
it to the trash can, and empty the trash (Finder--> Empty Trash).
When you are done with this assignment, you should have five code files stored in
your assign2_programs folder: lab3.m, redsox.m, colonTest.m,
smartPhones.m, analyzeData.m.
Create a new file in MATLAB called lab3.m (that you will turn in).
Make sure your current directory is set appropriately.
In lab3.m, first use MATLAB's
input function to prompt the user for three pieces of information:
1) a numerical month (between 1 and 12), 2) a day (between
1 and 31) and 3) the user's name. Store each of these values in a variable
(e.g. month is 7 (for July), day is 28 and name is 'Rosa'). To prompt
the user for a string, provide a second input 's' when calling
the input function:
name = input('Enter your name: ', 's'); % name is a string
Then write MATLAB statements that assign each of the following variables:
- valentine, which is true on February 14 and false otherwise.
- cs112midterm, which is true on February 23 and April 5 and false otherwise.
- springBreak, which is true between March 17 and March 25 (inclusive) and false otherwise.
- luckyDay, which is true if the month and day are both odd or both even, and false otherwise.
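As a minimal sketch of how such logical variables can be built, the example below combines comparisons with the & (and) and | (or) operators. The names july4, summer and firstOrLast are hypothetical and used only for illustration, and month and day are set directly rather than read with input.

```matlab
% Hypothetical values; in lab3.m these would come from input
month = 7;
day = 28;

% true only when both comparisons are true
july4 = (month == 7) & (day == 4)

% true for any month from June through August
summer = (month >= 6) & (month <= 8)

% true when either comparison is true
firstOrLast = (month == 1) | (month == 12)
```

Each result is a logical value (1 for true, 0 for false) that can be used directly in an if statement.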
If month and day are your birthday, then
print out a personalized birthday greeting such as "Happy Birthday to Sohie!",
otherwise print "Not Sohie's birthday". The disp function can be
used to print text that combines literal strings with variables whose value
is a string:
>> place = 'SCI 160a';
>> disp(['class will be held in ' place ' today']);
class will be held in SCI 160a today
If month is December, January or February, then print the lyrics of the
first stanza of the song, "Frosty the Snowman" (on four separate lines), otherwise print a message
of your choosing (this can be a single line).
Frosty the Snowman was a jolly happy soul,
With a corncob pipe and a button nose and two eyes made out of coal.
Frosty the Snowman is a fairy tale they say,
He was made of snow but the children know how he came to life one day.
Test your code by entering different values for month and
day and then checking that your variables contain the correct values.
Add comments containing your name and date to your lab3.m file and
upload it with your other MATLAB files in your assign2_programs folder,
when turning in Assignment 2.
Exercise 2: The Red Sox roster
In this exercise, you'll work with the following subset of data from the Boston Red Sox baseball team roster from the 2007 (World Series winning!) season:
| Player Name | Player Number | Weight | At Bats | Home Runs | Batting Average | 2007 Salary | Runs Batted In | Runs | Stolen Bases |
|---|---|---|---|---|---|---|---|---|---|
| Jason Varitek | 33 | 230 | 435 | 17 | .255 | 11,000,000 | 68 | 57 | 1 |
| David Ortiz | 34 | 230 | 549 | 35 | .332 | 13,250,000 | 117 | 116 | 3 |
| Manny Ramirez | 24 | 200 | 483 | 20 | .296 | 17,016,381 | 88 | 84 | 0 |
| J.D. Drew | 7 | 200 | 466 | 11 | .270 | 14,400,000 | 64 | 84 | 4 |
| Mike Lowell | 25 | 210 | 589 | 21 | .324 | 9,000,000 | 120 | 79 | 3 |
| Julio Lugo | 23 | 175 | 570 | 8 | .237 | 8,250,000 | 73 | 71 | 33 |
| Kevin Youkilis | 20 | 220 | 528 | 16 | .288 | 424,500 | 83 | 85 | 4 |
| Coco Crisp | 10 | 180 | 526 | 6 | .268 | 3,833,333 | 60 | 85 | 28 |
| Dustin Pedroia | 15 | 180 | 520 | 8 | .317 | 380,000 | 50 | 86 | 7 |
The file redsox.m creates 9 separate vectors, one for each numerical statistic.
Look closely at the code in redsox.m and note that the names are stored in a
cell array called names.
Think of a cell array as a special kind of vector that allows us to
store strings. Note that names is created using the curly brace { }
rather than the square bracket [ ] that we use for numerical vectors. Although a cell array is created using the curly braces, you do not need to use curly braces to access its contents. For example, here is a clip of MATLAB code accessing the contents of names:
>> powerHitter = names(homeRuns > 20)
powerHitter =
    'Ortiz'    'Lowell'
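The distinction between the two kinds of brackets can be sketched with a small example. The variables fruits and counts are hypothetical and not part of redsox.m. Indexing with parentheses returns a cell array holding the selected entries, while indexing with curly braces returns the contents of a single cell.

```matlab
% Hypothetical data, not part of redsox.m
fruits = {'apple' 'kiwi' 'mango'};
counts = [3 12 7];

% parentheses with a logical index return a (smaller) cell array
popular = fruits(counts > 5)

% curly braces return the contents of one cell, here a string
first = fruits{1}
```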
Note that you can place a semi-colon at the end of the initial assignment statements in
redsox.m to suppress the printout of the statistics.
In the exercise below, you may use MATLAB's
mean, sum, length, and any functions. Add statements to redsox.m that create the following variables:
- totalStolen with the total number of stolen bases in 2007.
- avgWt with the average weight of a Red Sox player.
- bestBatters with the name(s) of player(s) whose batting average is greater than or equal to 0.300.
- expensiveHomer that is true if any player costs more than $500,000 per home run.
- bigHitter that is true if any players hit more than 10 home runs with batting averages less than .290.
- highRBI with the name(s) of player(s) whose Runs Batted In is greater than or equal to Runs.
- bigBatter that contains the number(s) of the player(s) with more than 550 at bats or more than 20 stolen bases.
- weightInGold that contains the number of players who are paid more than $25,000 per pound of body weight in the 2007 season.
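The functions listed above combine naturally with logical vectors. The following sketch uses hypothetical quiz data (scores and hours are not part of redsox.m) to show counting with sum, testing with any, and element-by-element division with the ./ operator.

```matlab
% Hypothetical quiz data, not part of redsox.m
scores = [72 88 95 60];
hours = [2 5 6 1];

avgScore = mean(scores)         % average of all scores
numHigh = sum(scores > 80)      % number of scores above 80
anyLow = any(scores < 65)       % true if at least one score is below 65

perHour = scores ./ hours;      % element-by-element division
bestRate = any(perHour > 50)    % true if any score per hour exceeds 50
```

Applying sum to a logical vector counts its true entries, which is why numHigh is a count rather than a total of scores.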
Add comments to your code so that it is clear and easy to read. Always include your name and date
at the top of each file. Save your final version of redsox.m in your
assign2_programs folder to upload to the cs server.
In lecture you learned how to use colon notation to specify a sequence of regularly spaced numbers. You also learned how to use indexing to read and store values in specific locations of a vector. This exercise combines these two concepts. Colon notation can be used to specify an evenly spaced sequence of vector indices or contents, as shown in the following examples:
>> nums = 1:8
nums =
     1     2     3     4     5     6     7     8
>> nums(1:2:5) = 10
nums =
    10     2    10     4    10     6     7     8
>> nums(2:3:8) = [13 9 16]
nums =
    10    13    10     4     9     6     7    16
>> nums2 = nums([1:3 6:8])
nums2 =
    10    13    10     6     7    16
>> nums([1:3 6:8]) = 12:-2:2
nums =
    12    10     8     4     9     6     4     2
The following program, colonTest.m, is contained in your assign2_programs
folder. Follow the instructions in the comments to rewrite the existing code statements and add five
additional statements that use colon notation:
% colonTest.m
% program that provides practice with colon notation and indexing

% rewrite each of the next 4 statements using colon notation
nums1 = [10 9 8 7 6 5 4 3 2 1]
nums2 = nums1([2 4 6 8 10 7 4 1])
nums1([3 4 5 6]) = [9 6 3 0]
nums3 = [1 2 3 1 2 3 1 2 3]

% replace the next 3 statements with a single assignment statement
% that uses colon notation
nums2(6) = 10
nums2(7) = 20
nums2(8) = 30

% for each of the following examples, use "end" in the colon
% notation, for example: nums8 = nums1(3:end)

% write a statement that assigns nums4 to a vector that contains
% the odd-indexed elements of nums1

% write a statement that assigns nums5 to a vector of the
% elements contained in the top half (higher indices) of nums2

% write a statement that assigns nums6 to a vector that contains
% every 3rd element of nums1, starting with index 2

% write a statement that places the value 0 in all of the
% evenly indexed locations of nums2

% write a statement that places the numbers 8 12 16 20 in the
% successive odd-indexed elements of nums2
If you write each code statement with no semi-colon at the end, so that the value generated is printed out during execution of the code, then your program should generate the following printout:
>> colonTest
nums1 =
    10     9     8     7     6     5     4     3     2     1
nums2 =
     9     7     5     3     1     4     7    10
nums1 =
    10     9     9     6     3     0     4     3     2     1
nums3 =
     1     2     3     1     2     3     1     2     3
nums2 =
     9     7     5     3     1    10    20    30
nums4 =
    10     9     3     4     2
nums5 =
     1    10    20    30
nums6 =
     9     3     3
nums2 =
     9     0     5     0     1     0    20     0
nums2 =
     8     0    12     0    16     0    20     0
Place comments at the top of your file with your name and
date. Your final submission should include a copy of your
final colonTest.m code file.
Problem 1: Smartphones
With her curiosity piqued by a couple of recent surveys on trends in the smartphone market and the use of smartphones by healthcare professionals, Wendy Wellesley decided to collect some data on smartphone preferences and uses among Wellesley students. Wendy conducted a survey of smartphone owners with the following questions:
The file smartPhones.m in your assign2_programs folder contains
Wendy's survey data. The file creates two vectors, currentPhones and
newPhones, that contain the integers 1-4 indicating the smartphone brand
owned and desired by each of the 150 students who completed the survey. The file also creates
six vectors that each contain the number of minutes per day spent on each smartphone activity,
for each survey participant.
Add code to the smartPhones.m code file to perform the following tasks:
To complete these tasks, consider the following background, tips and guidelines:
The bar function can be used to create a bar graph, as shown in
the example below that displays the party affiliations of Massachusetts voters. This
example prints strings on the X axis using the XTickLabel property.
The three strings are stored in a cell array, designated by the surrounding
curly braces {}, similar to the names of the Red Sox players in Exercise 2.
The final statement sets the Position property for the current figure
window, which allows the user to specify the location and dimensions
of the figure window. The four numbers in the vector [100 100 500 250] specify, in order,
the distance from the left side of the figure window to the left side of the computer screen,
the distance from the bottom of the figure window to the bottom of the screen, the width
and the height of the window.
>> voterParty = [37.1 11.4 51.2];
>> bar(voterParty)
>> set(gca, 'XTickLabel', {'Democrats' 'Republicans' 'Independents'})
>> ylabel('% of registered voters')
>> title('Party Affiliation of Massachusetts Voters')
>> set(gcf, 'Position', [100 100 500 250])

In your code, the zeros function can be used to create a vector to
store the percentages of iPhones, Androids, Blackberries and other smartphones, and
the percentages of each brand can then be calculated and stored in each location of
the vector.
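The idea of filling a vector of percentages can be sketched on a small hypothetical data set. The vector answers and its codes 1-3 are made up for illustration and are not Wendy's survey data.

```matlab
% Hypothetical responses coded 1-3, not Wendy's survey data
answers = [1 2 2 3 1 1 2 1];

% one slot per code, initially all zeros
percents = zeros(1, 3);

% the sum of a logical vector counts the matching responses
percents(1) = 100 * sum(answers == 1) / length(answers);
percents(2) = 100 * sum(answers == 2) / length(answers);
percents(3) = 100 * sum(answers == 3) / length(answers);
percents
```

The resulting vector can be passed directly to bar, in the same way as voterParty in the example above.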
Use subplot to
display the three bar graphs in one figure window in a 3 x 1 configuration.
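A 3 x 1 arrangement might look like the following sketch. The three data vectors and the titles are hypothetical. The 'Visible', 'off' setting is included only so that the sketch can run without opening a window, and it can be omitted in practice.

```matlab
% Hypothetical data for three separate bar graphs
before = [40 35 25];
after = [30 45 25];
change = after - before;

fig = figure('Visible', 'off');    % hidden window; omit 'Visible' to see it

subplot(3, 1, 1)                   % 3 rows, 1 column, top area
bar(before)
title('Before')

subplot(3, 1, 2)                   % middle area
bar(after)
title('After')

subplot(3, 1, 3)                   % bottom area
bar(change)
title('Change')
```

Each call to subplot selects the plotting area that the next bar and title commands apply to.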
For each of the three main phone brands, print a message something like this:
iPhones would decrease from 46% to 34%
Remember that numbers need to be converted to strings when using disp:
>> slugs = 32.15;
>> disp(['there are ' num2str(slugs) ' slugs in a pound']);
there are 32.15 slugs in a pound
Your final submission should include a copy of your final smartPhones.m
code file.
Unreliable measurement instruments or unpredictable environments can sometimes yield data that is clearly erroneous. To obtain a reliable assessment of simple properties like the mean value of the data, it may be desirable to remove data samples that are clearly outside the expected range. Such samples are sometimes referred to as outliers. An advantage to analyzing data in MATLAB, with its general programming language, is that we can easily write a program to preprocess the data in a customized way. In this problem, you will complete a program that removes outlying data samples, using the mean and standard deviation of the data.
Imagine that you collected sonar data on the depth of the ocean floor over a large
region that is essentially flat. Due to instrument problems and the occasional large marine
animal, some measurements are clearly invalid. For simplicity, assume that all of the
erroneous measurements are underestimates of the true depth of the ocean floor.
The file analyzeData.m in your assign2_programs folder uses the
load command to load 1000 depth measurements from a file named depthData.mat
into a vector named depthData, and creates a plot of this initial data:

Most of the data is at a depth of around 10,000 feet. The erroneous data samples appear as downward spikes in the data, at depths that are significantly less than 10,000 feet. One principled way to go about removing outlying data is to remove samples whose value is far from the mean value, using the standard deviation to determine the range of values to remove. The standard deviation captures how spread out the data values are, and is given by the following formula:

N is the number of samples in the data, v_i is the ith data sample, and the mean value of the data is subtracted from each sample. If the distribution of the data follows a bell-shaped curve (which is not really the case here), almost all of the data should lie within three standard deviations of the mean value.
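For reference, a standard form of the standard deviation is sketched below. The symbol \bar{v} for the mean and the N-1 normalization (the default used by MATLAB's std function) are assumptions. A formula that divides by N instead gives a slightly smaller value.

```latex
\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left( v_i - \bar{v} \right)^2},
\qquad
\bar{v} = \frac{1}{N} \sum_{i=1}^{N} v_i
```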
For the ocean floor depth data, we could just remove all samples that are more than three standard deviations away from the mean depth value. A problem with this strategy is that an initial calculation of the mean and standard deviation of all of the data will be biased by the presence of the outlying data samples. Thus, we will instead use a more conservative approach that removes data in two stages, as described in the following steps, which also print, display and save the data:
newData.mat
Expand the analyzeData.m code file to perform the above steps.
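A minimal sketch of the underlying technique, on a small hypothetical vector rather than depthData, is to compute the mean and standard deviation by hand and then use logical indexing to keep only the samples above a cutoff. The N-1 normalization (matching the default of std) and the cutoff of two standard deviations are assumptions made for this illustration, not the values required by the steps above.

```matlab
% Hypothetical samples with one obvious low outlier
samples = [10 11 9 10 2 10 11];

n = length(samples);
avg = mean(samples);

% standard deviation computed by hand (N-1 normalization assumed)
sd = sqrt(sum((samples - avg).^2) / (n - 1));

% keep only the samples that are not far below the mean
cutoff = avg - 2 * sd;
samples = samples(samples > cutoff)

% the number of samples has changed, so recompute the statistics
n = length(samples);
avg = mean(samples)
```

Re-using the same variable names after the data is modified keeps the mean and sample count consistent with the current contents of the vector.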
When completing this program, keep in mind the following background, tips and guidelines:
The mean function can be used to calculate the mean value of the data,
but you must compute the standard deviation using the above formula. It is OK
to use the built-in std function to check your calculation - you should obtain
similar results. Keep in mind that the number of samples in the data changes after each
modification.
Use two variables to store the mean and standard deviation of the initial depthData vector, and re-use these two variable names
to store the new mean and standard deviation in steps 3 and 4b. When modifying the
data in steps 2 and 4a, re-use the same variable name, depthData.
disp.
The user's reply will be a string such as yes or no. Two strings can be
compared with the strcmp function that returns true (logical value 1) if the two
strings are equal:
name = input('Enter your name: ', 's');
if strcmp(name, 'Sohie')
    disp('I know you! You''re our Lab Instructor!')
else
    disp('I don''t know you')
end
MATLAB uses .mat
files to store and retrieve variables using save and load.
Multiple plots can be placed in one figure window using the subplot function. The
analyzeData.m code file contains a call to subplot that specifies that
the figure window should have one row of three plotting areas, and that the first plot should
appear in the leftmost area. Use subplot to place the plot of the intermediate data in
the center plotting area and the final data (if further analysis is requested by the user) in the
rightmost area. You
can expand the window in the horizontal direction by dragging the borders of the window horizontally.
Note that the range of values plotted on the axes will differ for the three plots.
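The save and load functions mentioned in the tips above can be sketched as follows. The variable readings and the temporary file name are hypothetical. In analyzeData.m the file names are the ones given in the assignment.

```matlab
% Hypothetical variable and file name, written to a temporary location
readings = [3.5 4.0 4.5];
fileName = [tempname '.mat'];

save(fileName, 'readings');    % store the variable readings in the file
clear readings                 % remove it from the workspace

load(fileName);                % recreates the variable readings
readings
delete(fileName);              % remove the temporary file
```

After load, the variable reappears in the workspace under the same name it had when it was saved.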
Your final submission should include a copy of your final analyzeData.m code
file.