CS 234 - Fall 2017

Assignments

Week Nr. Assignment Submission [Recommended]
Final Project Submit the final project. See instructions in the Final Project page. 12/21/2017
Week 15 The notebooks can be downloaded from the Schedule (Week 14).
  1. Go over the notebook Text Classification (download the ZIP file from the schedule) on your own. It contains two tasks for you to work on (one is optional). Blog about what you learned from the activity in your digital/blog.html page.
  2. Go over the notebook Review Hypothesis Testing to see an example of a very simple final project notebook. Blog about any similarities between this project and yours.
Upload your IPYNB files for both projects in dav/drop/week14 and link to the HTML version from your blog entries.
12/21/2017
Week 14 The tasks below were explained in detail in this document that was sent via email and is linked from the schedule. You can download the notebooks from the links in the schedule.
  1. Complete your Project 2 and update your blog with the appropriate links.
  2. Complete the notebook on linear regression and upload it to week13.
  3. Complete the notebook on clustering and upload it to week13.
  4. Write a blog post about your understanding of the OLS models from the digital natives paper. This goes in the blog for Project 3 (digital/blog.html).
  5. Write a blog post that explains your plan for your final project. [Also in your digital/blog.html]
  6. Create a notebook to learn how to build two seaborn plots with your browser history data. Upload it to week14 (when ready). Blog about this in your digital/blog.html to show how you're learning new skills for the project.
12/05/2017
12/08/2017
Week 13
  1. Draft of your Project 2 is due before class on Friday, 12/01/2017. Share link via email.
  2. Your blog page for the project should contain all tasks that you have accomplished so far and what you have learned doing them. What is due for the project is explained in these Project 2 notes.
12/01/2017
Week 12 Nothing due this week, it's Thanksgiving week. Enjoy your break!
Week 11
  1. Upload the labeling.xlsx file (with labels for Wellesley College searches) you created in class to your dav/drop/week11 folder.
  2. Upload the CSV file that contains your 20-minute browser history (about visualization of text data) in dav/drop/week11 folder. This is the Python script to extract the history. Some of you had an issue with accessing the browser history and contacted me about that. If you haven't, please do so.
  3. Complete the notebook Classification with NLTK, blog about it, upload the IPYNB file in dav/drop/week11, link to the HTML version from the blog. One additional task for the notebook is to try out the NLTK classifier for decision trees: nltk.classify.DecisionTreeClassifier too.
  4. Read the Probability Review material and write your questions/comments in your Google blog page.
  5. From the folder Google searches download the ZIP file of Wellesley College related searches. Brainstorm in your blog about what features can be created from a search page to support the labels "informational" or "navigational" for the query.
  6. Read the Project 2 page. Start working on the questions/get data/exploration parts which you already know how to do. The notebooks/data you'll create for this project will be stored in dav/drop/project02.
11/21/2017
Week 10 The tasks below were explained in detail in this document that was shared in class on Nov 3. You can download the notebooks from the links in the schedule.
  1. Start a blog page for the Google searches project.
  2. Read the paper "A taxonomy of web search" and the web article that revisits the taxonomy and blog about what you learned.
  3. Inspect your Chrome history (chrome://history/) and blog about the kind of Google searches you do or anything else you can observe.
  4. Start a blog page for the Digital Natives project. Skim the reading for the Digital Natives project, write a paragraph in the blog about your project plans, also share it with Eni. (see instructions in the link above)
  5. Complete the notebook SQL and Chrome History, blog about what you learned in your Google blog, upload the IPYNB file in dav/drop/week10, link the HTML version from your blog.
  6. Complete the notebook Selenium, Wellesley Searches, blog about what you learned, upload the IPYNB file in dav/drop/week10, link its HTML version from your blog.
11/14/2017
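
For the SQL and Chrome History task in Week 10, the kind of query involved can be sketched roughly as follows. This is not the course notebook. It assumes that Chrome keeps its history in an SQLite file with a `urls` table (columns `url`, `title`, `visit_count`, `last_visit_time`) and that timestamps count microseconds since 1601-01-01 UTC. The function names and the demo rows are hypothetical, and the demo uses an in-memory database so it runs without touching a real history file.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: Chrome timestamps are microseconds since 1601-01-01 (UTC).
CHROME_EPOCH = datetime(1601, 1, 1)


def chrome_time_to_datetime(microseconds):
    """Convert a Chrome-style timestamp to a naive UTC datetime."""
    return CHROME_EPOCH + timedelta(microseconds=microseconds)


def datetime_to_chrome_time(moment):
    """Inverse conversion, used here only to build the demo rows."""
    return (moment - CHROME_EPOCH) // timedelta(microseconds=1)


def most_visited(conn, limit=5):
    """Return (url, title, visit_count, last_visit) for the most visited URLs."""
    rows = conn.execute(
        "SELECT url, title, visit_count, last_visit_time "
        "FROM urls ORDER BY visit_count DESC, url ASC LIMIT ?",
        (limit,),
    ).fetchall()
    return [
        (url, title, count, chrome_time_to_datetime(stamp))
        for url, title, count, stamp in rows
    ]


def make_demo_db():
    """Build an in-memory table shaped like the assumed `urls` table (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT, title TEXT, "
        "visit_count INTEGER, last_visit_time INTEGER)"
    )
    demo_rows = [
        ("https://www.wellesley.edu/", "Wellesley College", 12,
         datetime_to_chrome_time(datetime(2017, 11, 10, 9, 30))),
        ("https://www.google.com/search?q=wellesley+library",
         "wellesley library - Google Search", 3,
         datetime_to_chrome_time(datetime(2017, 11, 9, 14, 0))),
        ("https://en.wikipedia.org/wiki/Web_search_query",
         "Web search query - Wikipedia", 7,
         datetime_to_chrome_time(datetime(2017, 11, 8, 20, 15))),
    ]
    conn.executemany(
        "INSERT INTO urls (url, title, visit_count, last_visit_time) "
        "VALUES (?, ?, ?, ?)",
        demo_rows,
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    for row in most_visited(make_demo_db(), limit=2):
        print(row)
```

With a real history file, you would pass `sqlite3.connect(path_to_a_copy)` instead of the demo connection. Working on a copy is a reasonable precaution, since the browser may keep the original file locked while it is running.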
Week 9 This week we discussed in class your Wikipedia reports. Keep the first version of your report unchanged, because you got feedback on that version. If you decide to improve the report based on the feedback, create a second copy of the page and link to it separately, naming it as Version 2.
  1. Complete the two notebooks Wordcloud generation and Parsing edits in Wikipedia, blog about what you learned, upload the notebooks in the dav/drop/week09 folder, link to the HTML versions from the blog page. Play with the parameters of the wordcloud to generate something of your own design / liking.
  2. Upload all the notebooks that YOU created for the Wikipedia project on dav/drop/project01.
  3. Ensure that your team has a webpage that describes the results of your Wikipedia project and link to it from your blog and your portfolio page (with the clarification that it's Version 1 of your report).
11/07/2017
Week 8 The items below were originally described in the Friday section of the week 8 document from the schedule.
  1. Complete the notebook Timeseries with pandas. Upload the IPYNB file in dav/drop/week08, blog about what you learned in the Wikipedia blog page, link to its HTML version from the blog.
  2. Complete the notebook Hypothesis Testing with Wikipedia data. Upload the IPYNB file in dav/drop/week08, blog about what you learned in the blog page, link to its HTML version from the blog.
  3. Update the Google doc with the midway status of your Wikipedia project.
  4. Create a new notebook of your own to explore the Wikipedia data, especially using what you learned about timeseries. Blog about your progress. Upload this notebook in dav/drop/project01, link to it from your blog page.
10/31/2017
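
As a companion to the Hypothesis Testing task in Week 8, here is a minimal sketch of one common approach, a two-sided permutation test on the difference of two group means. It is an illustration only: the notebook may use a different test, and the function name, the parameters, and the sample numbers (imagined daily edit counts for two articles) are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    The group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled_diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


if __name__ == "__main__":
    # Hypothetical data: edits per day for two articles.
    article_a = [1, 2, 3, 4, 5]
    article_b = [101, 102, 103, 104, 105]
    print(permutation_test(article_a, article_b))
```

A small p-value suggests that the observed difference would be unusual if the group labels did not matter. With only a handful of observations per group, the estimate is coarse and should be read with caution.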
Week 7 More detailed descriptions for the items below are found in the week 7 PDF document as well as the reminder document for week 8.
  1. Start a blog page for the Wikipedia project in your website. Link to it from the portfolio.
  2. Work during the entire week on completing two notebooks (1. Time, Datetime, Dateutil and 2. Wikipedia Revisions) and blog about what you learned. Store the IPYNB files in dav/drop/week07, create the HTML versions and link to them from your blog page.
  3. Read at least one research paper about Wikipedia and blog about what useful things you learned.
  4. Create a new notebook of your own (or as a team) to explore the Wikipedia question you are focusing on. Store the IPYNB file in dav/drop/project01, as well as link to its HTML version from your blog page. If you have put this notebook in week07, please move it to project01.
  5. Complete the Statistical Significance notebook and put the IPYNB file in dav/drop/week07, blog about it and link to the HTML file from your blog.
  6. Blog about the article Science isn't broken in your Wikipedia blog page.
10/24/2017
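
For the Time, Datetime, Dateutil and Wikipedia Revisions notebooks in Week 7, a typical first step is turning revision timestamps into datetime objects and counting revisions per day. The sketch below assumes timestamps in the ISO 8601 form `2017-10-24T15:04:05Z`; the function names and the sample timestamps are made up for illustration and do not come from the notebooks.

```python
from collections import Counter
from datetime import datetime, timezone


def parse_revision_timestamp(text):
    """Parse a timestamp such as '2017-10-24T15:04:05Z' into an aware UTC datetime."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def revisions_per_day(timestamps):
    """Count how many revisions fall on each calendar day (UTC)."""
    return Counter(parse_revision_timestamp(t).date() for t in timestamps)


if __name__ == "__main__":
    # Hypothetical revision timestamps.
    sample = [
        "2017-10-23T08:00:00Z",
        "2017-10-24T15:04:05Z",
        "2017-10-24T23:59:59Z",
    ]
    for day, count in sorted(revisions_per_day(sample).items()):
        print(day, count)
```

Keeping the datetimes timezone-aware avoids mixing UTC and local times later, for example when comparing revisions against events reported in local time.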
Week 6
  1. Learn Dash: Part 3 [PDF + code + presentation]
10/13/2017
Week 5 For Task 1, upload the notebook as IPYNB and HTML in Week05 folder;
for Task 3, upload the PDF of your notes about the paper.
  1. Task 1: Notebook about Python Quiz
  2. Task 3: Notes on "Wikipedians are born"
10/06/2017
Week 4 For Task 1, upload the apps you wrote in Dash as well as the PDF
of your notes of what you learned during the tutorials;
For Task 2, upload the notebook for data cleaning (HTML and IPYNB) as
well as the cleaned CSV file you generated. All files go in week04 folder.
  1. Task 1: Learn Dash [Part 1+ Part 2]
  2. Task 2: Clean food data with Pandas
10/03/2017
Week 3 Your web page should be in the public_html folder on the CS server;
your formatted report on food data collection should be in your cs234/eating folder.

  1. Eating Habits: A summary [Task 2: Part 2]
  2. Web Presence: Personal page + course page
09/29/2017
Week 2 Share the food dataset as a Google spreadsheet via email;
upload your HTML page on eating habits as instructed in the instructions;
complete the Pandas notebook and upload the .ipynb and .html in the Week02 subfolder.

  1. Creating the food dataset from the photos. [DUE: 09/20/17 ]
  2. Eating Habits: A summary [Task 2: Part 1]
  3. Pandas Practice Notebook

To complete at your own pace, but no submission required:
a) Python review problems; b) Probability Problems with simulation.
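
To give a flavor of what solving a probability problem with simulation can look like, here is a minimal sketch. The practice problems themselves are not listed on this page, so the classic birthday problem is used as a stand-in; the function name and parameters are hypothetical, and the sketch assumes 365 equally likely birthdays.

```python
import random


def shared_birthday_probability(group_size, trials=20000, seed=0):
    """Estimate the chance that at least two people in a group share a birthday."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(365) for _ in range(group_size)]
        # A set drops duplicates, so a shorter set means a shared birthday.
        if len(set(birthdays)) < group_size:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(shared_birthday_probability(23))
```

The exact answer for 23 people is about 0.507, so the estimate should land close to one half. Increasing `trials` tightens the estimate at the cost of a longer run.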

09/22/2017
Week 1 For all three notebooks (once you complete them), upload the .ipynb and .html file
in the Week01 subfolder on the CS server.

  1. Create your own notebook (using Markdown) to solve the Python review exercise.
  2. Notebook: Data visualization with matplotlib
  3. Notebook: Getting started with Pandas
  4. Tidy data activity for food data: either typed text files (stored as PDFs) or scanned copies
    (or photos) of your handwritten notes. If they are not very legible, please type them.
    Upload the file(s) to the Week01 folder.
09/15/2017