CS 315 // Spring 2017

Data and Text Mining for the Web

[... exploring the Web without the browser ...]

Class material is shared through a Google Drive folder.

Access Google Drive

Course Description

In the past decade, we have experienced the rise of socio-technical systems used by millions of people: Google, Facebook, Twitter, Wikipedia, etc. Such systems are on the one hand computational systems, using sophisticated infrastructure and algorithms to organize huge amounts of data and text, and on the other hand social systems, because they cannot succeed without human participation.

How are such systems built? What algorithms underlie them? How does human behavior influence their operation, and vice versa? In this class, we will explore these questions by: a) reading current research papers on the inner workings of such systems; b) implementing algorithms for tasks such as web crawling, web search, random walks, learning to rank, text classification, and topic modeling; and c) thinking critically, through a humanistic lens, about the unexamined embrace of techno-solutionism.

Course Info

Instructor: Eni Mustafaraj
Contact: eni.mustafaraj
Office: SCI E112

Teaching Assistant: Kate Kenneally
Contact: kkenneal


Lecture: Tuesday/Friday 1:30-2:40 in SCI 256
Prerequisites: CS 230 or permission of the instructor
Distributions: Mathematical Modeling