Items in the future are tentative and may change during the term.
Slides of the lectures are available. Most of the slides have been adapted or modified from the work of a large number of scholars in the area, including M. Levene, C. Manning, P. Raghavan, H. Schütze, A. Broder, B. Davison, F. Turbak and others.
1. Thu Sep 9 - Course Logistics and Overview
Reading: [H90] Efficient Reading of Papers in Science and Technology
[B45] Vannevar Bush: As We May Think
2. Mon Sep 13 - History of Hypertext and the Web
Quiz 1 due
Reading:
TBL1-enquire within;
TBL2-tangles,bits,webs;
TBL2-Appendix;
3. Thu Sep 16 - Network protocols
Reading:
TBL3-info.cern.ch;
TBL4-protocols;
HTTP Network Protocols Slides
[w3c] The original HTTP
4. Mon Sep 20 - Intro to Information Retrieval, Boolean retrieval
Quiz 2 due
Reading:
MRS1-Boolean Retrieval
5. Thu Sep 23 - The structure of the Web
Reading:
EK13 The structure of the Web
MRS19.5 Index size and estimation
[B02] A taxonomy of web search
6. Mon Sep 27 - Text Encoding
Homework 1 due
Reading:
MRS2-Term Vocabulary & Postings
7. Thu Sep 30 - Web & search engine basics
Reading:
MRS19.1-19.5-Web search basics
8. Mon Oct 4 - Evaluation; Precision and Recall
Quiz 3 due
Reading:
MRS8 (not 8.4, 8.5)-Evaluation in IR
[GM02]-Of course it's true, I saw it on the internet
9. Thu Oct 7 - Multimedia Search
Homework 2 due
Reading:
[G+07] Recent Progress in the MIT Spoken Lecture Processing Project
[vA03] Labeling images with a computer game
[vA05] Peekaboom: A game for locating objects in images
Mon Oct 11 - Fall Break; NO CLASS!
10. Tue Oct 12 (Monday Schedule) - The Vector Space Model
Quiz 4 due
Reading:
MRS6.2 and 6.3-Scoring, Term Weighting and the Vector Space Model
You must come to Ben Shneiderman's talk on Wed Oct 13 at 4 PM
11. Thu Oct 14 - Finish Vector Space; Web crawling
Reading:
MRS20.1 - 20.2-Web Crawling and Indexes
12. Mon Oct 18 - Link Analysis: PageRank
Reading:
EK14 Link Analysis and Web Search
MRS21.1-21.2-Link Analysis; PageRank
13. Thu Oct 21 - Link Analysis: HITS; Web Spam techniques
Quiz 5 due
Reading:
MRS21.3-Hubs and Authorities
[GG05] Spam: It's not just for inboxes anymore
[M09] On the evolution of search engine rankings
14. Mon Oct 25 - Discovering Spam
Reading:
[M09b] Using propagation of distrust to find untrustworthy web neighborhoods
15. Thu Oct 28 - "If you can collect more data, what data should you collect?" Guest Lecture on Active Learning by Dr. Rachel Lomasky '01
Homework 3 due
Reading:
TBA
REMINDER: Visit our Web Search research papers repository (or do your own search) to select papers for your final paper
16. Mon Nov 1 - Paid Inclusion; Web analytics
Reading:
EK15-Sponsored Search Markets
DEADLINE: Email the title of your final paper and the relevant papers (see our Web Search research papers repository for ideas)
17. Thu Nov 4 - Power Laws with an application on Discovering Spam
Reading:
EK18-Power Laws and Rich-get-richer phenomena
[FMN04] Spam, Damn Spam and Statistics
18. Mon Nov 8 - Twitter bomb and Review for the midterm
Quiz 6 due
Reading:
[MM10] From Obscurity to Prominence in Minutes: Political Speech and Real-Time Search
19. Thu Nov 11 - In-class MIDTERM
20. Mon Nov 15 - Social Networks
Reading: EK3 Strong and weak ties
[K08] The Convergence of Social And Technological Networks
21. Thu Nov 18 - The Semantic Web vs NELL (Never Ending Language Learning). Guest speaker: Eni Mustafaraj
Reading:
TBL12-Mind to Mind
[B-L01] The Semantic Web
22. Mon Nov 22 - Uncovering the Internet's Dark Side.
Guest speaker: Tyler Moore
Homework 4 due
Reading:
[MCA09] The economics of online crime
[MC09] Evil Searching: Compromise and Recompromise of Internet Hosts for Phishing
[ME10] Measuring the Perpetrators and Funders of Typosquatting (optional)
Your paper should be formatted in the ACM Publication Format.
23. Mon Nov 29 - Preparation for presentations
Reading: N/A
24. Thu Dec 2 - Three Student Presentations (TBA)
Reading: Posted in the class conference
25. Mon Dec 6 - Three Student Presentations (TBA)
Reading: Posted in the class conference
26. Thu Dec 9 - Three Student Presentations (TBA)
Reading: Posted in the class conference
Monday Dec 13 - Final Research Paper Due
Extra - Text Classification; Naive Bayes models; Email spam filters
Reading:
MRS 13.1 - 13.2.1 Text Classification
Extra - Clustering; K-nearest neighbors
Reading:
MRS 16.1-16.4, 17.1 - 17.3, 17.7 Clustering k-means, hierarchical
[K08] The Convergence of Social And Technological Networks
Extra - Social Networks 1 - Basics
Reading:
EK1 Overview (optional), EK2 Graphs
TBL10 Web Of People
Extra - Social Networks 2 - Peer-to-Peer Networks, Collaborative filtering
Reading: EK3 Strong and weak ties
[K08] The Convergence of Social And Technological Networks