Just a picture
  The picture above is taken from the cover of the recent CACM issue on Web Science. Web Science is a new interdisciplinary field that draws on Computer Science and the Social Sciences and studies the Web as an evolving entity with its own rules.
 
 


Syllabus

We will cover the following subjects:
  • Review of basic internet technologies: HTML, PHP, Java HttpURLConnection
  • Introduction to Information Retrieval (text).
  • Inverted indices and boolean queries.
  • Query optimization.
  • Unstructured vs semi-structured text.
  • Text encoding: tokenization, stemming, lemmatization, stop words, phrases.
  • The vector space retrieval model.
  • tf.idf weighting. Scoring documents. The cosine measure.
  • Introduction to data clustering.
  • Partitioning methods: k-means clustering. Hierarchical clustering.
  • Introduction to text classification. Naive Bayes models. Email spam filtering.
  • The structure of the Web graph.
  • Zipf's and Pareto's Laws.
  • Web search overview, web structure, the user, paid placement, search engine optimization/spam
  • Web Crawling and web indexes
  • Link analysis; PageRank and HITS ranking methods
  • Recognizing web spam with statistical and graph-theoretic methods
  • The Social Web: Social networks, Blogs, Trust
  • Discovery of Web communities
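
To give a flavor of the vector space model, tf.idf weighting, and the cosine measure listed above, here is a minimal sketch in Python. It is only an illustration under simple assumptions: whitespace tokenization, raw term frequency, and idf computed as log(N/df). The function names (tokenize, tfidf_vectors, cosine) and the three sample documents are made up for this example and are not course material.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and split on whitespace (no stemming or stop words).
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one sparse tf.idf vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))          # document frequency: count each term once per doc
    idf = {t: math.log(n / df[t]) for t in df}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)            # raw term frequency
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors, idf

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy collection.
docs = [
    "web search engines rank pages",
    "pagerank ranks web pages by links",
    "naive bayes filters spam email",
]
vectors, idf = tfidf_vectors(docs)

# Score each document against the query "web pages".
query = {t: idf.get(t, 0.0) for t in tokenize("web pages")}
scores = [cosine(query, v) for v in vectors]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy collection the third document shares no terms with the query, so its score is zero, while the first document ranks above the second because its vector contains fewer non-query terms.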