Last update: Fri Jan 16 18:50:12 2009

Overview

Description

The Web has become the largest data repository in the world. This course introduces basic and advanced techniques for (1) Web information retrieval (IR): how to search large-scale Web data, and (2) Web mining: how to discover knowledge from the diverse data resources on the Web.

The lectures will cover (1) Web IR, including the fundamentals of modern IR systems, crawling, ranking algorithms, Web page classification and clustering, Chinese IR, multimedia IR, and case studies of search engines, and (2) Web mining, including Web content/text mining, Web structure mining, Web query log mining, information extraction, and taxonomy generation.

Students in this course are expected to read research papers on a topic relevant to Web IR or Web mining, complete a project, and then present their work in class.

Prerequisites

Students should have a background in data structures, algorithms, and programming (programming experience is necessary for the homework and project).

Readings

Grading (tentative)

  • Assignments (60%)
  • Midterm Exam (20%)
  • Term Project (20%)