Information Retrieval
LING 467

Spring 2008

(Always) Under construction


Instructor:    Dr. George V. Wilson
Email:         wilsong@georgetown.edu

The course is a survey of principles and techniques in information retrieval with a focus on text databases, including automatic indexing, search techniques, query mechanisms (Boolean queries, topic hierarchies, natural language queries), relevance feedback, and evaluation methodology. Students will examine the performance of selected commercial and web-based systems.

Class materials

  • Course text: "Modern Information Retrieval" by Baeza-Yates and Ribeiro-Neto.
  • Book web site in Brazil or Chile
  • Description of Class Project
  • Schedule
  • Assignments
  • Data for use in classroom presentations

  • Perl Sites

  • Perl Tutorial
  • Perl Language Home Page
  • CPAN - Comprehensive Perl Archive Network
  • ActiveWare - You can download Perl for Windows from here!
  • Perl Resources and Reviews
  • PerlMonks - tutorials and other useful information
  • CGI Form Handling in Perl
  • ConText - Freeware editor (recommended; not really Perl)

  • Search Engines on the Net

  • Google
  • Yahoo!
  • MSN
  • Beaucoup MetaSearch Engine (10 at once)
  • SE Spying - What do people search for?

  • Information Retrieval Sites

  • The IR Group at the University of Glasgow, including
    C.J. van Rijsbergen's book "Information Retrieval"
  • SIGIR Information Server
  • MINDS IR Research Agenda

  • Question Answering

  • TREC Q&A Web Site
  • TREC Questions: TREC8, TREC9, TREC10
  • START QA System
  • Ask Jeeves - Question answering system
  • Brain Boost - Question answering system

  • Corpus Linguistics Resources

  • Usage Demo from GVW. NOTE: If you access the demo from off campus, you will need a username and password to reach all datasets.
  • LDC - The Linguistic Data Consortium - Georgetown is now a member.
  • Project Gutenberg - Source for many electronic texts
  • British National Corpus
  • Penn Treebank Project
  • WordNet - Semantic Network