Information Retrieval
LING 467
Description of Class Project

Each student will implement a basic document retrieval system.
The system must be written in Perl or C and must run on a PC or on gusun.
The system will consist of three main logical components:
       the indexer, the search engine and the user interface.

For this project the user interface need not be elaborate.
The goal is to develop the core functionality of a good system, so
some features that would make the system more convenient for the
user may be left out. Likewise, the time available will not allow
the development of an elaborate indexer or search engine. All
components should be developed as the first step in a phased
development: some functionality will be provided now, and more
can be added later.

The system components will relate to each other as in Figure 1.

User <---> UI <-----> Search <------ Database <----- Indexer
                      Engine         (indices)       /
                                        \           /
                                         \         /
                                          \       /
                                             Text

                 Figure 1. System components.


The indexer will take free text as input and create a database of
indices that can be used to perform rapid searches of the corpus.
The search engine will accept queries from the user and, using the
indices, determine which documents best match the user's query.
It can also provide access to the original text at the user's request.
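
As a rough illustration of how the indexer and the search engine
could fit together, here is a minimal sketch in C. It is not the
required design. The fixed-size tables, the names inverted_index,
index_document and best_match, and the scoring rule (sum of query
term counts per document) are all assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_TERMS 256   /* hypothetical limits chosen for this sketch */
#define MAX_DOCS  16
#define MAX_WORD  32

/* One index entry: a term and how often it occurs in each document. */
struct posting {
    char term[MAX_WORD];
    int  count[MAX_DOCS];
};

/* The "database of indices": a flat table of postings. */
struct inverted_index {
    struct posting entries[MAX_TERMS];
    int n_terms;
    int n_docs;
};

/* Copy the next lowercased alphanumeric word from *text into word.
   Returns 1 if a word was found, 0 at the end of the text. */
static int next_word(const char **text, char *word)
{
    const char *p = *text;
    size_t len = 0;

    while (*p && !isalnum((unsigned char)*p))
        p++;
    while (*p && isalnum((unsigned char)*p)) {
        if (len < MAX_WORD - 1)          /* overlong words are truncated */
            word[len++] = (char)tolower((unsigned char)*p);
        p++;
    }
    word[len] = '\0';
    *text = p;
    return len > 0;
}

/* Return the entry for term, or NULL if absent (linear scan). */
static struct posting *find_term(struct inverted_index *ix, const char *term)
{
    for (int i = 0; i < ix->n_terms; i++)
        if (strcmp(ix->entries[i].term, term) == 0)
            return &ix->entries[i];
    return NULL;
}

/* Indexer: add one document. Returns its id, or -1 if the table is full. */
int index_document(struct inverted_index *ix, const char *text)
{
    char word[MAX_WORD];
    int doc;

    assert(ix != NULL && text != NULL);
    if (ix->n_docs >= MAX_DOCS)
        return -1;
    doc = ix->n_docs++;
    while (next_word(&text, word)) {
        struct posting *p = find_term(ix, word);
        if (p == NULL) {
            if (ix->n_terms >= MAX_TERMS)
                continue;                /* term table full: skip new terms */
            p = &ix->entries[ix->n_terms++];
            memset(p, 0, sizeof *p);
            strcpy(p->term, word);
        }
        p->count[doc]++;
    }
    return doc;
}

/* Search engine: score each document by the summed counts of the query
   terms. Returns the best document id, or -1 if nothing matches. */
int best_match(struct inverted_index *ix, const char *query)
{
    char word[MAX_WORD];
    int score[MAX_DOCS] = {0};
    int best = -1;

    while (next_word(&query, word)) {
        struct posting *p = find_term(ix, word);
        if (p == NULL)
            continue;
        for (int d = 0; d < ix->n_docs; d++)
            score[d] += p->count[d];
    }
    for (int d = 0; d < ix->n_docs; d++)
        if (score[d] > 0 && (best < 0 || score[d] > score[best]))
            best = d;
    return best;
}
```

A caller would zero the index (for example with memset), call
index_document once per document, and then pass each user query to
best_match. A fuller system would likely keep the index on disk and
use a hash table or sorted term list instead of a linear scan.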

We will work out a more complete design for the system throughout
the term. Homework assignments will provide pieces that will lead
towards the completed class project.