Ling467 - Information Retrieval

Data for use in Classroom Presentations

  • XinHua - sample of simplified Chinese
  • Data to study word frequency distributions
  • Descriptions of data sets from Usage demo
  • Articles with similarity ranked sentences
  • Examples of User Queries
  • Examples showing use of a measure related to Mutual Information to find conceptually related words
  • 1990 Census Data - Last Names, Female First Names, Male First Names
  • Scan for named entities with the (lower case) word "of" embedded
  • Dogless - Mentions of breeds in articles that do not contain the word "dog"
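
The scan for named entities with an embedded "of" can be approximated with a regular expression. This is only a minimal sketch, not the method used to build the course data: it assumes a named entity is a run of capitalized words, the lowercase word "of", and another run of capitalized words. The names `CAP`, `OF_ENTITY`, and `find_of_entities` and the sample sentence are hypothetical, introduced for illustration.

```python
import re

# Assumption: a "capitalized word" is an uppercase ASCII letter followed by letters.
CAP = r"[A-Z][A-Za-z]*"

# One or more capitalized words, the lowercase word "of",
# then one or more capitalized words (e.g. "Bank of America").
OF_ENTITY = re.compile(rf"\b{CAP}(?: {CAP})* of {CAP}(?: {CAP})*\b")


def find_of_entities(text):
    """Return every substring of text that matches the pattern above."""
    return OF_ENTITY.findall(text)


# Hypothetical sample sentence, not taken from the course data.
sample = "She left the Bank of America branch and joined the University of Washington."
print(find_of_entities(sample))
# prints ['Bank of America', 'University of Washington']
```

A pattern this simple also picks up a capitalized sentence-initial word directly before the entity (for example "The" in "The Bank of America") and misses entities with digits, hyphens, or non-ASCII letters, so its output would need manual review.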