Information Retrieval Course Project

Instructor: Dr. A. Nikabadi

Course content: CS276, Stanford University

Project Overview

  1. Preprocessing the data (normalization, tokenization, stemming, stopword removal)

  2. Created a positional inverted index

  3. Used Zipf's law

  4. Used Heaps' law

  5. Searching with normal queries, phrase queries (using a permuterm index), and Boolean queries

  6. Ranking results

  7. Show words as vector representations

  8. Compute tf-idf

  9. Compute cosine similarity between query terms and documents

  10. Used index elimination techniques such as champion lists

  11. Rank results by relevance
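
Several of the steps above (positional inverted index, phrase queries, tf-idf, cosine similarity, ranking) can be illustrated with a minimal Python sketch. This is not the project's actual code: the toy corpus `DOCS`, the function names, the simplified tokenizer (no stemming or stopword removal), and the `(1 + log10 tf) * log10(N / df)` weighting are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus; the project's real dataset is not described here.
DOCS = {
    1: "information retrieval is fun",
    2: "retrieval of information from text",
    3: "cats chase mice",
}

def tokenize(text):
    """Lowercase and split on non-alphanumerics (no stemming or stopword removal)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_positional_index(docs):
    """Build a positional inverted index: term -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index

def phrase_query(index, phrase):
    """Return sorted doc ids in which the phrase terms occur at consecutive positions."""
    terms = tokenize(phrase)
    if not terms or any(t not in index for t in terms):
        return []
    # Only documents containing every term can match.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in sorted(candidates):
        starts = index[terms[0]][doc_id]
        if any(all(s + i in index[t][doc_id] for i, t in enumerate(terms)) for s in starts):
            hits.append(doc_id)
    return hits

def tfidf_vector(term_counts, index, n_docs):
    """Sparse tf-idf vector; terms absent from the index are skipped."""
    vec = {}
    for term, tf in term_counts.items():
        df = len(index.get(term, {}))
        if df:
            vec[term] = (1 + math.log10(tf)) * math.log10(n_docs / df)
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, docs, index):
    """Return (doc_id, score) pairs with a positive score, best match first."""
    n_docs = len(docs)
    q_vec = tfidf_vector(Counter(tokenize(query)), index, n_docs)
    scores = []
    for doc_id, text in docs.items():
        d_vec = tfidf_vector(Counter(tokenize(text)), index, n_docs)
        score = cosine(q_vec, d_vec)
        if score > 0:
            scores.append((doc_id, score))
    return sorted(scores, key=lambda pair: -pair[1])

index = build_positional_index(DOCS)
print(phrase_query(index, "information retrieval"))  # [1]
print([doc_id for doc_id, _ in rank("information retrieval", DOCS, index)])  # [1, 2]
```

Document 2 contains both query terms but not adjacently, so it is excluded from the phrase match yet still appears in the ranked list, below document 1. A champion list could be layered on top by keeping only the highest-weighted documents per term before scoring.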