This repository contains the code for a basic search engine as part of our Algorithms for Information Retrieval Assignment.
- A postings list and a bigram postings list are created from the Snippets (run
- These are used to narrow down candidate documents for word queries, phrase queries, and wildcard queries
- Vector space models of the candidates are built with TF-IDF scores
- Candidate documents are ranked based on relevance and displayed
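
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`build_index`, `build_bigram_index`, `expand_wildcard`, `rank`), the whitespace tokenizer, the `$` boundary marker, the smoothed IDF formula, and the toy documents are all assumptions made for the example. Phrase queries are left out, since they would need positional information.

```python
import fnmatch
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; the real tokenizer may differ.
    return text.lower().split()


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return postings


def build_bigram_index(vocabulary):
    """Map each character bigram (with '$' as a word boundary) to its terms."""
    index = defaultdict(set)
    for term in vocabulary:
        padded = f"${term}$"
        for i in range(len(padded) - 1):
            index[padded[i:i + 2]].add(term)
    return index


def expand_wildcard(pattern, bigram_index):
    """Return vocabulary terms matching a '*' wildcard pattern."""
    grams = set()
    for piece in f"${pattern}$".split("*"):
        grams |= {piece[i:i + 2] for i in range(len(piece) - 1)}
    term_sets = [bigram_index.get(g, set()) for g in grams]
    if term_sets:
        candidates = set.intersection(*term_sets)
    else:
        # A bare "*" yields no bigrams, so every term is a candidate.
        candidates = set().union(*bigram_index.values())
    # Bigram matches can be false positives, so post-filter on the pattern.
    return {t for t in candidates if fnmatch.fnmatchcase(t, pattern)}


def rank(query_terms, candidates, docs, postings):
    """Order candidate doc ids by cosine similarity of TF-IDF vectors."""
    n_docs = len(docs)

    def vector(tokens):
        counts = Counter(tokens)
        # Assumption: raw term frequency times a smoothed IDF.
        return {t: c * math.log(1 + n_docs / len(postings[t]))
                for t, c in counts.items() if t in postings}

    def norm(vec):
        return math.sqrt(sum(w * w for w in vec.values()))

    q_vec = vector(query_terms)
    scored = []
    for doc_id in candidates:
        d_vec = vector(tokenize(docs[doc_id]))
        dot = sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
        denom = norm(q_vec) * norm(d_vec)
        scored.append((dot / denom if denom else 0.0, doc_id))
    # Highest score first; ties broken by doc id for a stable order.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Toy data standing in for the snippet rows.
docs = {
    0: "wild cats hunt at night",
    1: "wildcard queries use bigrams",
    2: "cats sleep all day",
}
postings = build_index(docs)
bigram_index = build_bigram_index(postings)

wild_terms = expand_wildcard("wild*", bigram_index)
wild_docs = set().union(*(postings[t] for t in wild_terms))
ranked = rank(["cats", "night"], postings["cats"], docs, postings)
```

Here the wildcard query `wild*` is first expanded to vocabulary terms through the bigram index, and the union of their postings gives the candidate documents; only those candidates are then scored, which keeps the TF-IDF step cheap.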
- AIR-Dataset: Contains 418 CSV files, with approximately 95k rows.
- Program to create inverted index and postings list and save to file
- Program to load index from file and functions to search given a query
- Program to provide user interface for queries
- templates/index.html: HTML file for query page
- templates/results.html: HTML file for results page
- Run python3 to create the different indexes.
- Run python3 to launch the flask server (for user interface).
- On a web browser, open http://localhost:5000 to open the query page.
- Use the user interface to perform queries.