An Information Retrieval project by "Amirali Toori" and "Mohammad Hosseini" for a university IR course.
It implements a simple search engine that ranks relevant docs by cosine similarity. The project uses the Cranfield
dataset, which is available in the repo as cran.all.1400.
All tf-idf and cosine similarity calculations in this project are done without any packages.
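As a rough illustration of what computing tf-idf and cosine similarity by hand can look like, here is a minimal sketch that uses only the Python standard library. It is not the code in main.py: the function names (`tokenize`, `weigh`, `build_tfidf`, `cosine`, `search`), the whitespace tokenizer, and the weighting scheme (raw term count times log(N/df)) are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; no stemming or stop words.
    return text.lower().split()


def weigh(tokens, idf):
    # Weight = raw term count * idf. Terms unseen in the collection are dropped.
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def build_tfidf(docs):
    """Return (vectors, idf) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [weigh(tokens, idf) for tokens in tokenized]
    return vectors, idf


def cosine(a, b):
    # a and b are sparse vectors stored as {term: weight} dicts.
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf, k=10):
    """Return up to k (doc_index, score) pairs, best match first."""
    q = weigh(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    # Sort by descending score; ties are broken by document index.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k] if score > 0]
```

With this sketch, `vectors, idf = build_tfidf(docs)` is computed once for the collection, and `search("some query", vectors, idf)` returns the indices of the top 10 docs. Storing vectors as dicts keeps only the non-zero weights, which matters because most terms do not occur in most docs.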
To use it, run main.py. A window pops up; select the cran.all.1400
dataset or any dataset you like. After a few minutes another window pops up with a query entry and a rerank option. Enter a query to see the top 10 relevant docs.
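For reference, cran.all.1400 stores each doc as a block that starts with `.I <id>`, followed by `.T` (title), `.A` (author), `.B` (bibliography) and `.W` (body) sections. The sketch below shows one way such a file could be split into docs before indexing. The function name `parse_cranfield` and the returned dict layout are hypothetical and not taken from main.py.

```python
def parse_cranfield(text):
    """Split Cranfield-style text into {doc_id: {field_name: content}}."""
    # Section markers mapped to the (hypothetical) field names used in this sketch.
    fields = {".T": "title", ".A": "author", ".B": "biblio", ".W": "body"}
    docs = {}
    doc_id = None
    field = None
    for line in text.splitlines():
        stripped = line.strip()
        if line.startswith(".I "):
            # A new doc starts; every field begins empty.
            doc_id = int(line.split()[1])
            docs[doc_id] = {name: "" for name in fields.values()}
            field = None
        elif stripped in fields and doc_id is not None:
            # Switch to the section named by the marker.
            field = fields[stripped]
        elif doc_id is not None and field is not None:
            # Content line: append to the current field, joined by single spaces.
            docs[doc_id][field] = (docs[doc_id][field] + " " + stripped).strip()
    return docs
```

A possible usage, assuming the file sits next to the script: read it with `open("cran.all.1400").read()`, pass the text to `parse_cranfield`, and index the `body` (optionally with the `title`) of each doc.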
Just a playground to write the logic code of the project.