pysearchlite

Lightweight Text Search Engine written in Python

Usage

Prepare a JSON file in which each line is a JSON object with "id" and "text" fields.
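As a rough illustration of that format, the following sketch writes a tiny corpus by hand. It assumes each line is a standalone JSON object with string "id" and "text" values; the helper name write_corpus and the file name sample_corpus.json are hypothetical and not part of pysearchlite.

```python
import json
import os
import tempfile


def write_corpus(path, docs):
    """Write one JSON object per line, each with "id" and "text" keys.

    Assumption: this line-per-document layout matches what the README
    describes; the exact value types are not specified there.
    """
    with open(path, "w", encoding="utf-8") as f:
        for doc_id, text in docs:
            f.write(json.dumps({"id": doc_id, "text": text}) + "\n")


if __name__ == "__main__":
    # Hypothetical sample documents written to a temporary directory.
    out_path = os.path.join(tempfile.mkdtemp(), "sample_corpus.json")
    write_corpus(out_path, [
        ("doc1", "hello world"),
        ("doc2", "lightweight text search in python"),
    ])
    print(out_path)
```

A file produced this way should be usable in place of corpus100.json for quick experiments, provided the sample script expects the same two fields.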

For example, stn/search-benchmark-game builds such a corpus, and you can use it.

$ git clone https://github.com/stn/search-benchmark-game.git
$ cd search-benchmark-game
$ make corpus

This produces a corpus file, corpus.json, which is about 8GB. The corpus has more than 5 million documents, which is too large for development, so we extract only the first 100 lines.

$ head -n 100 corpus.json > corpus100.json
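If head is not available, or if you want to check the format while extracting, the same step can be sketched in Python. This is a minimal sketch that assumes each line is a JSON object with "id" and "text" keys; the function name head_corpus is hypothetical and not part of pysearchlite.

```python
import itertools
import json


def head_corpus(src, dst, n=100):
    """Copy the first n lines of src to dst and return how many were copied.

    Each line is parsed as JSON and must contain "id" and "text" keys
    (an assumption about the corpus format); otherwise ValueError is raised.
    """
    count = 0
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        # islice stops after n lines, so the 8GB file is never read in full.
        for line in itertools.islice(fin, n):
            doc = json.loads(line)
            if "id" not in doc or "text" not in doc:
                raise ValueError(f"line {count + 1} lacks 'id' or 'text'")
            fout.write(line)
            count += 1
    return count
```

For example, head_corpus("corpus.json", "corpus100.json", 100) would write the same lines as the head command above, assuming every one of them passes the check.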

How to run

To run the sample script:

$ python -m pysearchlite.commands.main < corpus100.json

To run search-benchmark-game:

# Go to the search-benchmark-game directory.
# It is assumed to sit next to this repository.
$ cd ../search-benchmark-game
$ make index
$ make bench
$ make serve