stn/pysearchlite

pysearchlite

Lightweight Text Search Engine written in Python

Usage

Prepare a JSON file in which each line is a document with "id" and "text" fields.
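As a rough illustration of that format, the sketch below parses a couple of lines, assuming each line is a standalone JSON object. The helper name `parse_corpus_line` and the sample documents are hypothetical and are not part of pysearchlite.

```python
import json


def parse_corpus_line(line):
    """Parse one corpus line into (id, text).

    Hypothetical helper for illustration only; it assumes each line is a
    JSON object carrying "id" and "text" fields.
    """
    doc = json.loads(line)  # malformed JSON raises a ValueError subclass
    if not isinstance(doc, dict) or "id" not in doc or "text" not in doc:
        raise ValueError("each line needs 'id' and 'text' fields")
    return doc["id"], doc["text"]


# Two made-up documents, one JSON object per line.
sample = (
    '{"id": "doc1", "text": "hello world"}\n'
    '{"id": "doc2", "text": "lightweight search"}\n'
)

# Skip blank lines so a trailing newline does not cause a parse error.
docs = [parse_corpus_line(line) for line in sample.splitlines() if line.strip()]
print(docs)
```

Running a check like this over a small extract can catch malformed lines before indexing.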

For example, stn/search-benchmark-game builds such a corpus, and you can use it.

$ git clone https://github.com/stn/search-benchmark-game.git
$ cd search-benchmark-game
$ make corpus

This produces a corpus file, corpus.json, which is about 8GB. The corpus has more than 5 million documents, which is too large for development, so we extract only the first 100 lines.

$ head -n 100 corpus.json > corpus100.json
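Where `head` is not available, the same extraction can be sketched in Python. This is a minimal sketch, not part of pysearchlite: the function name `head_lines` is hypothetical, and the UTF-8 encoding in the commented file example is an assumption about the corpus.

```python
import io
from itertools import islice


def head_lines(src, dst, n):
    """Copy at most the first n lines from src to dst; return how many were copied."""
    count = 0
    # islice stops after n lines, so the rest of a large file is never read.
    for line in islice(src, n):
        dst.write(line)
        count += 1
    return count


# Demonstration on in-memory streams. With real files (assuming UTF-8 text):
#   with open("corpus.json", encoding="utf-8") as src, \
#        open("corpus100.json", "w", encoding="utf-8") as dst:
#       head_lines(src, dst, 100)
src = io.StringIO("a\nb\nc\n")
dst = io.StringIO()
copied = head_lines(src, dst, 2)
```

Because the input is read lazily, this does not load the 8GB file into memory.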

How to run

To run the sample script:

$ python -m pysearchlite.commands.main < corpus100.json

To run search-benchmark-game:

# Go to the search-benchmark-game directory,
# assuming it sits next to this repo.
$ cd ../search-benchmark-game
$ make index
$ make bench
$ make serve
