This sample application contains the code for the text search tutorial. Please refer to the text search tutorial for more information.
See also the MS Marco Ranking sample application for ranking using state-of-the-art retrieval and ranking methods. There is also a Ranking with Transformers sample application.
The following steps deploy the end-to-end application, including a custom front-end.
- Docker Desktop installed and running. 10 GB of available memory for Docker is recommended. Refer to Docker memory for details and troubleshooting.
- Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
- Architecture: x86_64 or arm64
- Minimum 10 GB memory dedicated to Docker (the default is 2 GB on Macs)
- Homebrew to install Vespa CLI, or download a Vespa CLI release from GitHub releases.
- Python 3
- Java 17 installed.
- Apache Maven
This tutorial uses Vespa CLI, the official command-line client for Vespa.ai. It is a single binary without any runtime dependencies and is available for Linux, macOS and Windows.
$ brew install vespa-cli
$ vespa clone text-search text-search && cd text-search
$ ./bin/convert-msmarco.sh
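The conversion step produces the JSON feed file that vespa feed reads later. As a rough illustration of what such a conversion involves, the Python sketch below turns tab-separated rows into Vespa JSON put operations, one per line. The column order (id, url, title, body), the field names, the msmarco namespace, and the function name tsv_to_feed are assumptions made for illustration; the actual script may work differently.

```python
import json


def tsv_to_feed(lines, namespace="msmarco", doc_type="msmarco"):
    """Yield one JSON put operation per TSV row.

    Assumes four tab-separated columns: id, url, title, body.
    The column layout and field names are illustrative assumptions.
    """
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 4:
            continue  # skip rows that do not match the assumed layout
        doc_id, url, title, body = columns
        operation = {
            # Vespa document IDs follow id:<namespace>:<document-type>::<key>
            "put": f"id:{namespace}:{doc_type}::{doc_id}",
            "fields": {"id": doc_id, "url": url, "title": title, "body": body},
        }
        yield json.dumps(operation)


# Small in-memory demo; a real run would iterate over an open TSV file.
sample_rows = ["D1\thttp://example.com/a\tFirst title\tFirst body text\n"]
for feed_line in tsv_to_feed(sample_rows):
    print(feed_line)
```

Writing one operation per line keeps memory use flat for large inputs, since no row has to be held longer than it takes to serialize it.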
$ docker run --detach --name vespa-msmarco --hostname vespa-msmarco \
  --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19112:19112 --publish 127.0.0.1:19071:19071 \
  vespaengine/vespa
$ vespa deploy --wait 300
$ vespa feed ext/vespa.json
$ vespa query 'yql=select title,url,id from msmarco where userQuery()' 'query=what is dad bod'
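The same query can also be sent to the query API over HTTP, which may be convenient when calling Vespa from application code. The sketch below assumes the container is reachable at http://localhost:8080 (the port published by the docker run command above) and uses only the Python standard library. The helper names build_query_url and run_query are hypothetical, not part of Vespa.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(yql, query, endpoint="http://localhost:8080"):
    """Build a GET URL for the Vespa query API (/search/)."""
    params = urllib.parse.urlencode({"yql": yql, "query": query})
    return f"{endpoint}/search/?{params}"


def run_query(yql, query, endpoint="http://localhost:8080", timeout=10.0):
    """Send the query and return the parsed JSON response as a dict."""
    url = build_query_url(yql, query, endpoint)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (requires the Vespa container to be running and fed):
#   result = run_query(
#       "select title,url,id from msmarco where userQuery()",
#       "what is dad bod",
#   )
#   for hit in result.get("root", {}).get("children", []):
#       print(hit.get("fields", {}).get("title"))
```

Keeping URL construction separate from the network call makes the encoding easy to inspect without a running container.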
Instead of using the vespa feed command above, we can use Logstash to feed data. This way:
- You don't need to convert the data to JSON via ./bin/convert-msmarco.sh.
- You can more easily adapt this sample application to your own data (e.g. by making Logstash read from a different file or database).
You'll need to install Logstash. Then:
- Install the Logstash Output Plugin for Vespa via:
  bin/logstash-plugin install logstash-output-vespa_feed
- Change logstash.conf to point to the absolute path of msmarco-docs.tsv.
- Run Logstash with the modified logstash.conf:
  bin/logstash -f $PATH_TO_LOGSTASH_CONF/logstash.conf
Remove app and data:
$ docker rm -f vespa-msmarco