2.2.1 Tutorial: Start your own instances of LAPIS and SILO

Fabian Engelniederhammer edited this page Aug 7, 2023 · 18 revisions

Starting SILO and LAPIS yourself

Every LAPIS instance must be backed by a SILO instance that acts as its data source. SILO can also be operated stand-alone; LAPIS is meant as a layer of convenience and abstraction around SILO.

We provide ready-to-use Docker images of SILO and LAPIS and recommend using them. This tutorial explains how.

Prerequisites

  • You have Docker installed.
  • You have some knowledge of how to use Docker and Docker Compose.
  • Make sure you have the latest Docker images: docker pull ghcr.io/genspectrum/lapis-v2 && docker pull ghcr.io/genspectrum/lapis-silo
  • Create a directory for the example: mkdir ~/lapisExample
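
The later sections of this tutorial place files in the config, data, and logs subdirectories of the example directory. As a minimal sketch, they could all be created up front. The function name setup_example_dir is hypothetical and only used for illustration.

```shell
# Hypothetical helper: create the directory layout used in this tutorial.
# The first argument is the base directory (defaults to ~/lapisExample).
setup_example_dir() {
  base="${1:-$HOME/lapisExample}"
  # -p creates missing parents and does not fail if a directory already exists.
  mkdir -p "$base/config" "$base/data" "$base/logs"
}
```

Calling setup_example_dir without arguments creates the layout under ~/lapisExample. Running it a second time is harmless.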

Writing Configuration

Both LAPIS and SILO need to know which metadata columns are available in the dataset. Furthermore, you need to define which column acts as the primary key and which column SILO should use to generate partitions. We also configure LAPIS to be an open instance, meaning that the underlying data requires no visibility restrictions.

~/lapisExample/config/databaseConfig.yaml:

schema:
  instanceName: testInstance
  metadata:
    - name: gisaid_epi_isl
      type: string
    - name: date
      type: date
    - name: region
      type: string
      generateIndex: true
    - name: country
      type: string
      generateIndex: true
    - name: division
      type: string
      generateIndex: true
    - name: pango_lineage
      type: pango_lineage
    - name: age
      type: int
    - name: qc_value
      type: float
  opennessLevel: OPEN
  primaryKey: gisaid_epi_isl
  dateToSortBy: date
  partitionBy: pango_lineage

SILO currently supports the following metadata types:

  • int
  • float
  • string: String columns support indexing (configured via generateIndex: true). SILO internally stores precomputed bitmaps for those columns to speed up queries. Generating an index makes most sense for columns in which many rows share the same value.
  • pango_lineage: Systematic classification of lineage with inheritance structure that can be computed for some pathogens. Also see https://github.com/GenSpectrum/LAPIS/wiki/4.6-Pango-lineage-query.
  • date: Values must be valid dates in the form YYYY-MM-DD.
  • insertion: A comma-separated list of insertions. Each insertion has the form <position>:<symbols>. Example value: 123:CCG,501:AAAGGG.
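
To catch malformed metadata early, the shape of date and insertion values can be checked before the data is handed to SILO. The following is a rough sketch, not SILO's own validation: the function names are hypothetical, and the assumption that positions are decimal digits and symbols are uppercase letters is ours.

```shell
# Sketch: check the shape of metadata values described in the list above.
# Assumption: positions are decimal digits and symbols are uppercase letters;
# the exact alphabet accepted by SILO is not specified in this tutorial.
is_insertion_list() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+:[A-Z]+(,[0-9]+:[A-Z]+)*$'
}

# Checks only the YYYY-MM-DD shape, not whether the date exists in the calendar.
is_date_shaped() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
}

is_insertion_list "123:CCG,501:AAAGGG" && echo "insertion list looks valid"
is_date_shaped "2023-08-07" && echo "date looks valid"
```

Both functions return a non-zero exit status for values that do not match, so they can be used in a loop over a metadata column.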

Starting SILO

Note: This section might change soon, as we're currently reworking how SILO is supposed to be started.

Download the example dataset from the SILO repository:

  • pangolineage_alias.json
  • reference_genomes.json
  • small_metadata_set.tsv
  • fasta files for the sequences

SILO expects fasta files (possibly compressed via zstandard or xz) in the same directory, following the naming scheme nuc_<sequence_name>.fasta for nucleotide sequences or gene_<sequence_name>.fasta for amino acid sequences. Each sequence_name has to match a name defined in reference_genomes.json.
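
The naming scheme can be checked with a small shell function. This is only a sketch: the function name is hypothetical, the sequence names main and S are placeholders, and the .zst and .xz suffixes for compressed files are an assumption, since this tutorial does not state which suffixes SILO expects.

```shell
# Sketch: classify a sequence file by the naming scheme described above.
# Assumption: compressed files carry a .zst or .xz suffix after .fasta.
classify_sequence_file() {
  name=$(basename "$1")
  case "$name" in
    nuc_?*.fasta|nuc_?*.fasta.zst|nuc_?*.fasta.xz) echo "nucleotide" ;;
    gene_?*.fasta|gene_?*.fasta.zst|gene_?*.fasta.xz) echo "amino_acid" ;;
    *) echo "unknown" ;;
  esac
}

classify_sequence_file "nuc_main.fasta"      # prints "nucleotide"
classify_sequence_file "gene_S.fasta.zst"    # prints "amino_acid"
classify_sequence_file "sequences.fasta"     # prints "unknown"
```

The function does not check that the sequence name also appears in reference_genomes.json; that comparison still has to be done separately.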

Put those files into the folder ~/lapisExample/data/.

SILO needs to know where to find those files, so you have to provide a config for that. Note that the paths in this config must be the paths inside the Docker container, not the paths on the host.

~/lapisExample/config/siloConfig.yaml:

inputDirectory: "/data/"
outputDirectory: "/data/output/"
metadataFilename: "small_metadata_set.tsv"
pangoLineageDefinitionFilename: "pangolineage_alias.json"
referenceGenomeFilename: "reference_genomes.json"
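
Because the config refers to paths inside the container, a missing file on the host only shows up once SILO starts. As a sketch, the host directory that will be mounted to /data can be checked beforehand. The function name check_silo_inputs is hypothetical, and the file names are the ones used in this tutorial.

```shell
# Sketch: verify that the input files named in siloConfig.yaml exist on the
# host before starting the container. Returns non-zero if any file is missing.
check_silo_inputs() {
  data_dir="$1"
  missing=0
  for file in small_metadata_set.tsv pangolineage_alias.json reference_genomes.json; do
    if [ ! -f "$data_dir/$file" ]; then
      echo "missing: $data_dir/$file" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_silo_inputs ~/lapisExample/data reports every missing file on standard error and succeeds only if all three are present.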

Start the SILO Docker container with the following options:

  • expose port 8081 to the host.
  • mount the config into the container.
  • mount the data into the container.
  • provide the path to the SILO config.
  • provide the path to the database config.

The following command puts it all together:

docker run --detach \
  --publish 8081:8081 \
  --volume ~/lapisExample/config:/app/config \
  --volume ~/lapisExample/data:/data \
  ghcr.io/genspectrum/lapis-silo \
    --api \
    --preprocessingConfig=/app/config/siloConfig.yaml \
    --databaseConfig=/app/config/databaseConfig.yaml

Now SILO should be available at http://localhost:8081.
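
Since the container starts in the background, it can take a moment until the port answers. The following sketch polls a URL with curl until the server responds. The function name wait_for_http and its default values are hypothetical. It only checks that something answers HTTP requests on that address, not that SILO has finished loading the data.

```shell
# Sketch: poll a URL until the server answers or the attempts run out.
# Without --fail, curl exits with 0 for any HTTP response (even an error
# status), so this only checks that something is listening on the port.
wait_for_http() {
  url="$1"
  attempts="${2:-30}"   # number of tries
  delay="${3:-1}"       # seconds to sleep between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_http http://localhost:8081 && echo "SILO is reachable". If the call fails, docker ps shows whether the container is still running, and docker logs with the container ID shows its output.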

Starting LAPIS

Now you can start LAPIS. You have to:

  • expose port 8080 to the host.
  • mount the previously created database configuration into the Docker container.
  • provide LAPIS with the SILO URL.
  • tell LAPIS where to find the database configuration.

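The following command puts it all together. Note that --silo.url is resolved from inside the LAPIS container, where localhost refers to the container itself. If LAPIS cannot reach SILO at that address, use one that is reachable from the container, or use the Docker Compose setup described below.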

docker run --detach \
  --publish 8080:8080 \
  --volume ~/lapisExample/config/databaseConfig.yaml:/workspace/databaseConfig.yaml \
  ghcr.io/genspectrum/lapis-v2 \
    --silo.url=http://localhost:8081 \
    --lapis.databaseConfig.path=/workspace/databaseConfig.yaml

Now LAPIS should be available at http://localhost:8080. LAPIS offers a Swagger UI that serves as a good starting point for exploring its functionality.

Using Docker Compose

We recommend using Docker Compose to start LAPIS and SILO. The above docker run commands can be combined into a docker-compose.yaml file:

~/lapisExample/docker-compose.yaml:

version: "3.9"
services:
  lapis:
    image: ghcr.io/genspectrum/lapis-v2
    ports:
      - "8080:8080"
    command: --silo.url=http://silo:8081 --lapis.databaseConfig.path=/workspace/databaseConfig.yaml
    volumes:
      - type: bind
        source: ~/lapisExample/config/databaseConfig.yaml
        target: /workspace/databaseConfig.yaml
        read_only: true
      - type: bind
        source: ~/lapisExample/logs
        target: /workspace/log
  silo:
    image: ghcr.io/genspectrum/lapis-silo
    ports:
      - "8081:8081"
    command:
      - "--api"
      - "--preprocessingConfig=/app/config/siloConfig.yaml"
      - "--databaseConfig=/app/config/databaseConfig.yaml"
    volumes:
      - type: bind
        source: ~/lapisExample/logs
        target: /data/logs
      - type: bind
        source: ~/lapisExample/config
        target: /app/config
      - type: bind
        source: ~/lapisExample/data
        target: /data

This requires a logs directory: mkdir ~/lapisExample/logs. LAPIS and SILO can then be started via:

  • cd ~/lapisExample
  • docker compose up -d

Logs from LAPIS and SILO will be available in the previously created logs directory.

Further Reading