Docker installation

Image location

The Docker image is available in the DockerHub Lifesciences repository.

Versions

There are two versions of NGB in the repository:

  • ngb:latest - the "core" version: an NGB image that contains only binaries, without any data
  • ngb:latest-demo - the "demo" version: includes a demo data set, so no data registration is required and you only need to run the image

Running demo version

Warning: the demo version can take up to 2 GB of disk space (FASTA sequence, gene annotations, BAM, VCFs).

For the demo version, run the following command:

$ docker run -p 8080:8080 -d --name ngbcore lifescience/ngb:latest-demo

You can go to http://localhost:8080/catgenome or http://ip-of-the-host:8080/catgenome in a browser and view the demo datasets (Sample 1 and Sample 2), which contain structural variations.

Using a demo version as a standalone viewer

The NGB docker image is preconfigured to provide access to the /ngs folder via the Open from NGB server menu.

This means that if the demo container is run with the -v option (which mounts a volume into the container), you can view your own NGS datasets immediately, without the registration process.

Command to run

# This assumes that /ngs directory is available on 
# a host machine, of course any other host's
# folder can be used instead of /ngs

$ docker run -v /ngs:/ngs -p 8080:8080 -d --name ngbcore lifescience/ngb:latest-demo

Once this is done, navigate to http://localhost:8080/catgenome in a web browser and open the Open from NGB server menu.

The contents of the host's /ngs folder will be shown and can be selected and visualized.
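Since any host folder can stand in for /ngs, the run command above can be generated for an arbitrary path. The following is a minimal sketch: ngb_demo_cmd is a hypothetical helper (not part of NGB or Docker) that only prints the command, after checking that the path is absolute and exists. The absolute-path check is there because docker treats a non-absolute host part of -v as a named volume rather than a folder. Paths containing spaces are not handled.

```shell
# ngb_demo_cmd is a hypothetical helper: it prints (does not run) the
# docker command that serves a host folder through the demo image.
ngb_demo_cmd() {
    host_dir=$1
    case $host_dir in
        /*) ;;  # absolute path: usable as a bind mount source
        *)  echo "error: '$host_dir' is not an absolute path" >&2
            return 1 ;;
    esac
    if [ ! -d "$host_dir" ]; then
        echo "error: '$host_dir' is not a directory" >&2
        return 1
    fi
    # Same options as the command in this document; only the host folder varies.
    echo "docker run -v $host_dir:/ngs -p 8080:8080 -d --name ngbcore lifescience/ngb:latest-demo"
}
```

Calling ngb_demo_cmd with a folder of your choice prints the command for review; it can then be copied into the terminal and run as usual.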

[Screenshot: Open From NGB Server]

Note: the following reference sequences and genes are available in the demo image:

  • GRCh38
  • GRCh37/hg19
  • GRCm38/mm10
  • dm6

Running core image

For the core version, replace the <YOUR_NGS_DATA_FOLDER> placeholder with the real path to a folder with NGS data, and then run the command:

$ docker run -p 8080:8080 -d --name ngbcore -v <YOUR_NGS_DATA_FOLDER>:/ngs lifescience/ngb

This creates the container and starts it in the background (-d), maps port 8080 of the container to port 8080 of the host (-p), mounts <YOUR_NGS_DATA_FOLDER> from the host to the /ngs folder of the container (-v), and makes the container addressable by the name ngbcore (--name).

You can go to http://localhost:8080/catgenome or http://ip-of-the-host:8080/catgenome in a browser (Chrome) to verify that the server started successfully; you should see an empty list of datasets.
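The startup check can also be scripted rather than done in a browser. The sketch below assumes curl is installed on the host; wait_for_url is a hypothetical helper, and the attempt count and delay are arbitrary defaults. It polls a URL until it answers without an HTTP error or the attempts run out.

```shell
# wait_for_url is a hypothetical helper, not part of NGB or Docker.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
    url=$1
    attempts=${2:-30}   # how many times to try
    delay=${3:-2}       # seconds to wait between tries
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: silent, -f: treat HTTP errors (>= 400) as failure,
        # -o /dev/null: discard the response body
        if curl -sf -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for_url http://localhost:8080/catgenome returns 0 once the server responds and 1 if it never does within the allowed attempts.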

Registering data

To register your own data, attach to the running container:

$ docker exec -it ngbcore /bin/bash

This opens a shell inside the container, where the ngb command is available.

First, register a reference (genome data) using the mounted /ngs folder. NGB accepts FASTA files as reference sequences:

# ngb reg_ref /ngs/<PATH_TO_FASTA> -n my_genome -t

Depending on the FASTA size, registration may take several minutes.

To make NGS data available via NGB, create a DATASET, which is used to group linked files. You can register files first and then add them to a dataset.

Register file

# ngb reg_file my_genome /ngs/<PATH_TO_FILE> -n my_file1 -t

Note that you must provide the reference name (my_genome in this case). The -n (name) key is optional; if it is not specified, the original file name is used.

Create dataset and add file(s) to it

# ngb reg_dataset my_genome my_sample my_file1

Alternatively, you can create a dataset and register its files in one step:

# ngb reg_dataset my_genome my_sample /ngs/<PATH_TO_FILE> /ngs/<PATH_TO_FILE2>

Note that when registering a dataset, you must specify the name of the genome that the files correspond to.
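When a dataset has many files, the one-step command gets long and easy to mistype. The following sketch assembles it from a list of paths; print_reg_dataset is a hypothetical helper that only prints the ngb reg_dataset command in the form shown above, so it can be reviewed before being run inside the container. It assumes the paths contain no spaces.

```shell
# print_reg_dataset is a hypothetical helper: it prints one
# `ngb reg_dataset` command covering every file passed to it.
# Usage: print_reg_dataset GENOME DATASET FILE [FILE...]
print_reg_dataset() {
    if [ "$#" -lt 3 ]; then
        echo "usage: print_reg_dataset GENOME DATASET FILE [FILE...]" >&2
        return 1
    fi
    genome=$1
    dataset=$2
    shift 2
    printf 'ngb reg_dataset %s %s' "$genome" "$dataset"
    # Append each remaining argument as a file to register.
    for f in "$@"; do
        printf ' %s' "$f"
    done
    printf '\n'
}
```

The genome name is always the first argument, matching the note above that a dataset must name the genome its files correspond to.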

When you are done, leave the container's console with:

# exit

The NGB container will continue running in the background. Once datasets are created, you can browse the NGS data immediately.

Persisting registered data

Any data registered in an NGB container is lost once the container is removed. To avoid this, expose the cache locations inside the container to the host filesystem.

This is done by mounting host folders over the container paths that hold the NGB index database (H2 dir) and the file caches (contents dir):

  • /opt/catgenome/H2
  • /opt/catgenome/contents

Note: these options must be specified in the docker run command at start time.

Example:

Imagine a host machine that contains two folders

  • /ngs - stores NGS data that shall be registered in NGB
  • /ngb-cache - empty folder that will be used to persist NGB caches

The following command can be used to persist all changes made in the container to those folders:

$ docker run -p 8080:8080 \
             -d \
             --name ngbcore \
             -v /ngs:/ngs \
             -v /ngb-cache/H2:/opt/catgenome/H2 \
             -v /ngb-cache/contents:/opt/catgenome/contents \
             lifescience/ngb

Restarting a container with this command does not cause loss of data or NGB configuration.

-v /host/ngs:/ngs -v /host/H2:/opt/catgenome/H2 -v /host/contents:/opt/catgenome/contents
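The cache folders can be created ahead of time so that they belong to the current user; with -v, docker otherwise creates a missing host folder itself. The sketch below is one way to do that: prepare_ngb_cache is a hypothetical helper that creates the H2 and contents subfolders under a base folder and prints the matching -v options for the container paths listed above. It assumes the base path contains no spaces.

```shell
# prepare_ngb_cache is a hypothetical helper: it creates the two host
# folders used for NGB caches and prints the -v options that mount them.
# Usage: prepare_ngb_cache BASE_DIR
prepare_ngb_cache() {
    base=$1
    if [ -z "$base" ]; then
        echo "usage: prepare_ngb_cache BASE_DIR" >&2
        return 1
    fi
    # -p: create parent folders as needed, no error if they already exist
    mkdir -p "$base/H2" "$base/contents" || return 1
    echo "-v $base/H2:/opt/catgenome/H2 -v $base/contents:/opt/catgenome/contents"
}
```

For example, the output of prepare_ngb_cache /ngb-cache can be pasted into the docker run command in place of the two cache -v lines.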

Demo data description

The ngb:latest-demo container is built to show some basic features of NGB. It mostly uses reduced data to minimize the container size.

Points of interest