changes for docker container
npbhavya committed Oct 4, 2024
1 parent 5555d76 commit 91ec1f6
Showing 2 changed files with 104 additions and 84 deletions.
125 changes: 63 additions & 62 deletions Dockerfile
@@ -1,62 +1,63 @@
FROM --platform=linux/amd64 ubuntu:20.04

ENV DEBIAN_FRONTEND="noninteractive"

ARG LIBFABRIC_VERSION=1.18.1
ARG SPHAE_VERSION=1.4.5
ARG THREADS=8

# Install required packages and dependencies
RUN apt -y update \
&& apt -y install build-essential wget doxygen gnupg gnupg2 curl apt-transport-https software-properties-common libgl1 \
git vim gfortran libtool python3-venv ninja-build python3-pip \
libnuma-dev python3-dev \
&& apt -y remove --purge --auto-remove cmake \
&& wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null \
&& apt-add-repository -y "deb https://apt.kitware.com/ubuntu/ focal-rc main" \
&& apt -y update

# Build and install libfabric
RUN (if [ -e /tmp/build ]; then rm -rf /tmp/build; fi;) \
&& mkdir -p /tmp/build \
&& cd /tmp/build \
&& wget https://github.com/ofiwg/libfabric/archive/refs/tags/v${LIBFABRIC_VERSION}.tar.gz \
&& tar xf v${LIBFABRIC_VERSION}.tar.gz \
&& cd libfabric-${LIBFABRIC_VERSION} \
&& ./autogen.sh \
&& ./configure \
&& make -j 16 \
&& make install

# Install Miniforge
RUN set -eux ; \
curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh ; \
bash ./Miniforge3-* -b -p /opt/miniforge3 -s ; \
rm -rf ./Miniforge3-*
ENV PATH /opt/miniforge3/bin:$PATH

# Install conda environment
RUN set -eux ; \
mamba install -y -c conda-forge -c bioconda -c defaults sphae=${SPHAE_VERSION}=pyhdfd78af_0 python==3.11
ENV PATH /opt/miniforge3/bin:$PATH
RUN conda clean -af -y

# Download test data
RUN git clone "https://github.com/linsalrob/sphae.git"

# Install Sphae databases (with dynamic threads)
#RUN sphae install --threads ${THREADS} --conda-frontend mamba

# Environment settings for filtlong bug
ENV LC_ALL=C
ENV LANGUAGE=

#remove one of the test datasets
RUN rm -rf sphae/tests/data/illumina-subset/SRR16219309*

# Create required conda environments without running
RUN sphae run --threads ${THREADS} --input sphae/tests/data/illumina-subset -k --conda-frontend mamba --conda-create-envs-only --db_dir $DIR_DB
RUN sphae run --threads ${THREADS} --input sphae/tests/data/nanopore-subset --sequencing longread -k --conda-frontend mamba --conda-create-envs-only --db_dir $DIR_DB

# Cleanup
RUN rm -rf sphae.out /tmp/* /var/tmp/* /var/lib/apt/lists/*
FROM --platform=linux/amd64 ubuntu:20.04

ENV DEBIAN_FRONTEND="noninteractive"

ARG LIBFABRIC_VERSION=1.18.1
ARG SPHAE_VERSION=1.4.5
ARG THREADS=8

# Install required packages and dependencies
RUN apt -y update \
&& apt -y install build-essential wget doxygen gnupg gnupg2 curl apt-transport-https software-properties-common libgl1 \
git vim gfortran libtool python3-venv ninja-build python3-pip \
libnuma-dev python3-dev \
&& apt -y remove --purge --auto-remove cmake \
&& wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null \
&& apt-add-repository -y "deb https://apt.kitware.com/ubuntu/ focal-rc main" \
&& apt -y update

# Build and install libfabric
RUN (if [ -e /tmp/build ]; then rm -rf /tmp/build; fi;) \
&& mkdir -p /tmp/build \
&& cd /tmp/build \
&& wget https://github.com/ofiwg/libfabric/archive/refs/tags/v${LIBFABRIC_VERSION}.tar.gz \
&& tar xf v${LIBFABRIC_VERSION}.tar.gz \
&& cd libfabric-${LIBFABRIC_VERSION} \
&& ./autogen.sh \
&& ./configure \
&& make -j 16 \
&& make install

# Install Miniforge
RUN set -eux ; \
curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh ; \
bash ./Miniforge3-* -b -p /opt/miniforge3 -s ; \
rm -rf ./Miniforge3-*
ENV PATH /opt/miniforge3/bin:$PATH

# Install conda environment
RUN set -eux ; \
mamba install -y -c conda-forge -c bioconda -c defaults sphae=${SPHAE_VERSION}=pyhdfd78af_0 python==3.11
ENV PATH /opt/miniforge3/bin:$PATH
RUN conda clean -af -y

# Download test data
RUN git clone "https://github.com/linsalrob/sphae.git"

# Install Sphae databases (with dynamic threads)
#RUN sphae install --threads ${THREADS} --conda-frontend mamba

# Environment settings for filtlong bug
ENV LC_ALL=C
ENV LANGUAGE=

# Create the directory if it doesn't exist, then list its contents
RUN mkdir -p sphae/tests/db && ls sphae/tests/db
RUN mkdir sphae/tests/db/Pfam35.0 && touch sphae/tests/db/Pfam35.0/Pfam-A.hmm.gz

# Create required conda environments without running
RUN sphae run --threads ${THREADS} --input sphae/tests/data/illumina-subset -k --use-conda --db_dir sphae/tests/db --conda-create-envs-only
RUN sphae run --threads ${THREADS} --input sphae/tests/data/nanopore-subset --sequencing longread -k --conda-create-envs-only --db_dir sphae/tests/db

# Cleanup
RUN rm -rf sphae.out /tmp/* /var/tmp/* /var/lib/apt/lists/*
63 changes: 41 additions & 22 deletions README.md
@@ -33,10 +33,8 @@ This snakemake workflow was built using Snaketool [https://doi.org/10.1371/journ
- Assembly
- Contig quality checks; read coverage, viral or not, completeness, and assembly graph components.
- Phage genome annotation

A complete list of programs used for each step is mentioned in the `sphae.CITATION` file.

**If you are new to bioinformatics, here is a tutorial to follow: https://github.com/AnitaTarasenko/sphae/wiki/Sphae-tutorial**
**If you are new to bioinformatics or running command line tools, here is a great tutorial to follow: https://github.com/AnitaTarasenko/sphae/wiki/Sphae-tutorial**

### Install

@@ -68,7 +66,41 @@ conda activate sphae
conda install -n base -c conda-forge mamba #if you don't already have mamba installed
```

Steps for installing sphae workflow
**Container Install**

Two containers are available:
1. [Sphae v1.4.5 with databases](https://hub.docker.com/repository/docker/npbhavya/sphae)
This is a very large container, about 17.5 GB, so it may take a while to download and install.

Here are the commands to download the sphae container with databases:
```
TMPDIR=<where your tmpdir lives>
IMAGEDIR=<where you want the image to live>
singularity pull --tmpdir=$TMPDIR --dir $IMAGEDIR docker://npbhavya/sphae:latest
singularity exec sphae_latest.sif sphae --help
singularity exec sphae_latest.sif sphae run --help
singularity exec sphae_latest.sif sphae install --help
```
2. [Sphae v1.4.5 **without** databases](https://hub.docker.com/repository/docker/npbhavya/sphae)
This version of the sphae container does not include the databases, so it is smaller and quicker to download.
You will still need to install the databases separately with `sphae install`, as outlined below.
```
TMPDIR=<where your tmpdir lives>
IMAGEDIR=<where you want the image to live>

singularity pull --tmpdir=$TMPDIR --dir $IMAGEDIR docker://npbhavya/sphae:latest
#test if sphae is installed
singularity exec sphae_latest.sif sphae --help
singularity exec sphae_latest.sif sphae run --help
#mount the databases to the image and run with a dataset
singularity exec -B </path/to/databases>:/databases sphae_latest.sif sphae run --input <input files> --db_dir /databases
```
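The pull and bind-mount steps above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the `run`, `pull_image` and `run_sphae` helpers, the `DRY_RUN` switch, and the default paths for `DB_DIR` and `INPUT_DIR` are hypothetical and not part of sphae. It also assumes the pulled image is stored as `$IMAGEDIR/sphae_latest.sif`.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
# All default paths are placeholder assumptions; replace them with your own.
TMPDIR="${TMPDIR:-/tmp}"              # scratch space used during the pull
IMAGEDIR="${IMAGEDIR:-$PWD}"          # where the .sif image is stored
DB_DIR="${DB_DIR:-$PWD/databases}"    # hypothetical host path to the sphae databases
INPUT_DIR="${INPUT_DIR:-$PWD/reads}"  # hypothetical directory of input reads

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Download the image into IMAGEDIR.
pull_image() {
  run singularity pull --tmpdir="$TMPDIR" --dir "$IMAGEDIR" docker://npbhavya/sphae:latest
}

# Bind the host database directory to /databases inside the container and run sphae.
run_sphae() {
  run singularity exec -B "$DB_DIR":/databases "$IMAGEDIR/sphae_latest.sif" \
    sphae run --input "$INPUT_DIR" --db_dir /databases
}

pull_image
run_sphae
```

By default the script only prints the two commands so the paths can be checked first. Setting `DRY_RUN=0` in the environment executes them instead.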
**Source install**
```bash
#clone sphae repository
@@ -83,22 +115,6 @@ pip install -e .
#confirm the workflow is installed by running the below command
sphae --help
```
**Container Install**

You can use the pre-built sphae container with Docker/Singularity/apptainer available [here](https://quay.io/repository/gbouras13/sphae). It is very large as it comes with all the required software pre-installed, so may take a while to download and install.

As an example of installing the sphae .sif file and running sphae v1.4.4 with Singularity:

```
TMPDIR=<where your tmpdir lives>
IMAGEDIR=<where you want the image to live>
singularity pull --tmpdir=$TMPDIR --dir $IMAGEDIR docker://npbhavya/sphae:latest
singularity exec sphae_latest.sif sphae --help
singularity exec sphae_latest.sif sphae run --help
singularity exec sphae_latest.sif sphae install --help
```

You will still need to install the databases with `sphae install` as outlined below.


@@ -126,8 +142,10 @@ This step requires ~17G of storage

## Running the workflow

The command `sphae run` will run QC, assembly and annotation

Sphae is developed to be modular:
- `sphae run` will run QC, assembly and annotation
- `sphae annotate` will run only annotation steps

**Commands to run**

Only one command needs to be submitted to run all the above steps: QC, assembly and assembly stats
@@ -236,3 +254,4 @@ Genome summary file includes the following information to help,
If you come across any issues or errors, report them under [Issues](https://github.com/linsalrob/sphae/issues).
