Quick Start

Dreycey Albin edited this page Jun 7, 2024 · 7 revisions

Overview

This quick start guide walks through the core steps covered in the rest of the wiki. It is essentially a TL;DR for getting started without reading the other wiki pages.

Install PhageScanner

The fastest way to get started with PhageScanner is to use the Docker image. If you would rather not use Docker, follow the steps in the installation guide instead. To set up the Docker image:

Download Docker (see the Docker Guides if you are new to Docker).

Pull the Docker image from Docker Hub:

docker pull dreyceyalbin/phagescanner

Check that the help message prints:

docker run --rm dreyceyalbin/phagescanner --help
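Before moving on to the pipelines, it can help to confirm that Docker is on the `PATH` and that the image was actually pulled. The following is a minimal sketch, not part of PhageScanner: the helper names `have_cmd`, `image_present`, and `preflight` are hypothetical, and it only relies on the standard `command -v` builtin and `docker image inspect`.

```shell
#!/bin/sh
# Sketch: preflight checks before running the pipelines.
# have_cmd, image_present, and preflight are hypothetical helper names.

IMAGE="dreyceyalbin/phagescanner"

# Succeeds if the named program is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the image is already present in the local Docker image store.
image_present() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Report the first missing prerequisite, or confirm that both are in place.
preflight() {
    if ! have_cmd docker; then
        echo "docker not found on PATH; install Docker first" >&2
        return 1
    fi
    if ! image_present "$IMAGE"; then
        echo "image $IMAGE not found locally; run: docker pull $IMAGE" >&2
        return 1
    fi
    echo "ok: docker and $IMAGE are available"
}
```

After sourcing the file, calling `preflight` prints a short status line or an error describing what is missing.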

Run the three pipelines in order. Each stage reads the output of the previous one:

Database pipeline - Build the database

docker run --rm \
    -v "$(pwd)/configs:/app/configs" \
    -v "$(pwd)/multiclass_database:/app/multiclass_database" \
    dreyceyalbin/phagescanner database -c /app/configs/multiclass_config.yaml -o /app/multiclass_database/ -v info

Training pipeline - Train and test ML models

docker run --rm \
    -v "$(pwd)/configs:/app/configs" \
    -v "$(pwd)/multiclass_database:/app/multiclass_database" \
    -v "$(pwd)/training_output:/app/training_output" \
    dreyceyalbin/phagescanner train -c /app/configs/multiclass_config.yaml -o /app/training_output --database_csv_path /app/multiclass_database/ -v debug

Prediction pipeline - Run on metagenomic data, genomes, or proteins

docker run --rm \
    -v "$(pwd)/configs:/app/configs" \
    -v "$(pwd)/examples:/app/examples" \
    -v "$(pwd)/prediction_output:/app/prediction_output" \
    -v "$(pwd)/training_output:/app/training_output" \
    dreyceyalbin/phagescanner predict -t "genome" -c /app/configs/multiclass_config.yaml -o /app/prediction_output -n "OUTPREFIX" -tdir /app/training_output/ -i /app/examples/GCF_000912975.1_ViralProj227117_genomic.fna -v debug
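The three commands above can be chained in one script so that a later stage only runs if the previous one succeeded. The sketch below is not part of PhageScanner: `run_stage`, `run_all`, and the `DRY_RUN` variable are hypothetical names. It assumes the working directory contains the `configs/` and `examples/` folders used above. It reuses the flags from the commands above, passes the container path `/app/training_output` to `-tdir`, and for simplicity mounts every directory in every stage. It also creates the host output directories first, because Docker otherwise creates missing `-v` source directories itself, typically owned by root.

```shell
#!/bin/sh
# Sketch: run the database, training, and prediction pipelines in order.
# run_stage, run_all, and DRY_RUN are hypothetical, not PhageScanner features.

IMAGE="dreyceyalbin/phagescanner"
CONFIG="/app/configs/multiclass_config.yaml"

# Run one pipeline stage in the container with all project directories mounted.
# With DRY_RUN=1 the docker command is printed instead of executed.
run_stage() {
    runner=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        runner="echo"
    fi
    $runner docker run --rm \
        -v "$(pwd)/configs:/app/configs" \
        -v "$(pwd)/examples:/app/examples" \
        -v "$(pwd)/multiclass_database:/app/multiclass_database" \
        -v "$(pwd)/training_output:/app/training_output" \
        -v "$(pwd)/prediction_output:/app/prediction_output" \
        "$IMAGE" "$@"
}

# Chain the stages with && so a failure stops the remaining stages.
run_all() {
    mkdir -p multiclass_database training_output prediction_output &&
    run_stage database -c "$CONFIG" -o /app/multiclass_database/ -v info &&
    run_stage train -c "$CONFIG" -o /app/training_output \
        --database_csv_path /app/multiclass_database/ -v debug &&
    run_stage predict -t genome -c "$CONFIG" -o /app/prediction_output \
        -n OUTPREFIX -tdir /app/training_output/ \
        -i /app/examples/GCF_000912975.1_ViralProj227117_genomic.fna -v debug
}
```

After sourcing the file, setting `DRY_RUN=1` and calling `run_all` prints the three `docker run` commands for review; calling `run_all` without `DRY_RUN` executes them.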