a basic workflow for running seekdeep pipelines in snakemake

bailey-lab/seekdeep_illumina_snakemake

seekdeep_illumina_snakemake

A basic workflow for running Nick Hathaway's SeekDeep on Illumina data. This version splits the jobs into individual snakemake submissions.

Installation:

mamba create -c conda-forge -c bioconda -n snakemake snakemake
mamba activate snakemake

Set up your environment:

  • Change directory to a folder where you want to run the analysis
  • Clone this repository with git clone web_address (you can get the web_address from the green 'Code' button)
  • Download the sif file into the same folder: https://seekdeep.brown.edu/programs/elucidator.sif
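
The setup steps above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the clone URL is assumed from the repository name, wget is assumed to be installed, the function names are hypothetical, and "the same folder" is read here as the analysis folder itself. Setting DRY_RUN=1 prints each command instead of running it.

```shell
# Assumed clone URL, built from the repository name bailey-lab/seekdeep_illumina_snakemake.
REPO_URL="https://github.com/bailey-lab/seekdeep_illumina_snakemake.git"
# Download link given in the setup instructions.
SIF_URL="https://seekdeep.brown.edu/programs/elucidator.sif"

# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create the analysis folder, enter it, clone the workflow, and fetch the sif file.
setup_seekdeep_workdir() {
    local workdir="$1"
    run mkdir -p "$workdir" || return 1
    run cd "$workdir" || return 1
    run git clone "$REPO_URL" || return 1
    run wget "$SIF_URL" || return 1
}
```

For example, DRY_RUN=1 setup_seekdeep_workdir ~/seekdeep_analysis previews the four commands (the folder name is only an illustration); drop DRY_RUN=1 to actually run them.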

Usage:

  • Edit the seekdeep_illumina_general.yaml file using the instructions in the comments. Use a text editor that outputs unix line endings (e.g. vscode, notepad++, gedit, micro, emacs, vim, vi, etc.)
  • If snakemake is not your active conda environment, activate snakemake with:
mamba activate snakemake
  • If you are on a slurm system, edit the slurm/config.yaml file to match the sbatch job submission settings of your system; if you are not on a slurm system, edit the non_slurm/config.yaml file instead. (If you already have a slurm or non_slurm profile saved in ~/.config/snakemake, e.g. ~/.config/snakemake/slurm, you can delete the slurm or non_slurm folder)
  • Run all steps with (e.g. if using a slurm profile):
snakemake -s setup_run.smk --profile slurm
snakemake -s run_extractor.smk --profile slurm
snakemake -s finish_process.smk --profile slurm
  • You can also run all steps at once (after editing run_all_steps.sh to use the appropriate --profile name) with:
bash run_all_steps.sh
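
The usage steps above can be sketched as a single shell function. This is an illustration only and not the contents of the repository's run_all_steps.sh: the function names, the SNAKEMAKE override variable, and the line-ending guard are assumptions added here. It refuses to start if the yaml file contains Windows (CRLF) line endings, then runs the three snakefiles in order and stops at the first failure.

```shell
# Hypothetical helper: succeed if the file contains a carriage return (CRLF line endings).
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Run the three snakefiles in order with one profile (sketch, not the repository's script).
run_all_steps() {
    local profile="$1"
    local config="seekdeep_illumina_general.yaml"
    local runner="${SNAKEMAKE:-snakemake}"   # override only for testing
    local snakefile
    if [ -f "$config" ] && has_crlf "$config"; then
        echo "error: $config has Windows line endings; re-save it with unix line endings" >&2
        return 1
    fi
    for snakefile in setup_run.smk run_extractor.smk finish_process.smk; do
        "$runner" -s "$snakefile" --profile "$profile" || return 1
    done
}
```

With these definitions loaded, run_all_steps slurm would be equivalent to typing the three snakemake commands one after another.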

Help:

You can read Nick Hathaway's manual here: https://seekdeep.brown.edu/

If you're in the folder where you downloaded the elucidator.sif file, you can get help on any seekdeep command with:

singularity exec elucidator.sif SeekDeep [cmd] -h

Three main commands in the snakefile:

  • The first command gets info about the genome (genTargetInfoFromGenomes).
  • The second command sets up an analysis run (setupTarAmpAnalysis).
  • The third command runs three seekdeep programs (runAnalysis.sh; no help text is available for it).

Here are some example help commands to learn more about these commands:

  • singularity exec elucidator.sif SeekDeep -h
  • singularity exec elucidator.sif SeekDeep genTargetInfoFromGenomes -h
  • singularity exec elucidator.sif SeekDeep setupTarAmpAnalysis -h

Three sub-steps of running seekdeep:

Each of these steps can be tweaked for sensitivity and specificity (via extra_[step]_cmds at the bottom of the yaml file):

  • The first command extracts amplicon reads (extractor)
  • The second command clusters together similar reads (qluster)
  • The third command processes clusters into haplotypes (processClusters)

Here are some example help commands to learn more about these programs:

  • singularity exec elucidator.sif SeekDeep extractor -h
  • singularity exec elucidator.sif SeekDeep qluster -h
  • singularity exec elucidator.sif SeekDeep processClusters -h
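
To read the help text for all five subcommands offline, the help commands from both lists could be looped over and saved to files. This is a small sketch under assumptions: the function name, the default output folder seekdeep_help, and the SINGULARITY and SIF override variables are hypothetical, and it should be run from the folder that contains elucidator.sif.

```shell
# Save the help text of each SeekDeep subcommand listed above to its own file.
save_seekdeep_help() {
    local outdir="${1:-seekdeep_help}"            # hypothetical default folder
    local runner="${SINGULARITY:-singularity}"    # override only for testing
    local sif="${SIF:-elucidator.sif}"
    local cmd
    mkdir -p "$outdir" || return 1
    for cmd in genTargetInfoFromGenomes setupTarAmpAnalysis extractor qluster processClusters; do
        # Capture stdout and stderr and ignore the exit status, in case help is
        # printed to stderr or the program exits non-zero after printing it.
        "$runner" exec "$sif" SeekDeep "$cmd" -h > "$outdir/$cmd.txt" 2>&1 || true
    done
}
```

Calling save_seekdeep_help with no arguments would write one text file per subcommand, such as seekdeep_help/extractor.txt.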
