Quick start

1. Find a suitable seed

Several types of seed are possible:

A single read from the dataset that originates from the organelle genome.
An organelle sequence derived from the same or a related species.
A complete organelle sequence of a more distant species (recommended when no closely related sequence is available).

The seed should be in standard FASTA format (first line: >Id_sequence).
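
As a quick sanity check before a run, the seed format described above can be verified with a few lines of Python. This is only a minimal sketch: the function name `check_seed_fasta` is hypothetical, and it assumes the seed file holds a single record that uses only the letters A, C, G, T and N. NOVOPlasty itself may accept more than this check allows.

```python
def check_seed_fasta(text):
    """Return (seq_id, sequence) for a single-record FASTA string.

    Raises ValueError if the text does not look like a FASTA seed.
    Assumptions (not from the NOVOPlasty docs): exactly one record,
    alphabet limited to A, C, G, T and N.
    """
    # Drop blank lines and surrounding whitespace.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("first line must be a header starting with '>'")

    seq_id = lines[0][1:].strip()
    if not seq_id:
        raise ValueError("header has no identifier")

    body = lines[1:]
    if any(ln.startswith(">") for ln in body):
        raise ValueError("more than one record found; expected a single seed")

    # The sequence may be wrapped over several lines; join them.
    sequence = "".join(body).upper()
    if not sequence:
        raise ValueError("no sequence after the header")

    invalid = set(sequence) - set("ACGTN")
    if invalid:
        raise ValueError("unexpected characters: " + "".join(sorted(invalid)))

    return seq_id, sequence
```

For example, `check_seed_fasta(open("Seed.fasta").read())` would return the identifier and the sequence, where `Seed.fasta` is a placeholder file name.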

Be cautious with seed sequences that are similar in both the mitochondrial and chloroplast genomes.
We observed good results with RUBP sequences as seeds for chloroplast assembly.

2. Create configuration file

You can download the example file (config.txt) and adjust the settings to your liking.
Every parameter of the configuration file is explained in the file.

3. Run NOVOPlasty

No further installation is necessary:

perl NOVOPlasty3.0.pl -c config.txt

The input must be Illumina reads in fastq/fasta format, either uncompressed or compressed with gz/bz2. Provide either two separate files (forward and reverse) or a single merged fastq/fasta file. There is also an Ion Torrent option, but it does not produce the best results.
Multiple libraries as input are not yet supported.
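
NOVOPlasty reads these files itself, so nothing below is required to run it. If you want to inspect your input first (for example, to count reads), the following sketch opens plain, gz or bz2 files in the same way. The helper names `open_reads` and `count_fastq_reads` are hypothetical. The compression type is guessed from the file extension, and the read count assumes unwrapped fastq records of exactly four lines each.

```python
import bz2
import gzip


def open_reads(path):
    """Open a read file as text, choosing the opener from the extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "r")


def count_fastq_reads(path):
    """Count reads in a fastq file, assuming four lines per record."""
    with open_reads(path) as handle:
        return sum(1 for _ in handle) // 4
```

Comparing `count_fastq_reads` on the forward and reverse files is a cheap way to spot a truncated download before starting an assembly.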

DO NOT filter or quality-trim the reads! Use the raw whole-genome dataset (only adapters should be removed).

You can subsample to speed up the process and to reduce the memory requirements. This is also possible by using the max memory option in the config file. However, it is recommended to use as many reads as possible, especially when the organelle genome contains AT-rich stretches.
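
If you subsample paired reads yourself, the forward and reverse files must stay in sync, so each pair is either kept or dropped as a whole. The sketch below shows one way to do this in Python. It is not part of NOVOPlasty: the names `read_records` and `subsample_pairs` are hypothetical, and it assumes unwrapped fastq records of exactly four lines, with both files listing the pairs in the same order.

```python
import random
from itertools import islice


def read_records(handle):
    """Yield fastq records as lists of 4 lines (assumes unwrapped records)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated fastq record")
        yield record


def subsample_pairs(fwd_in, rev_in, fwd_out, rev_out, fraction, seed=0):
    """Keep each read pair with probability `fraction`; return pairs kept.

    All four arguments are open text handles. A fixed `seed` makes the
    selection reproducible. If one input is shorter than the other,
    iteration stops at the shorter one.
    """
    rng = random.Random(seed)
    kept = 0
    for fwd, rev in zip(read_records(fwd_in), read_records(rev_in)):
        # One random draw per pair, so both mates share the same fate.
        if rng.random() < fraction:
            fwd_out.writelines(fwd)
            rev_out.writelines(rev)
            kept += 1
    return kept
```

A call could look like `subsample_pairs(open("reads_1.fastq"), open("reads_2.fastq"), open("sub_1.fastq", "w"), open("sub_2.fastq", "w"), 0.25)`, where the file names are placeholders. Because the selection is random, coverage of the organelle genome drops by roughly the same fraction as the read count.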

You can always try different K-mer values. In the case of low-coverage problems or seed errors, it is recommended to lower the K-mer (set it between 21 and 39).
