Verkko v1.2

@brianwalenz brianwalenz released this 31 Oct 14:14

These are release notes for Verkko version 1.2, which was released on October 31st, 2022. Verkko is a hybrid genome assembly pipeline developed for telomere-to-telomere assembly of PacBio HiFi and Oxford Nanopore reads.

The source code distribution contains everything you need to create a binary distribution for your own specific OS. Please report any issues you encounter.

Citation

Minimum Requirements

  • 8GB minimum memory; 16GB strongly suggested
  • GCC 7 or newer (for compilation only)
  • Rust 1.58 or newer (for compilation only)
  • Python 3.5 or newer
  • Snakemake 7.0 or newer
  • GraphAligner v1.0.16 or newer
  • MBG v1.0.12 or newer
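As a quick pre-flight check, the sketch below reports which tools are on your PATH. It is not part of Verkko: the function name check_tools is hypothetical, and the executable names in the usage comment (gcc, rustc, python3, snakemake, GraphAligner, MBG) are assumed to match your installation. It checks presence only and does not compare version numbers.

```shell
# Report which of the named executables can be found on PATH.
# Hypothetical helper, not shipped with Verkko; presence check only.
# Usage (executable names are assumptions):
#   check_tools gcc rustc python3 snakemake GraphAligner MBG
check_tools() {
    ct_missing=0
    for ct_tool in "$@"; do
        if command -v "$ct_tool" >/dev/null 2>&1; then
            echo "found:   $ct_tool"
        else
            echo "missing: $ct_tool"
            ct_missing=1
        fi
    done
    # Succeed only if every tool was found.
    [ "$ct_missing" -eq 0 ]
}
```

Any tool reported as missing still needs to be installed, or put on your PATH, before building or running Verkko.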

Installation

Verkko can be downloaded as source code or installed through a package manager such as conda. The source code package must be compiled and installed before it can be used. Do NOT download the .zip source code: it is missing files and will not compile. This is a known flaw with git itself.

Run either:

conda install -c conda-forge -c bioconda -c defaults verkko

or build from source:

curl -L https://github.com/marbl/verkko/releases/download/v1.2/verkko-v1.2.tar.gz --output verkko-v1.2.tar.gz
tar -xzf verkko-v1.2.tar.gz
cd verkko-v1.2/src
make -j 8
cd ..

Confirm that the MD5 checksum of the tar.gz matches the expected value:

2975d74b265e22f9748385b08269d709  verkko-v1.2.tar.gz
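One way to run this comparison is sketched below. The helper name verify_md5 is hypothetical, and the sketch assumes GNU coreutils md5sum is available (on macOS, "md5 -q FILE" is the usual substitute).

```shell
# Compare a file's MD5 digest against an expected hex string.
# Hypothetical helper; assumes GNU coreutils md5sum is installed.
# Usage:
#   verify_md5 verkko-v1.2.tar.gz 2975d74b265e22f9748385b08269d709
verify_md5() {
    vm_file=$1
    vm_expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    vm_actual=$(md5sum "$vm_file" | awk '{print $1}')
    if [ "$vm_actual" = "$vm_expected" ]; then
        echo "OK: $vm_file"
    else
        echo "MISMATCH: $vm_file (got $vm_actual)" >&2
        return 1
    fi
}
```

A mismatch usually means an incomplete or corrupted download, so fetch the archive again before building.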

Verkko will be installed in verkko-v1.2/bin. You can move the contents of verkko-v1.2/bin/* and verkko-v1.2/lib/* to a central location if you would like. If GraphAligner or MBG are not available in your path, you may also symlink them under verkko/lib/verkko/bin/.
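The symlink step could be scripted roughly as follows. This is a sketch under assumptions: link_tools is a hypothetical helper, the /opt/tools paths in the usage comment are placeholders for wherever GraphAligner and MBG live on your system, and the destination is assumed to be the lib/verkko/bin directory of your Verkko install.

```shell
# Symlink external executables into a destination bin directory.
# Hypothetical helper; pass absolute source paths so the links resolve
# from any working directory.
# Usage (all paths are placeholders):
#   link_tools verkko-v1.2/lib/verkko/bin /opt/tools/GraphAligner /opt/tools/MBG
link_tools() {
    lt_dest=$1
    shift
    mkdir -p "$lt_dest" || return 1
    for lt_src in "$@"; do
        if [ ! -x "$lt_src" ]; then
            echo "not executable: $lt_src" >&2
            return 1
        fi
        # -f replaces an existing link of the same name.
        ln -sf "$lt_src" "$lt_dest/$(basename "$lt_src")"
    done
}
```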

See the README for more details.

Changes

  • Update MBG from version 1.0.10 to 1.0.12.

Bug Fixes

  • Avoid creating 1bp nodes when breaking mis-assemblies (#99)
  • Fix additional constraint checking in snakemake 7.1.1 (#95)
  • Fix crash on zero-length reads (#94)
  • Fix crashes in dry-run.

Known Issues

See the issues page for up-to-date open issues, or to report a problem.

  • GraphAligner misses mappings on genomes with large (>20kb) stretches of simple-sequence GA-rich repeats (only seen in plants so far). As a workaround, increase --seed-max-length from the default of 10000 to 100000 (note that this will increase runtime 2-3 fold). This will be addressed in a future release.
  • Long runtime of MBG with very high HiFi coverage (>200x). We recommend downsampling to 100x.
  • Lost heterozygosity in simple-sequence repeats in low-heterozygosity samples. When there is no other variation within one HiFi read length, the simple-sequence repeat difference will be ignored and a consensus of both haplotypes is produced. This will be addressed in a future release.
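For the downsampling recommendation above, the fraction of reads to keep can be estimated as target coverage divided by current coverage. The sketch below shows only that arithmetic. The helper name downsample_fraction is hypothetical, and it assumes reads are subsampled uniformly at random so that coverage scales with the fraction kept. The subsampling tool itself is left to you.

```shell
# Print the fraction of reads to keep to reach a target coverage.
# Hypothetical helper; assumes uniform random subsampling of reads.
# Usage: downsample_fraction CURRENT_COVERAGE TARGET_COVERAGE
downsample_fraction() {
    awk -v cur="$1" -v target="$2" 'BEGIN {
        if (cur <= 0) exit 1          # coverage must be positive
        f = target / cur
        if (f > 1) f = 1              # already at or below the target
        printf "%.3f\n", f
    }'
}
```

For example, "downsample_fraction 250 100" prints 0.400, meaning roughly 40% of the HiFi reads would be kept.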

Legal

See the README.licenses file and individual source code files for details.