Optimisations

@alexjironkin alexjironkin released this 28 Jun 10:55
· 60 commits to master since this release

We looked into the memory consumption of several processes and optimised them. vcf2fasta can now run locally on a large number of VCFs without running out of memory and crashing. The runtime and memory utilisation of filter_vcf have also been improved.

  • Memory optimisations: #12
  • Annotating variants is also optimised to use only the memory required.
  • Fixed an issue with N in the reference, where we have to insert it for mpileup to work: #13
  • samtools sort is supplied with the -m option if there is at least 1G of memory per CPU. When there is enough memory, sorting is done in memory instead of using temporary files.
  • The --with-mixtures option outputs IUPAC extended alphabet mixture codes for SNP positions that FAIL filters, based on a frequency threshold. E.g. if the threshold is set to 0.2 and the record is A:50,T:50 for REF and ALT, then both bases are used because each has a ratio of 0.5. However, if the record is A:2,T:98, then only T is used because A's ratio is 0.02. Records with multiple ALT bases are treated in the same fashion.
  • Progress reporting for vcf2fasta, based on an estimate of the total number of records.
  • Bug fixes
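
The threshold rule behind --with-mixtures can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the code shipped in this release: the function name `mixture_code`, the `depths` dictionary, and the choice to treat a ratio equal to the threshold as passing are all assumptions made for the example.

```python
# Illustrative sketch only: mixture_code and its inputs are hypothetical
# names, not part of the released tool.

# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def mixture_code(depths, threshold):
    """Return the IUPAC code for the bases whose ratio meets the threshold.

    depths maps each base (REF and every ALT) to its supporting depth.
    Assumption: a ratio exactly equal to the threshold counts as passing.
    """
    total = sum(depths.values())
    if total == 0:
        return "N"  # no evidence at this position
    kept = frozenset(
        base for base, depth in depths.items() if depth / total >= threshold
    )
    # If no base (or an unexpected symbol) survives, fall back to N.
    return IUPAC_CODES.get(kept, "N")


print(mixture_code({"A": 50, "T": 50}, 0.2))  # both bases kept: W
print(mixture_code({"A": 2, "T": 98}, 0.2))   # only T kept: T
```

The same function covers records with several ALT bases, since every base in `depths` is compared against the threshold independently.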