Optimisations
We have looked into the memory consumption of some of the processes and optimised them. vcf2fasta can now run locally on a large number of VCFs without running out of memory and crashing. The runtime and memory utilisation of filter_vcf have also been improved.
- Memory optimisations: #12
- Annotating variants is also optimised to use only the memory required.
- Fixed an issue with N in the reference, where we have to insert it for mpileup to work: #13
- samtools sort is supplied with the -m option if there is at least 1G of memory per CPU. When there is enough memory, the sort is done in memory instead of in temp files.
- The --with-mixtures option outputs IUPAC extended alphabet mixture codes for SNP positions that FAIL filters, based on a frequency threshold. E.g. if the threshold is set to 0.2 and the record is A:50,T:50 for REF and ALT, then both bases are used because the ratio of each is 0.5. However, if the record is A:2,T:98, then only T is used because A's ratio is 0.02. Multiple ALT bases are treated in the same fashion.
- Progress for vcf2fasta, based on a guesstimate of the total number of records
- Bug fixes
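
The --with-mixtures rule described above can be illustrated with a minimal Python sketch. This is not the tool's actual implementation: the function name mixture_code, the use of >= when comparing a base's ratio to the threshold, and the fallback to N when no base qualifies are all assumptions made for illustration. Only the IUPAC code table is standard.

```python
# IUPAC nucleotide codes, keyed by the set of bases each code represents.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def mixture_code(base_counts, threshold):
    """Return the IUPAC code for the bases whose read ratio meets the threshold.

    base_counts maps each base (REF and every ALT) to its read count.
    Hypothetical helper: the >= comparison and the "N" fallback are assumptions.
    """
    total = sum(base_counts.values())
    if total == 0:
        return "N"  # no reads at this position (assumed fallback)
    # Keep every base whose share of the reads reaches the threshold.
    kept = frozenset(
        base for base, count in base_counts.items()
        if count / total >= threshold
    )
    # An empty or unrecognised set also falls back to N (assumed).
    return IUPAC_CODES.get(kept, "N")
```

With a threshold of 0.2, mixture_code({"A": 50, "T": 50}, 0.2) returns "W" (A or T), while mixture_code({"A": 2, "T": 98}, 0.2) returns "T", matching the two cases in the notes above.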