Releases: hartwigmedical/hmftools
purple v4.0.1
Bugs:
- Block LOW_TUMOR_VCN filtered variants from being reportable
purple v3.9.4
Bugs:
- Block LOW_TUMOR_VCN filtered variants from being reportable
sigs v1.2
Technical:
- added version info
- compatibility with Purple v4.0 (new germline status types)
sage v3.4
Functional:
- sync overlapping fragments on by default
- sync overlapping fragments logic: favors longer INDELs on CIGAR mismatch, forms a consensus no matter how many bases mismatch, and ignores the 3’ side of the overlap where insert size < read length
- VCF writes AMQ (average map qual), ANM (average events per read) and MED (max edge distance)
- soft-filter on average map qual if ref and alt reads differ by > 15
- average base qual filter lowered from 28 to 25
- new modMAPQ logic supports calling low mapq regions (default behaviour unchanged)
- added read strand bias filter
- added maxEdgeDistance filter
- AF filter can pass with lower AF if p-value condition met
- new deduplication logic for INDELs handles 1:M deduplication
Targeted panel mode, enabled with config 'high_depth_mode':
- ignore all reads with AD<30
- ignore all soft clip support
- added jitter AF filter
- BQR ignores overlapping bases
Bugs:
- fragment strand bias calculated incorrectly
HTML Visualisations added for variants:
- creates an HTML file for a subset of variants based on config, showing read context and fragment support
- config 'vis_variants' to specify variants to produce visualisations, format: 'chromosome:position:ref:alt' separated by ';'. Will only run sage for +/- 200 base region around specified variants.
- config 'vis_pass_only' to generate files for passing variants, must be called with 'specific_regions'
- config 'vis_output_dir' default 'vis' if not specified
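To illustrate the 'vis_variants' format, here is a minimal Python sketch. Only the 'chromosome:position:ref:alt' layout, the ';' separator and the +/- 200 base window come from the notes above. The names `Variant`, `parse_vis_variants` and `region_around` are hypothetical and not part of Sage, and clamping the region start to 1 is an assumption.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    chromosome: str
    position: int
    ref: str
    alt: str


def parse_vis_variants(value: str) -> List[Variant]:
    """Split a 'vis_variants' style string into variants (hypothetical helper)."""
    variants = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing ';' (assumption, not documented behaviour)
        parts = item.split(":")
        if len(parts) != 4:
            raise ValueError(f"expected chromosome:position:ref:alt, got {item!r}")
        chromosome, position, ref, alt = parts
        variants.append(Variant(chromosome, int(position), ref, alt))
    return variants


def region_around(variant: Variant, flank: int = 200) -> Tuple[str, int, int]:
    """Return the +/- flank window around a variant, clamped to start at base 1."""
    start = max(1, variant.position - flank)
    return variant.chromosome, start, variant.position + flank
```

A string such as "1:1000:A:T;chrX:150:G:GA" would yield two variants, each with its own 400-base window (less where the window is clamped at the chromosome start).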
Config:
- sync_fragments removed since now on by default. Use 'no_fragment_sync' to disable.
- output -> output_vcf
purple v4.0
Functional:
- don’t fit short arms on 13,14,15,21,22
- Mask IG/TCR regions in fit
- Mask regions <2MB from centromere in fit
- Where ambiguous (low purity), BAF is now fit to minimise major allele CN
- chromosome X CN amplifications are called at 1.5x ploidy for males
- don’t smooth large germline deletions in diploid normalisation logic
- fit in 0.5% intervals for purity <20% and lower min to 7%
- write list of reportable transcripts to somatic VCF when alt-transcripts (eg CDKN2A) exist for a driver gene
- add LOH percent to QC output file
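The finer purity grid can be pictured with a short Python sketch. The 0.5% step below 20% purity and the 7% minimum are taken from the list above. The 1% step at or above 20%, the 100% maximum and the function name `purity_grid` are assumptions made for illustration, not a description of Purple's implementation.

```python
def purity_grid(min_permille=70, max_permille=1000,
                fine_step=5, coarse_step=10, fine_limit=200):
    """Candidate purities as fractions.

    Values are stepped in integer per-mille so repeated additions
    do not accumulate floating-point error.
    """
    grid = []
    p = min_permille
    while p <= max_permille:
        grid.append(p / 1000)
        # 0.5% steps below 20% purity; coarser steps (assumed 1%) from 20% upward
        p += fine_step if p < fine_limit else coarse_step
    return grid
```

Under these assumptions the grid starts at 0.07, advances in 0.005 steps up to 0.20, then in 0.01 steps up to 1.0.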
Technical:
- allow extra sample IDs in VCF and any order
- throw exception and exit on charting error
Bugs:
- ploidy was not calculated accurately in somatic mode
Visualisations:
- only show diploid regions in input.png
- tumor is always in blue in input.png
Panel:
- add deviation penalty adjustment for GC ratio (config: gc_ratio_exponent, deviation_penalty_gc_min_adjust)
- set targeted panel default values:
-ploidy_penalty_standard_deviation 0.10
-ploidy_penalty_min 0.20
-ploidy_penalty_sub_one_major_allele_multiplier 3.00
-deviation_penalty_gc_min_adjust 0.25
-gc_ratio_exponent 3.0
-min_diploid_tumor_ratio_count 3
-min_diploid_tumor_ratio_count_centromere 3
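For anyone who wants to reproduce or override these defaults from a wrapper script, a minimal Python sketch follows. The parameter names and values are copied from the list above. The helper `with_panel_defaults` is hypothetical, and letting user-supplied values take precedence over the defaults is an assumption.

```python
# Targeted panel defaults as listed in the release notes
PANEL_DEFAULTS = {
    "ploidy_penalty_standard_deviation": "0.10",
    "ploidy_penalty_min": "0.20",
    "ploidy_penalty_sub_one_major_allele_multiplier": "3.00",
    "deviation_penalty_gc_min_adjust": "0.25",
    "gc_ratio_exponent": "3.0",
    "min_diploid_tumor_ratio_count": "3",
    "min_diploid_tumor_ratio_count_centromere": "3",
}


def with_panel_defaults(user_args):
    """Merge user overrides onto the panel defaults and flatten to '-name value' pairs."""
    merged = {**PANEL_DEFAULTS, **user_args}  # user values win (assumption)
    args = []
    for name, value in merged.items():
        args.extend([f"-{name}", str(value)])
    return args
```

Calling `with_panel_defaults({"gc_ratio_exponent": "2.0"})` keeps the other six defaults and replaces only the GC ratio exponent.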
mark-dups v1.1
Functional:
- use unmap regions file to unmap problematic reads
- set NM attribute on consensus reads
Technical:
- various performance improvements for BAM writing
Config:
- unmap_regions - a TSV of regions to unmap, see HMF pipeline resources (v5.34) for current files
- bam_file -> input_bam
- multi_bam, samtools and sambamba paths for post-run sort, merge and index
lilac v1.6
Technical:
- tumor-only mode writes fragments into tumor coverage columns instead of ref
Bugs:
- protect against deletion in read near end of coding region
Config:
- in tumor-only mode specify the tumor BAM file with -tumor_bam instead of -reference_bam
- removed 'write_all_files'; instead use 'write_types', selecting 1 or more from SUMMARY, FRAGMENTS, READS, REF_COUNTS
gripss v2.4
Technical:
- handle Gridss VCF with sample IDs in reversed order
Panel:
- added panel soft filters: qual-per-AD and modified AF
- set panel config by default if in target regions mode
-hard_min_tumor_qual 200
-min_qual_break_point 1000
-min_qual_break_end 1000
-filter_sgls
-qual_per_ad 30
-modified_af 0.03
-modified_af_hotspot 0.005
gene-utils v1.1
Functional:
- expand Sage coverage to all driver genes
- added known fusion ref file generation routine
- added BED file validation and fix routine
- added general purpose liftover and probe creation routines
- coding region files can now be built from a gene name input file (config: gene_id_file)
Technical:
- improved region overlap check
Bugs:
- fixed rare phasing issues for proteome writer
cuppa v2.0
Major overhaul of CUPPA:
- Core classifier architecture of CUPPA is now layers of logistic regressions, implemented in Python primarily using the scikit-learn library. In CUPPA v1, this was a series of formulas.
- Improved visualization
- New Java routine (CuppaDataPrep) to extract features from VCFs and TSVs that are passed to the Python component