Releases: exomiser/Exomiser

Discovering the ID

01 Oct 15:21

This point release is compatible with the 1902, 2003 and 2007 data releases. We recommend you check for the latest data update at https://data.monarchinitiative.org/exomiser/latest/ to keep Exomiser functioning optimally with the latest data.

New features:

  • The JSON output now shows the id of the variantEvaluation taken from the VCF file.

New APIs:

  • Added VariantEvaluation.getId() and VariantEvaluation.Builder.id() methods to store VCF id field contents.
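
For context, the VCF id is the third tab-separated column (ID) of a record. The VCF specification uses '.' for a missing value and ';' to separate multiple identifiers. The following is a minimal Python sketch of reading that column, not Exomiser code: the function name vcf_ids and the example record are made up for illustration, and it makes no claim about how Exomiser stores or splits the value.

```python
def vcf_ids(vcf_line: str) -> list[str]:
    """Return the identifiers found in the ID column of one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a VCF data line needs at least CHROM, POS and ID columns")
    raw_id = fields[2]
    # '.' marks a missing ID; several IDs are separated by ';'
    return [] if raw_id == "." else raw_id.split(";")


# Hypothetical record, for illustration only
record = "10\t123256215\texample_id_1;example_id_2\tT\tG\t100\tPASS\t."
print(vcf_ids(record))
```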

Unifying the disease types

22 May 11:05

This is a point release to address this issue:

#337 (comment)

There are no other changes.

Up to eleven and one more - new pathogenicity scores and a variant whitelist

13 Mar 15:50

CLI changes

This release contains significant diagnostic performance improvements due to the inclusion of a high-quality ClinVar whitelist and 'second generation' pathogenicity scores.

  • Added new PathogenicitySource sources - M_CAP, MPC, MVP, PRIMATE_AI. Be aware that these may not be free for commercial use. Check the licensing before use!
  • Added new variant whitelist feature which enables flagging of variants on a whitelist and bypassing of FrequencyFilter and VariantEffectFilter. By default this will use ClinVar variants listed as Pathogenic or Likely_pathogenic and with a review status of criteria provided, single submitter or better. See https://www.ncbi.nlm.nih.gov/clinvar/docs/review_status/ for an explanation of the ClinVar review status.
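
The default whitelist rule described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the Exomiser implementation: the function names are hypothetical, the set of review statuses counted as "single submitter or better" is our reading of the ClinVar documentation, and combined significance values are not handled.

```python
# Review statuses treated here as "criteria provided, single submitter or better".
# Assumption: this set is our interpretation of the ClinVar review status page.
ACCEPTED_REVIEW_STATUSES = {
    "criteria provided, single submitter",
    "criteria provided, multiple submitters, no conflicts",
    "reviewed by expert panel",
    "practice guideline",
}
ACCEPTED_SIGNIFICANCE = {"Pathogenic", "Likely_pathogenic"}


def is_whitelist_candidate(significance: str, review_status: str) -> bool:
    """True if a ClinVar record meets the default whitelist criteria (as assumed above)."""
    return (significance in ACCEPTED_SIGNIFICANCE
            and review_status in ACCEPTED_REVIEW_STATUSES)


def passes_filters(whitelisted: bool, passes_frequency: bool, passes_effect: bool) -> bool:
    """Whitelisted variants bypass the frequency and variant-effect filters."""
    return whitelisted or (passes_frequency and passes_effect)


print(is_whitelist_candidate("Pathogenic", "reviewed by expert panel"))
print(passes_filters(whitelisted=True, passes_frequency=False, passes_effect=False))
```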

n.b. This release is incompatible with data release 1811 and below.

Core API

API breaking changes:

  • Removed FREQUENCY_SOURCE_MAP from FrequencySource
  • Changed Frequency, RsId and PathogenicityScore static valueOf() constructor to of()
  • Removed deprecated IntervalFilter.getGeneticInterval()
  • Changed visibility of PhenodigmMatchRawScore from public to package private and made immutable
  • Changed visibility of CrossSpeciesPhenotypeMatcher from public to package private and added static of() constructor
  • Replaced redundant Default*DaoMvStoreProto classes with new AllelePropertiesDaoMvStore
  • Added OntologyService as constructor argument to AnalysisFactory, AnalysisParser and AnalysisBuilder
  • Replaced BasePathogenicityScore.compareTo() method with default PathogenicityScore.compareTo()
  • GeneticInterval no longer accepts ReferenceDictionary as a constructor argument

New APIs:

  • Added CADD and REMM to data-genome AlleleProperty
  • Moved JannovarDataSourceLoader from autoconfigure to core module
  • Added AllelePosition.isSymbolic() method
  • Added Variant.isCodingVariant() method
  • Added AnalysisBuilder.addIntervalFilter(Collection<ChromosomalRegion> chromosomalRegions) method
  • Added new non-public FilterStats class for more accurate filtering statistics
  • Added new AllelePropertiesDao interface
  • Added new AllelePropertiesDaoMvStore implementation
  • Added new AllelePropertiesDaoAdapter to fix issue of Spring cache proxy not being able to intercept internal calls
  • Added new HpoIdChecker class to return current HPO id/terms for an input id/term
  • Added new HumanPhenotypeOntologyDao.getIdToPhenotypeTerms() method
  • Added new OntologyService.getCurrentHpoIds() method
  • Added new SampleGenotype.isEmpty() method
  • Added new experimental VcfCodecs class for de/serialising VCF lines
  • Added new JannovarDataProtoSerialiser.loadProto() method for loading intermediate JannovarProto.JannovarData
  • Added new VariantWhiteList and InMemoryVariantWhiteList implementation
  • Added new VariantEvaluation.isWhiteListed() method and relevant builder methods
  • Added new JannovarDataFactory for a simple programmatic API to build JannovarData objects
  • Added new TranscriptSource enum
  • Added new PathogenicityScore.of() static factory constructor
  • Added new PathogenicityScore.getRawScore() method
  • Added default PathogenicityScore.compareTo() method
  • Added new static PathogenicityScore.compare() method
  • Added new ScaledPathogenicityScore class
  • Added new MpcScore class
  • Added new Contig class for converting contig names to integer-based id
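
To illustrate what converting contig names to an integer-based id can look like, here is a small Python sketch. It is not the Contig class itself: the function name and the numbering (autosomes 1-22, X=23, Y=24, mitochondrial=25, anything else 0) are assumptions made for this example.

```python
def contig_id(name: str) -> int:
    """Map a contig name such as 'chr1', '1', 'X' or 'MT' to an integer id.

    Assumed numbering: autosomes 1-22, X=23, Y=24, mitochondrial=25, unknown=0.
    """
    stripped = name[3:] if name.lower().startswith("chr") else name
    key = stripped.upper()
    special = {"X": 23, "Y": 24, "M": 25, "MT": 25}
    if key in special:
        return special[key]
    if key.isdecimal() and 1 <= int(key) <= 22:
        return int(key)
    # Unplaced scaffolds and unrecognised names
    return 0


for contig in ("chr1", "22", "chrX", "MT", "GL000192.1"):
    print(contig, contig_id(contig))
```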

Other changes:

  • Updated Spring Boot to version 2.1.3
  • Updated Jannovar to version 0.28
  • Updated HTSJDK to version 2.18.2
  • Refactored FrequencyData to use array-based backing for 5-10% memory usage improvement and lower GC especially when nearing max memory
  • Refactored AnalysisParser to utilise AnalysisBuilder directly reducing code duplication
  • Refactored AnalysisRunner classes to utilise new FilterStats class
  • Refactored QueryPhenotypeMatch to store and return input queryPhenotypeMatches argument
  • Refactored VariantDataServiceImpl to use new AllelePropertiesDao
  • Refactored VariantDataServiceImpl for better readability and performance
  • Added check for obsolete HPO id input in AnalysisBuilder.hpoIds()
  • Re-enabled PhenixPrioritiser in AnalysisParser
  • Refactored VariantEvaluation.getSampleGenotypeString() implementation to use SampleGenotype instead of VariantContext
  • Refactored VariantEffectCounter internals with VariantEvaluation calls in place of VariantContext
  • Enabled flagging of variants on a whitelist and bypassing of FrequencyFilter and VariantEffectFilter
  • Changed DefaultDiseaseDao to only return diseases marked as having known disease-gene association or copy-number/structural causes
  • Added range check to BasePathogenicityScore constructor
  • Updated CaddScore and SiftScore to extend ScaledPathogenicityScore
  • Updated CaddDao to use CADD phred scaled score directly
  • Replaced production use of ReferenceDictionary from HG19RefDictBuilder with Contig
  • Added new PathogenicitySource sources - M_CAP, MPC, MVP, PRIMATE_AI. Be aware that these may not be free for commercial use.
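
As background to the scaled-score and range-check items above: a phred-scaled value relates to a fraction p by phred = -10 * log10(p), so one way to map it onto a 0-1 scale is 1 - 10^(-phred/10). The Python sketch below shows that conversion with a simple range check. Both function names are hypothetical, and whether CaddScore uses exactly this formula, or BasePathogenicityScore exactly this range, is an assumption rather than something these notes state.

```python
def scaled_from_phred(phred: float) -> float:
    """Map a phred-scaled score onto 0-1 using 1 - 10^(-phred / 10)."""
    if phred < 0:
        raise ValueError("phred-scaled scores cannot be negative")
    return 1.0 - 10.0 ** (-phred / 10.0)


def check_score_range(score: float) -> float:
    """Reject scores outside the assumed valid range of 0 to 1 inclusive."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside the range [0, 1]")
    return score


for phred in (0, 10, 20, 30):
    print(phred, round(check_score_range(scaled_from_phred(phred)), 4))
```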

This one goes up to eleven... Samples, Pedigrees and no more SPARSE

21 Sep 13:24

CLI changes

  • Removed analysisMode: SPARSE option - this will default to PASS_ONLY
  • Removed phenixPrioritiser: {} option - we recommend using hiPhivePrioritiser: {runParams: 'human'} for human-only model comparisons
  • Changed outputPassVariantsOnly to outputContributingVariantsOnly in outputOptions. Enabling this will only report the variants marked as CONTRIBUTING_VARIANT, i.e. those variants which contribute to the EXOMISER_GENE_VARIANT_SCORE and EXOMISER_GENE_COMBINED_SCORE scores. This will default to false.
    outputOptions:
         outputContributingVariantsOnly: false

Core API

API breaking changes:

  • Removed unused VariantSerialiser
  • Moved ChromosomalRegionIndex from analysis.util package to model
  • Changed HiPhiveOptions.DEFAULT to HiPhiveOptions.defaults() to match style with the rest of the framework
  • Deleted redundant MvStoreUtil.generateAlleleKey() method in favour of AlleleProtoAdaptor.toAlleleKey()
  • Split VariantEffectPathogenicityScore.SPLICING_SCORE into SPLICE_DONOR_ACCEPTOR_SCORE and SPLICE_REGION_SCORE
  • Removed unused VariantEvaluation.getNumberOfIndividuals() and VariantEvaluation.Builder.numIndividuals()
  • InheritanceModeAnnotator now requires an Exomiser Pedigree as input and no longer takes a Jannovar de.charite.compbio.jannovar.pedigree.Pedigree
  • Changed SampleIdentifier default identifier from 'Sample' to 'sample' to fit existing internal implementation details
  • Replaced Analysis.AnalysisBuilder.pedPath(pedPath) and Analysis.getPedPath() with Analysis.AnalysisBuilder.pedigree(pedigree) and Analysis.getPedigree()
  • Replaced AnalysisBuilder.pedPath(pedPath) with AnalysisBuilder.pedigree(pedigree)
  • Removed obsolete PedigreeFactory - this functionality has been split amongst the new Pedigree API classes
  • Removed AnalysisMode.SPARSE - this was confusing and unused. Unless you need to debug a script, we advise using AnalysisMode.PASS_ONLY
  • Replaced OutputSettings interface with the concrete implementation
  • Replaced OutputSettings.outputPassVariantsOnly() with OutputSettings.outputContributingVariantsOnly(). This still has the default value of false

New APIs:

  • Added new jannovar package and faster data serialisation format handled by the JannovarDataProtoSerialiser and JannovarProtoConverter.
  • Added new native Pedigree class for representing pedigrees.
  • Added new PedFiles class for reading PED files into a Pedigree object.
  • Added new PedigreeSampleValidator to check the pedigree, proband and VCF samples are consistent with each other.
  • Added SampleIdentifier.defaultSample() for use with unspecified single-sample VCF files.
  • Added InheritanceModeOptions.getMaxFreq() method for retrieving the maximum frequency of all the defined inheritance modes.
  • Added new no-args AnalysisBuilder.addFrequencyFilter() which uses maximum value from InheritanceModeOptions
  • Added Pedigree support to AnalysisBuilder
  • Added new VariantEvaluation.getSampleGenotypes() method to map sample names to genotype for that allele
  • Added new utility constructors to SampleGenotype e.g. SampleGenotype.het() , SampleGenotype.homRef()
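
The idea behind InheritanceModeOptions.getMaxFreq() and the no-args addFrequencyFilter() can be sketched in a few lines of Python: take the largest cut-off over all defined inheritance modes and use it as the single frequency-filter threshold. The dictionary and function name below are illustrative only, and raising an error on empty input is a choice made for this sketch.

```python
def max_freq(inheritance_modes: dict[str, float]) -> float:
    """Return the largest maximum-frequency cut-off over all defined modes."""
    if not inheritance_modes:
        # Behaviour for an empty set of modes is an assumption of this sketch
        raise ValueError("no inheritance modes defined")
    return max(inheritance_modes.values())


# Example cut-offs (in percent), chosen for illustration
modes = {
    "AUTOSOMAL_DOMINANT": 0.1,
    "AUTOSOMAL_RECESSIVE_COMP_HET": 2.0,
    "MITOCHONDRIAL": 0.2,
}
print(max_freq(modes))
```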

Other changes:

  • Added support for REMM and CADD in AlleleProtoAdaptor
  • Added check to remove alleles not called as ALT in proband
  • SampleGenotypes now calculated for all variants in the VariantFactory
  • Added support for frequencyFilter: {} to AnalysisParser
  • Updated HTML output to display current SO terms for variant types/consequence
  • Various code clean-up changes
  • Changed dependency management to use spring-boot-dependencies rather than deprecated Spring Platform
  • Updated Spring Boot to version 2.0.4

JSON out, ClinVar data and multi-interval filters

18 May 10:09

CLI changes:

  • Added support for filtering multiple intervals in the intervalFilter
    # single interval
    intervalFilter: {interval: 'chr10:123256200-123256300'},
    # or for multiple intervals:
    intervalFilter: {intervals: ['chr10:123256200-123256300', 'chr10:123256290-123256350']},
    # or using a BED file - NOTE this should be 0-based, Exomiser otherwise uses 1-based coordinates in line with VCF
    intervalFilter: {bed: /full/path/to/bed_file.bed}
  • Added support for ClinVar annotations - available in the 1805 variant data release. These will appear automatically and are reported for information only.
  • Added JSON output format
    outputFormats: [HTML, JSON, TSV_GENE, TSV_VARIANT, VCF]
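
Regarding the coordinate note on the BED option above: BED intervals are 0-based with an exclusive end, whereas the interval strings use 1-based inclusive coordinates in line with VCF. Converting one to the other means adding 1 to the start and leaving the end unchanged. A minimal Python sketch follows; the function names are hypothetical and the example coordinates are chosen to match the intervals shown above.

```python
def bed_line_to_interval(bed_line: str) -> str:
    """Convert one BED line (0-based, end-exclusive) to a 1-based inclusive 'chr:start-end' string."""
    chrom, start, end = bed_line.split()[:3]
    # Only the start shifts: the exclusive 0-based end equals the inclusive 1-based end
    return f"{chrom}:{int(start) + 1}-{int(end)}"


def bed_to_intervals(bed_text: str) -> list[str]:
    """Convert all data lines of a BED file, skipping blank, comment and header lines."""
    intervals = []
    for line in bed_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        intervals.append(bed_line_to_interval(stripped))
    return intervals


example_bed = "chr10\t123256199\t123256300\nchr10\t123256289\t123256350\n"
print(bed_to_intervals(example_bed))
```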

Core API changes:

  • Added new simple BedFiles class for reading in ChromosomalRegion from an external file.
  • Added support for filtering multiple intervals in the IntervalFilter
  • Added support for parsing multiple intervals in the AnalysisParser
  • Added new OutputOption.JSON
  • Added new JsonResultsWriter - JSON results format should be considered as being in a 'beta' state and may or may not change slightly in the future.
  • Added support for ClinVar annotations
  • Added ClinVar annotations to HTML and JSON output options
  • TSV_GENE and TSV_VARIANT output formats have been frozen as adding the new datasources will break the format. Use the JSON output for machines or HTML for humans.
  • Updated Spring platform to Brussels-SR9. This will be the final Exomiser release on the Brussels release train.

10.0.1

20 Mar 15:24

Tiny maintenance release.

  • Updated HTSJDK library to fix TribbleException being thrown when trying to parse bgzipped VCF files

Multiple inheritance modes, smaller, faster, leaner, better

07 Mar 16:22

CLI changes:

  • Deprecated extended cli options as these were less capable than the analysis file. Options are now --analysis or --analysis-batch only. See the .yml files in the examples directory for recommended scripts.
  • Exomiser can now analyse samples against multiple inheritance modes in one run using the new inheritanceModes field. This also allows variants to be considered under a model with a maximum frequency (%) cut-off. See example .yml files for more details.
       inheritanceModes: {
            AUTOSOMAL_DOMINANT: 0.1,
            AUTOSOMAL_RECESSIVE_HOM_ALT: 0.1,
            AUTOSOMAL_RECESSIVE_COMP_HET: 2.0,
            X_DOMINANT: 0.1,
            X_RECESSIVE_HOM_ALT: 0.1,
            X_RECESSIVE_COMP_HET: 2.0,
            MITOCHONDRIAL: 0.2
       }
  • The old modeOfInheritance option will still work, although it will only run with default frequency cut-offs and may be removed in a later release, so please update your analyses.
  • The new 1802_phenotype data release will not work on older Exomiser versions as the PPI data is now shipped in a much more efficient storage format. This reduces the startup time to zero and reduces the memory footprint by approx 1 GB. We highly recommend you update older releases to the latest version in order to benefit from more recent phenotype data.
  • Default variant scores for FRAMESHIFT, NONSENSE, SPLICING, STOPLOSS and STARTLOSS have been increased from 0.95 to the maximum score of 1.0 to reflect clinical interpretation of these variant consequences.
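
To make the per-mode maximum frequency cut-off in the inheritanceModes field concrete, the Python sketch below checks which modes a variant of a given population frequency (%) would still be considered under. It looks at frequency only, whereas a real analysis also takes genotypes and the pedigree into account. The function name and the use of <= for the comparison are assumptions of this sketch.

```python
# Cut-offs (%) copied from the inheritanceModes example above
INHERITANCE_MODES = {
    "AUTOSOMAL_DOMINANT": 0.1,
    "AUTOSOMAL_RECESSIVE_HOM_ALT": 0.1,
    "AUTOSOMAL_RECESSIVE_COMP_HET": 2.0,
    "X_DOMINANT": 0.1,
    "X_RECESSIVE_HOM_ALT": 0.1,
    "X_RECESSIVE_COMP_HET": 2.0,
    "MITOCHONDRIAL": 0.2,
}


def modes_within_cutoff(variant_freq_percent: float,
                        mode_cutoffs: dict[str, float]) -> list[str]:
    """List the modes whose maximum frequency cut-off the variant does not exceed."""
    return [mode for mode, cutoff in mode_cutoffs.items()
            if variant_freq_percent <= cutoff]


# A variant seen at 0.15% is too common for the 0.1% modes
print(modes_within_cutoff(0.15, INHERITANCE_MODES))
```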

Core changes:

API breaking changes:

  • Removed previously deprecated Settings and SettingsParser classes - these were only used by the extended cli options, which have also been removed.
  • Removed unused PrioritiserSettings and PrioritiserSettingsImpl classes - these were only used by the SettingsParser
  • Removed unused PrioritiserFactory.makePrioritiser(PrioritiserSettings settings) method - this was only used by the SettingsParser
  • Removed unused PrioritiserFactory.getHpoIdsForDiseaseId(String diseaseId) method. This duplicated/called PriorityService.getHpoIdsForDiseaseId(String diseaseId)
  • Renamed VariantTypePathogenicityScore to VariantEffectPathogenicityScore
  • Method names of Inheritable have changed from InheritanceModes to CompatibleInheritanceModes to better describe their function.
  • Replaced SampleNameChecker with new SampleIdentifierUtil
  • Changed signature of InheritanceModeAnalyser to require an InheritanceModeAnnotator. This is now using Exomiser and Jannovar-native calls to analyse inheritance modes instead of the Jannovar mendel-bridge.
  • Changed GeneScorer.scoreGene() signature from Consumer<Gene> to Function<Gene, List<GeneScore>> to allow scoring of multiple inheritance modes in one run.
  • Changed Analysis and AnalysisBuilder method modeOfInheritance to inheritanceModes(InheritanceModeOptions inheritanceModeOptions)
  • Removed unused methods on AnalysisResults
  • Renamed OMIMPriority to OmimPriority
  • Renamed OMIMPriorityResult to OmimPriorityResult
  • Changed OmimPriorityResult constructor to require Map<ModeOfInheritance, Double> scoresByMode and added getScoresByMode() and getScoreForMode(modeOfInheritance) methods
  • Changed DataMatrix from a concrete class to an interface
  • Changed ResultsWriter signatures to require a ModeOfInheritance to write results out for.
  • Changed ResultsWriterUtils to require a specific ModeOfInheritance

New APIs:

  • Added new AlleleCall class to represent allele calls for alleles from the VCF file
  • Added new GeneScore class for holding results from the GeneScorer
  • Added new SampleIdentifier class
  • Added new SampleGenotype class to represent VCF GenotypeCalls for a sample on a particular allele.
  • GeneIdentifier now implements Comparable and has a static compare(geneIdentifier1, geneIdentifier2) method
  • Gene now contains GeneScore having been scored by a GeneScorer
  • VariantEvaluation now has methods to determine its compatibility and whether or not it contributes to the overall score under a particular ModeOfInheritance
  • Added new SampleIdentifierUtil to replace deleted SampleNameChecker
  • Added new InheritanceModeAnnotator and InheritanceModeOptions
  • Added new VariantContextSampleGenotypeConverter to create SampleGenotype from a VariantContext
  • Added new DataMatrixUtil, InMemoryDataMatrix, OffHeapDataMatrix, StubDataMatrix implementations
  • Added new methods on DataMatrixIO to facilitate loading new DataMatrix objects from disk.
  • Added new AnalysisResultsWriter to handle writing out results instead of having to manually specify writers and inheritance modes

Other changes:

  • Demoted most logging from info to debug
  • Removed Spring control of Thymeleaf from ThymeleafConfig and HtmlResultsWriter so this no longer interferes with web templates

Mitochondrial inheritance support

15 Jan 11:54

Updated the Jannovar library to 0.24 which now enables filtering for mitochondrial inheritance modes. Thanks to the whole Jannovar team for making this happen.

Multiple assemblies and many datasources

12 Dec 17:02

CLI changes:

  • Exomiser can now analyse hg19 or hg38 samples - see application.properties for setup details.
  • Analysis file has new genomeAssembly: field - see example .yml files. Will default to hg19 if not specified.
  • Genomic and phenotypic data are now separated to allow for more frequent and smaller updates - see README.md for details
  • Variant alleles are now stored in a new highly-compressed data format enabling much smaller on-disk footprint with minimal loss of read performance.
  • New variant frequency data-sets: TOPMed, UK10K, gnomAD - see example .yml files.
  • New caching mechanism - see application.properties for setup details.

Core changes:

  • Maven groupId changed from root org.monarchinitiative to more specific org.monarchinitiative.exomiser.
  • New AlleleProto protobuf class used to store allele data in the new MVstore.
  • Replaced DefaultPathogenicityDao and DefaultFrequencyDao implementations with MvStoreProto implementations.
  • Classes in the genome package are no longer under direct Spring control as the @Component and @Autowired annotations have been removed to enable user-defined genome assemblies on a per-analysis basis.
  • Genome package classes are now configured explicitly in the exomiser-spring-boot-autoconfigure module.
  • New GenomeAssembly enum
  • New GenomeAnalysisServiceProvider class
  • New GenomeAnalysisService interface - a facade for providing simplified access to the genome module.
  • New VcfFiles utility class for providing access to VCF files with the HTSJDK
  • New VariantAnnotator interface
  • New JannovarVariantAnnotator and JannovarAnnotationService classes
  • VariantFactoryImpl now takes a VariantAnnotator as a constructor argument.
  • VariantDataService getRegulatoryFeatures() and getTopologicalDomains() split out into new GenomeDataService
  • Deprecated Settings class - this will be removed in the next major version.
  • Updated classes in analysis package to enable analyses with user-defined genome assemblies.

Bugfix for intergenic variants in TAD null pointer

06 Sep 10:49

See #224 for details.

To update to this release, unzip the distribution and edit exomiser-cli-8.0.1/application.properties to point to the exomiser-cli-8.0.0 data directory, e.g. change

#root path where data is to be downloaded and worked on
#it is assumed that all the files required by exomiser listed in this properties file
#will be found in the data directory unless specifically overridden here.
exomiser.data-directory=data

to

exomiser.data-directory=/opt/exomiser-cli-8.0.0/data