ENAR2025

Ali Rahnavard edited this page Sep 5, 2024 · 1 revision

Statistical methods for biomarker discovery using metabolomics

The affordability of metabolomics profiling has enabled extensive surveys of metabolites in human health, other hosts, and the environment on an unprecedented scale. Consequently, this surge in data has driven the development of new statistical and computational approaches to analyze and integrate diverse metabolomics data. However, despite sharing many similarities with conventional omics, routine analysis methods from the literature cannot be directly applied to metabolomics studies to achieve complete mechanistic insight without risking false positive or false negative results. This challenge is amplified by the technical nature of metabolomics data, which are typically noisy, heterogeneous, and high-dimensional, often affected by platform-specific effects requiring specialized tools and methods for accurate analysis. From a practical standpoint, using generic downstream analysis software without understanding the inherent statistical properties of metabolomics data can lead to inconclusive and potentially misleading biological conclusions. Moreover, the abundance of available downstream analysis methods complicates the selection process for non-specialists and inexperienced researchers. Finally, identifying reproducible signals for clinical actionability necessitates well-powered experimental designs and meta-analysis across studies, both of which present significant challenges within current metabolomics analysis paradigms.

Our workshop will begin with a high-level introduction to computational multi-omics, focusing on the current state-of-the-art and addressing key challenges, particularly in downstream analysis methods for metabolomics data compared to other omics. Activities will include formulating biological hypotheses and exploring contemporary statistical methods to address them. The workshop will be project-focused and hands-on, encouraging participants to bring specific studies or projects for immediate application of the workshop content using real data. Drawing upon our extensive experience in both industry and academia, we aim to provide a diverse perspective on the topic. This includes insights from drug discovery and basic science, enabling attendees to gain a holistic understanding of multi-omics and clinical data integration through advanced tools applied to relevant examples and case studies.

The workshop will start with an overview of the statistical challenges inherent in analyzing high-dimensional data typical of multi-omics studies. Introductory lectures will cover:

  1. Preprocessing and normalization of metabolomics compared to other omics.
  2. Harmonizing separately acquired metabolite profiles from multiple studies.
  3. Challenges associated with precisely testing multivariable associations in population-scale meta-omics studies.
  4. Meta-analysis of omics datasets for high-sensitivity discovery.
  5. Integration with other data types such as human genetics and metagenomics.

Please list ALL pre-requisites (statistical/programming knowledge required) for your course below. If none, write NONE.

Workshop attendees will use tools for metabolomics meta-analysis, including multi-study data scaling, integration, and harmonization using the massSight tool. They will apply meta-analysis tools for pattern discovery in multi-omics with metabolomics data, working through the Tweedieverse tutorial and examples (Tweedieverse is a unified statistical framework for differential analysis of multi-omics), paired with the MMUPHin tool for meta-analysis. Attendees will also explore integrative machine learning to analyze and integrate various biological data types with metabolomic profiles using the IntegratedLearner tool. Finally, attendees will practice generating publication-quality figures and effectively visualizing the results.

Topic(s) to be covered?

# Learning outcomes for participants

Participants will:

  1. Be able to apply novel techniques (such as massSight) to combine metabolite profiles and perform meta-analysis of metabolomics data.
  2. Understand statistical properties of metabolomics data and challenges for multivariable association testing in population-scale meta-omics studies.
  3. Be able to perform a meta-analysis of metabolomics datasets using the Tweedieverse and MMUPHin tools on data from multiple studies combined with the massSight tool.
  4. Be able to integrate metabolomics with other omics data types using IntegratedLearner.
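
As a rough illustration of the meta-analysis idea behind these outcomes, the sketch below pools per-study effect estimates for a single metabolite using classical inverse-variance (fixed-effect) weighting. This is a generic textbook technique, not the API or internal method of massSight, Tweedieverse, or MMUPHin; the function name `fixed_effect_meta` and the input numbers are hypothetical and chosen only for illustration.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-study effect estimates with inverse-variance weights.

    Hypothetical helper for illustration only; it does not reproduce
    any specific workshop tool.
    Returns (pooled_estimate, pooled_standard_error, two_sided_p_value).
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its sampling variance,
    # so more precise studies contribute more to the pooled estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value under a normal approximation:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


if __name__ == "__main__":
    # Made-up effect sizes (e.g. log fold changes) from three studies.
    betas = [0.42, 0.30, 0.55]
    ses = [0.15, 0.20, 0.25]
    pooled, se, p = fixed_effect_meta(betas, ses)
    print(f"pooled effect = {pooled:.3f}, SE = {se:.3f}, p = {p:.4g}")
```

A fixed-effect model assumes every study estimates the same underlying effect. When studies differ in platform or population, a random-effects model is usually the safer choice, at the cost of estimating between-study variance.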