Collection of scripts for downstream analysis of methylation bedgraph files produced by the EpiDiverse WGBS pipeline
The EpiDiverse WGBS pipeline performs mapping and methylation calling of bisulfite sequencing datasets from non-model species. This repository contains scripts for downstream analysis of the WGBS pipeline output, including coverage filtering, merging samples, NA filtering and region-specific average methylation calculation. Average methylation can be calculated either for individual regions (e.g. individual genes) or for whole genomic features (e.g. the average methylation of all genes).
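As a rough illustration of the coverage filtering step, the sketch below keeps only positions with a minimum read depth. It is not the repository's script. The column layout (chrom, start, end, methylation %, methylated reads, unmethylated reads), the function name `filter_by_coverage` and the `min_cov` threshold are all assumptions made for this example.

```python
# Assumed bedGraph columns (not confirmed by this README):
# chrom, start, end, methylation %, methylated reads, unmethylated reads
def filter_by_coverage(lines, min_cov=5):
    """Keep positions whose total read depth is at least min_cov."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("track"):
            continue  # skip blank lines and bedGraph track headers
        fields = line.split()
        meth, unmeth = int(fields[4]), int(fields[5])
        if meth + unmeth >= min_cov:
            kept.append(line.rstrip("\n"))
    return kept


# Made-up input: one well-covered position and one covered by only 2 reads.
example = [
    "track type=bedGraph",
    "Chr1\t2234\t2235\t13.63\t3\t19",
    "Chr1\t2240\t2241\t50.00\t1\t1",
]
filtered = filter_by_coverage(example, min_cov=5)
print(filtered)
```

A suitable threshold depends on sequencing depth, so `min_cov=5` is only a placeholder.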
The repo is composed of two different workflows:
WGBS_simpleworkflow
Contains scripts for downstream analysis of the single-sample bedgraph files produced by the WGBS pipeline, including coverage filtering, merging samples into unionbed files, NA filtering and filtering of non-variable positions.
In addition, it is possible to calculate the simple average methylation of regions (the "mean methylation" of Schultz et al. 2012).
The workhorses here are unionbed files (positions as rows and samples as columns), which hold the fraction of methylated over total reads, expressed as a percentage, for each sample and position.
E.g.
chrom start end Sample1 Sample2 Sample3 ...
Chr1 2234 2235 13.63 14.29 90.90
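The sketch below shows how NA filtering, removal of non-variable positions and per-region mean methylation could operate on a unionbed table like the one above. It is an illustration, not the repository's code. The `NA` marker for missing values, the `max_na_fraction` threshold, all function names and every data row except the first are assumptions.

```python
import math


def parse_unionbed(lines):
    """Return (sample_names, rows); each row is (chrom, start, end, [values])."""
    samples = lines[0].split()[3:]
    rows = []
    for line in lines[1:]:
        f = line.split()
        # "NA" as the missing-value marker is an assumption.
        values = [math.nan if v == "NA" else float(v) for v in f[3:]]
        rows.append((f[0], int(f[1]), int(f[2]), values))
    return samples, rows


def filter_na(rows, max_na_fraction=0.5):
    """Drop positions where more than max_na_fraction of samples are NA."""
    kept = []
    for row in rows:
        n_na = sum(math.isnan(v) for v in row[3])
        if n_na / len(row[3]) <= max_na_fraction:
            kept.append(row)
    return kept


def filter_non_variable(rows):
    """Drop positions where all observed samples share the same value."""
    return [r for r in rows if len({v for v in r[3] if not math.isnan(v)}) > 1]


def mean_methylation(rows, chrom, start, end):
    """Per-sample mean of the site values inside a region, ignoring NA."""
    inside = [r[3] for r in rows if r[0] == chrom and r[1] >= start and r[2] <= end]
    means = []
    for column in zip(*inside):
        observed = [v for v in column if not math.isnan(v)]
        means.append(sum(observed) / len(observed) if observed else math.nan)
    return means


unionbed = [
    "chrom start end Sample1 Sample2 Sample3",
    "Chr1 2234 2235 13.63 14.29 90.90",
    "Chr1 2240 2241 NA NA 50.00",             # made up: too many NAs
    "Chr1 2250 2251 100.00 100.00 100.00",    # made up: non-variable
    "Chr1 2260 2261 20.00 NA 10.00",          # made up
]
samples, rows = parse_unionbed(unionbed)
rows = filter_non_variable(filter_na(rows, max_na_fraction=0.5))
region_means = mean_methylation(rows, "Chr1", 2200, 2300)
print(dict(zip(samples, region_means)))
```

Here the region mean is the plain average of the per-site values, so every position counts equally regardless of its read depth.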
WGBS_completeworkflow
Contains scripts for downstream analysis of the single-sample bedgraph files produced by the WGBS pipeline, including coverage filtering, merging samples into unioncount files and NA filtering.
In addition, it is possible to:
- calculate both the mean and the weighted methylation of regions according to Schultz et al. 2012, optionally correcting for non-conversion rates.
- classify the methylation status of genes using a binomial model adapted from Takuno and Gaut 2013 and Niederhuth et al. 2016.
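To give a feel for the binomial classification mentioned in the list above, here is a minimal sketch of a generic one-sided binomial test. It is not the repository's adapted model, and it omits details such as sequence context or multiple-testing correction. The function names, the background rate and the `alpha` threshold are assumptions chosen for illustration.

```python
from math import comb


def binomial_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def classify_gene(n_methylated, n_total, background_rate, alpha=0.05):
    """Call a gene 'methylated' if it has more methylated sites than
    expected by chance under the background rate (hypothetical rule)."""
    p_value = binomial_sf(n_methylated, n_total, background_rate)
    label = "methylated" if p_value < alpha else "unmethylated"
    return label, p_value


# Made-up genes with 20 sites each and an assumed background rate of 10%.
label_a, p_a = classify_gene(8, 20, background_rate=0.1)
label_b, p_b = classify_gene(2, 20, background_rate=0.1)
print(label_a, p_a)
print(label_b, p_b)
```

With these made-up numbers the first gene is called methylated and the second is not, because 2 methylated sites out of 20 is what a 10% background rate would produce by chance.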
The workhorses here are unioncount files (positions as rows and samples as columns), which hold the methylated/total read counts for each sample and position.
E.g.
chrom start end Sample1 Sample2 Sample3 ...
Chr1 2234 2235 3/22 4/28 10/11
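Keeping the raw counts is what makes weighted methylation possible. The sketch below contrasts the two region-level measures: mean methylation (the average of per-site fractions) and weighted methylation (total methylated reads divided by total reads). It is an illustration only: the function names and the `NA` marker are assumptions, non-conversion correction is not covered, and the three cells from the example row are reused as if they were three positions of a single sample.

```python
def parse_counts(cell):
    """Turn 'methylated/total' into (methylated, total); None if missing."""
    if cell == "NA":  # missing-value marker is an assumption
        return None
    meth, total = cell.split("/")
    return int(meth), int(total)


def mean_methylation(counts):
    """Average of the per-site methylated/total fractions."""
    fractions = [m / t for m, t in counts if t > 0]
    return sum(fractions) / len(fractions)


def weighted_methylation(counts):
    """Total methylated reads divided by total reads across all sites."""
    return sum(m for m, _ in counts) / sum(t for _, t in counts)


# Hypothetical region: three positions of one sample, plus one missing site.
cells = ["3/22", "4/28", "10/11", "NA"]
counts = [c for c in map(parse_counts, cells) if c is not None]

region_mean = mean_methylation(counts)
region_weighted = weighted_methylation(counts)
print(round(region_mean, 4), round(region_weighted, 4))
```

The weighted value is lower here because the highly methylated site has the lowest coverage (11 reads) and therefore contributes less to the pooled counts.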
PUBLICATIONS:
Genetic and environmental drivers of large-scale epigenetic variation in Thlaspi arvense