Common difficulties in characterisation of diploid genomes using k-mer spectra analysis
The quality of a genome model fit depends largely on the quality of the data, but also on the biological features of the genome. The most common problem is that the monoploid (1n) k-mer coverage converges on a “wrong genomic peak”. This usually happens when the 1n coverage peak is not distinct, which can be caused by extremely low heterozygosity of the genome (i.e. the 1n signal is very weak), by contamination of the data with other samples, or by very low coverage, where the 1n peak largely overlaps with the error peak.
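Whether the 1n peak is distinct can be checked roughly before fitting any model. The following is a minimal sketch, not part of GenomeScope: the helper names `local_peaks` and `peak_to_valley`, the `min_cov` cutoff, and the toy histogram are all assumptions made for illustration. Real spectra are noisier and would usually need smoothing first.

```python
import math


def local_peaks(hist, min_cov=5):
    """Return (coverage, count) pairs for local maxima at coverage >= min_cov.

    hist maps coverage -> number of distinct k-mers seen at that coverage.
    min_cov is a hypothetical cutoff that skips the bulk of the error peak.
    """
    peaks = []
    for c in sorted(cov for cov in hist if cov >= min_cov):
        left = hist.get(c - 1, 0)
        right = hist.get(c + 1, 0)
        if hist[c] > left and hist[c] >= right:
            peaks.append((c, hist[c]))
    return peaks


def peak_to_valley(hist, peak_cov):
    """Ratio of the peak height to the lowest count at any lower coverage."""
    valley = min(hist.get(c, 0) for c in range(1, peak_cov))
    return hist[peak_cov] / valley if valley else float("inf")


# Toy spectrum with made-up numbers: a steep error peak, a weak 1n peak
# around coverage 20 and a stronger 2n peak around coverage 40.
hist = {}
for c in range(1, 81):
    errors = 1e6 * math.exp(-c)
    het = 2000 * math.exp(-((c - 20) ** 2) / 32)
    hom = 10000 * math.exp(-((c - 40) ** 2) / 64)
    hist[c] = int(errors) + int(het) + int(hom)

peaks = local_peaks(hist)
print(peaks)
print(round(peak_to_valley(hist, peaks[0][0]), 1))
```

A peak-to-valley ratio close to 1 for the first peak would suggest that the 1n peak is merging with the error peak, which is the low-coverage situation described above.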
When the 1n coverage is not fitted correctly, none of the estimated values carry any biological information about the genome. It is therefore important to inspect the fits visually and make sure the estimates make sense in the context of what is otherwise known about the biology of the organism. For example, if we sequence a diploid selfing plant and the estimated heterozygosity is >5%, it is extremely likely that the true 1n coverage is ~½ of the estimated one, most likely because the model has placed the 1n peak on the true 2n peak. In GenomeScope we can add a flag “-n ” which adds a coverage prior and usually allows GenomeScope to converge to a biologically more relevant model.
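The sanity check from the selfing plant example can be written down as a small rule. This is only a sketch of the reasoning: the function name `suggest_coverage_prior` is hypothetical, and the 5% threshold is simply the value from the example, which would need adjusting to the biology of the organism in question.

```python
def suggest_coverage_prior(fitted_kcov, est_het, max_plausible_het=0.05):
    """Suggest a 1n coverage prior when the fitted heterozygosity is implausible.

    fitted_kcov: 1n k-mer coverage reported by the model fit.
    est_het: estimated heterozygosity as a fraction (0.05 means 5%).
    max_plausible_het: hypothetical upper bound taken from prior knowledge.
    Returns half of the fitted coverage, or None if the fit looks plausible.
    """
    if est_het > max_plausible_het:
        # The model has probably mistaken the 2n peak for the 1n peak.
        return fitted_kcov / 2
    return None


# Illustrative numbers only: a fit at coverage 40 with 6% heterozygosity
# in a selfer points to a true 1n coverage of about 20.
print(suggest_coverage_prior(40.0, 0.06))
print(suggest_coverage_prior(20.0, 0.002))
```

The suggested value is only a starting point for the coverage prior; the refitted model still needs to be inspected visually.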
In the next steps we will go through some real examples to better explain the importance of a good model fit.
👉 ⚒ Let's start by identifying a dataset with low sequencing coverage (depth) here.
👆 Go back to Table of Contents