An algorithm for calculating the coverage depth for each coding gene and the percentage of each gene covered at ≥ 10X depth.
The output file lists each gene identifier with its mean coverage and the proportion of the gene covered at ≥ 10X depth.
This guide assumes you have already produced a sorted.bam file (for example, using Samtools).
The algorithm was written for Plasmodium falciparum, but it can also be used for other Plasmodium species and other organisms (by supplying the corresponding fasta and gff files).
Before computing the depth of coverage at each position of the genome, convert your sorted.bam file into a sorted.bed file with bedtools bamtobed:
bedtools bamtobed -i file_sorted.bam > file_sorted.bed
To compute the depth of coverage at each position of the genome with bedtools genomecov, you must supply, along with the sorted.bed file, a text file listing each chromosome and its length. A list_chromosomes.txt file for Plasmodium falciparum (version 39 on PlasmoDB) is provided in the data directory.
bedtools genomecov -d -i file_sorted.bed -g list_chromosomes.txt > file_coverage.txt
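For organisms other than Plasmodium falciparum, a chromosome list has to be built by hand. The sketch below shows one possible way to derive it from the reference fasta, assuming the usual bedtools genome-file layout (one chromosome name and its length per line, separated by a tab). The function names are hypothetical and are not part of this repository, and the provided list_chromosomes.txt may have been generated differently. Chromosome names must match those used in your sorted.bed file.

```python
import io


def chromosome_lengths(fasta_handle):
    """Yield (name, length) for each sequence of a FASTA stream, in file order."""
    name, length = None, 0
    for line in fasta_handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            # Keep only the first whitespace-delimited token of the header line
            tokens = line[1:].split()
            name, length = (tokens[0] if tokens else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def write_genome_file(fasta_handle, out_handle):
    """Write 'name<TAB>length' lines, the layout expected by bedtools -g."""
    for name, length in chromosome_lengths(fasta_handle):
        out_handle.write(f"{name}\t{length}\n")


if __name__ == "__main__":
    # Toy FASTA used only for illustration
    toy = io.StringIO(">chr1 example\nACGT\nAC\n>chr2\nGG\n")
    out = io.StringIO()
    write_genome_file(toy, out)
    print(out.getvalue(), end="")
```

With a real genome, the two StringIO objects would be replaced by open("reference_genome.fasta") and open("list_chromosomes.txt", "w").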
2. Calculating mean coverage of each coding gene and percentage of coding gene covered at ≥ 10X depth
To calculate the mean coverage of each coding gene and the percentage of each coding gene covered at ≥ 10X depth, provide a species.gff file (containing the coordinates of each coding gene), a reference genome in fasta format (the same version as the gff file), and the file_coverage.txt file obtained previously with bedtools genomecov.
An example reference genome in fasta format and the corresponding gff file are provided in the data directory.
python3 Scan_gene_coverage.py -p file_coverage.txt -f reference_genome.fasta -g reference_coordinates.gff -o output.txt
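To make the two reported metrics concrete, here is a minimal sketch of how they could be computed from the per-position output of bedtools genomecov -d (chromosome, 1-based position, depth, tab-separated) and a gff file. It is an illustration under stated assumptions, not the actual implementation of Scan_gene_coverage.py: the function names are hypothetical, features of type "gene" are selected (the script may filter coding genes differently), the gene identifier is taken from the ID attribute, and the fasta file is not used here.

```python
from collections import defaultdict


def load_depth(lines):
    """Parse genomecov -d lines into {chrom: {position: depth}}."""
    depth = defaultdict(dict)
    for line in lines:
        if not line.strip():
            continue
        chrom, pos, d = line.rstrip("\n").split("\t")[:3]
        depth[chrom][int(pos)] = int(d)
    return depth


def gene_intervals(gff_lines, feature="gene"):
    """Yield (gene_id, chrom, start, end) for each matching gff feature."""
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        # Column 9 holds 'key=value' pairs separated by ';'
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        yield attrs.get("ID", "unknown"), cols[0], int(cols[3]), int(cols[4])


def gene_coverage(depth, genes, min_depth=10):
    """Yield (gene_id, mean depth, percent of positions at >= min_depth)."""
    for gene_id, chrom, start, end in genes:
        if end < start:
            continue  # skip malformed intervals
        per_chrom = depth.get(chrom, {})
        # gff coordinates are 1-based and inclusive, like genomecov -d positions
        values = [per_chrom.get(p, 0) for p in range(start, end + 1)]
        mean = sum(values) / len(values)
        pct = 100.0 * sum(v >= min_depth for v in values) / len(values)
        yield gene_id, mean, pct


if __name__ == "__main__":
    # Toy inputs used only for illustration
    cov = ["chr1\t1\t0", "chr1\t2\t5", "chr1\t3\t10",
           "chr1\t4\t20", "chr1\t5\t30", "chr1\t6\t10"]
    gff = ["##gff-version 3", "chr1\t.\tgene\t2\t5\t.\t+\t.\tID=g1;Name=demo"]
    for gene_id, mean, pct in gene_coverage(load_depth(cov), gene_intervals(gff)):
        print(f"{gene_id}\t{mean:.2f}\t{pct:.1f}")
```

Holding every position in a dictionary keeps the sketch short but uses a lot of memory on a full genome; streaming the coverage file chromosome by chromosome would scale better.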
If you use this program for your own work, please cite:
Coppée et al. 5WBF: A low-cost and straightforward whole blood filtration method suitable for whole-genome sequencing of Plasmodium falciparum clinical isolates. (2022) Malaria Journal. DOI: 10.1186/s12936-022-04073-1
https://malariajournal.biomedcentral.com/articles/10.1186/s12936-022-04073-1