The pipeline takes as input the sequencing reads of the target individuals, the reference genome of the target species, and the CADD scores and reference genome of a model species (i.e., chicken: chCADD scores (Groß, Bortoluzzi, et al., 2020) and the Galgal6 reference genome (Warren et al., 2017)).
- (Yellow) Extraction of UCEs from the reference genome using Phyluce.
- (Dark Blue) Mapping each individual's sequencing reads to the reference genome, with two parallel approaches: i) for 10x Chromium read data and ii) for Illumina read data.
- (Light Blue) Variant calling for SNPs within the UCEs.
- (Light Grey) Creation of a chain file for the liftover of annotations from the chicken genome.
- (Dark Grey) Conversion of chCADD scores to pink pigeon (subject species) annotation.
- (Green) Intersection of BED files and UCE sites to output per-site ppCADD (subject species) scores (Red).
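
The final intersection step can be illustrated with a minimal Python sketch. This is not the pipeline's own code, which may rely on a dedicated interval tool; it only shows the idea of keeping scored sites that fall inside UCE intervals. The function names (`index_uces`, `scores_in_uces`), the tuple layouts, and the example data are hypothetical. It assumes 0-based, half-open BED coordinates and non-overlapping UCE intervals on each chromosome.

```python
from bisect import bisect_right
from collections import defaultdict


def index_uces(uces):
    """Group (chrom, start, end, name) UCE intervals by chromosome, sorted by start."""
    by_chrom = defaultdict(list)
    for chrom, start, end, name in uces:
        by_chrom[chrom].append((start, end, name))
    for intervals in by_chrom.values():
        intervals.sort()
    return by_chrom


def scores_in_uces(scores, uces):
    """Keep (chrom, pos, score) records whose position lies inside a UCE.

    Assumes BED-style coordinates (0-based start, exclusive end) and that
    UCE intervals on the same chromosome do not overlap.
    """
    index = index_uces(uces)
    starts = {chrom: [iv[0] for iv in ivs] for chrom, ivs in index.items()}
    hits = []
    for chrom, pos, score in scores:
        if chrom not in index:
            continue
        # Last interval whose start is <= pos (if any).
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0:
            start, end, name = index[chrom][i]
            if start <= pos < end:
                hits.append((chrom, pos, name, score))
    return hits


# Made-up example data, for illustration only.
uces = [
    ("chr1", 100, 105, "uce-1"),
    ("chr1", 200, 210, "uce-2"),
    ("chr2", 50, 60, "uce-3"),
]
scores = [
    ("chr1", 99, 1.2),   # just before uce-1
    ("chr1", 100, 7.5),  # first base of uce-1
    ("chr1", 104, 3.1),  # last base of uce-1
    ("chr1", 105, 9.9),  # end coordinate is exclusive
    ("chr2", 55, 12.0),  # inside uce-3
    ("chr3", 10, 4.4),   # chromosome without UCEs
]
hits = scores_in_uces(scores, uces)
for hit in hits:
    print(*hit, sep="\t")
```

Only the three sites inside a UCE are kept. The site at position 105 is dropped because BED end coordinates are exclusive.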
Setting paths
When setting up the pipeline, you must enter the PATHS to your data and working directories.
To help with this, edit the file scripts/add_paths_to_snakefiles.sh (designed for a Slurm system).
Then execute scripts/add_paths_to_snakefiles.sh.
You should then also check, and edit where needed, the configuration .yaml files and the snakemake files to ensure the PATHS are correct.
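
Checking the PATHS by eye is easy to get wrong, so a small helper can list configured paths that do not exist. The sketch below is an illustration only and is not part of the pipeline: `find_missing_paths` is a hypothetical helper, and it assumes the configuration has already been loaded into a nested Python dictionary and that paths are stored as absolute path strings.

```python
import os


def find_missing_paths(node, key_path=""):
    """Return (key_path, value) pairs for absolute-path strings that do not exist.

    Walks nested dicts and lists; relative strings and non-string values
    (numbers, booleans, None) are ignored.
    """
    missing = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{key_path}.{key}" if key_path else str(key)
            missing += find_missing_paths(value, child)
    elif isinstance(node, (list, tuple)):
        for i, value in enumerate(node):
            missing += find_missing_paths(value, f"{key_path}[{i}]")
    elif isinstance(node, str) and os.path.isabs(node) and not os.path.exists(node):
        missing.append((key_path, node))
    return missing


if __name__ == "__main__":
    # Hypothetical configuration; the keys and paths are made up.
    example_config = {
        "workdir": os.getcwd(),
        "reference": {"fasta": os.path.join(os.getcwd(), "no_such_reference.fa")},
        "threads": 8,
    }
    for key, path in find_missing_paths(example_config):
        print(f"missing: {key} -> {path}")
```

If a YAML parser such as PyYAML is installed, the dictionary could come from `yaml.safe_load` applied to one of the configuration files.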
Publication:
The analysis accompanying the original pipeline release is published at DOI: https://doi.org/10.1111/1755-0998.13967