Miscellaneous scripts that can be useful for bioinformatics
Not all scripts were written by me; some were adapted from scripts I found. Credit is given within each script.
Not all scripts are ready to be used as is. Adapt them to your needs.
Miscellaneous one-liners for everyday use
Collect HISAT2 mapping statistics from multiple summary files located in the current directory and organize them into a table
Create a BUSCO summary plot for several BUSCO results, as shown here: https://busco.ezlab.org/busco_userguide.html#companion-scripts.
Create coverage table for mafCoverage results
Create summary table from qualimap results for several samples
Query the Ensembl Rapid Release homologue gene page to identify homologues of a given list of genes
Calculate several gene structure measures based on a GFF annotation file
Get assembly statistics of a given fasta file
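As an illustration of what such statistics usually include, here is a minimal Python sketch (not the script from this repository) that computes sequence count, total length, longest sequence and N50 from FASTA lines. The function names `read_fasta_lengths` and `assembly_stats` are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence found in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths: List[int]) -> Dict[str, int]:
    """Compute basic assembly statistics from a list of sequence lengths."""
    if not lengths:
        return {"n_sequences": 0, "total_length": 0, "longest": 0, "n50": 0}
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = 0
    # N50: length of the sequence at which the cumulative sum of the
    # length-sorted sequences first reaches half of the total length.
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "n_sequences": len(ordered),
        "total_length": total,
        "longest": ordered[0],
        "n50": n50,
    }
```

To run it on a file, pass an open file handle: `assembly_stats(read_fasta_lengths(open("assembly.fa")))`, where `assembly.fa` is a placeholder path.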
Create summary table of Sniffles results
Create summary table from svim and svim-asm results
Create AGP file from a fasta file
Collection of reusable R functions
Create mummerplot of the first X chromosomes/scaffolds
Create mummerplot from list of chromosomes/scaffolds
Generate manhattan plot from GWAS results
Create interactive PCA plot
Plot histogram of mapping stats: mean coverage, mapping rate and mean mapping quality. Meant to work with the output of create_qualimap_summary.sh
Plot histograms of the number of reference-homozygous, non-reference-homozygous and heterozygous sites, the ts/tv ratio and the number of indels from the bcftools stats output
Get qualimap summary table
Find duplicated sequences in a fasta file, keep only one copy and combine the names of the duplicated scaffolds.
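The idea can be sketched in a few lines of Python. This is an illustrative sketch, not the repository's implementation, and it makes three assumptions: duplicates are exact matches after upper-casing (reverse complements are not detected), merged names are joined with `|`, and the names `parse_fasta` and `merge_duplicates` are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs; the name is the first word of the header."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def merge_duplicates(
    records: Iterable[Tuple[str, str]], sep: str = "|"
) -> List[Tuple[str, str]]:
    """Keep one copy of each distinct sequence and join the names of its copies.

    Sequences are compared (and returned) in upper case. The separator is an
    assumption made for this example.
    """
    names_by_seq: Dict[str, List[str]] = {}
    for name, seq in records:
        names_by_seq.setdefault(seq.upper(), []).append(name)
    # dicts preserve insertion order, so output follows first occurrence
    return [(sep.join(names), seq) for seq, names in names_by_seq.items()]
```

Holding every sequence in a dictionary is simple but memory-hungry; for large assemblies, keying on a hash of the sequence instead of the sequence itself would be one alternative.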
Rename fasta sequences according to a two column file. First column is current name, second column is new name
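A minimal Python sketch of this kind of renaming is shown below. It is not the script itself: the function names are hypothetical, the mapping file is assumed to be whitespace-separated, and sequences missing from the mapping are assumed to be left unchanged.

```python
from typing import Dict, Iterable, Iterator


def load_name_map(lines: Iterable[str]) -> Dict[str, str]:
    """Read 'current_name new_name' pairs from a two-column file."""
    mapping: Dict[str, str] = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping


def rename_headers(fasta_lines: Iterable[str], mapping: Dict[str, str]) -> Iterator[str]:
    """Yield FASTA lines with header names replaced according to the mapping.

    Only the first word of each header is treated as the sequence name; any
    description after it is kept. Headers not in the mapping pass through
    untouched (an assumption of this sketch).
    """
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            old = parts[0] if parts else ""
            if old in mapping:
                rest = " " + parts[1] if len(parts) > 1 else ""
                yield ">" + mapping[old] + rest
                continue
        yield line
```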
Reverse complement the sequences in the fasta file whose names are in scaffs_to_convert.txt
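The core operation can be sketched in Python as follows. This is an illustrative sketch rather than the repository's code; `reverse_complement` and `flip_selected` are hypothetical names, and only A, C, G, T and N are complemented (other IUPAC codes are left as they are, which is a simplification).

```python
from typing import Iterable, List, Tuple

# Translation table for the five most common nucleotide symbols, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_selected(
    records: Iterable[Tuple[str, str]], names_to_flip: Iterable[str]
) -> List[Tuple[str, str]]:
    """Reverse complement only the records whose name is in names_to_flip."""
    wanted = set(names_to_flip)
    return [
        (name, reverse_complement(seq) if name in wanted else seq)
        for name, seq in records
    ]
```

Here `names_to_flip` would be the lines of scaffs_to_convert.txt with whitespace stripped.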
Read a RepeatMasker output file and write the TEs to a BED file
Run and plot GWAS with GEMMA: run GEMMA association tests with univariate linear mixed models
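For readers unfamiliar with the two formats, a conversion along these lines could look like the Python sketch below. It is not the script from this repository. It assumes the standard whitespace-separated RepeatMasker `.out` layout (query name, begin and end in columns 5 to 7, strand in column 9, repeat name and class/family in columns 10 and 11). The choice of which classes to skip as "not TEs" is an assumption of this example, as are the function name and the `name#class` label.

```python
from typing import Iterable, List, Tuple


def repeatmasker_to_bed(
    lines: Iterable[str],
    skip_classes: Tuple[str, ...] = ("Simple_repeat", "Low_complexity", "Satellite"),
) -> List[str]:
    """Convert RepeatMasker .out lines to BED6 lines.

    Header and blank lines are skipped because their first field is not an
    integer score. skip_classes is an illustrative filter, not a definition
    of what counts as a TE.
    """
    bed: List[str] = []
    for line in lines:
        fields = line.split()
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        chrom, begin, end = fields[4], int(fields[5]), int(fields[6])
        # RepeatMasker marks the reverse strand with "C"
        strand = "-" if fields[8] == "C" else "+"
        repeat_name, repeat_class = fields[9], fields[10]
        if repeat_class.split("/")[0] in skip_classes:
            continue
        # .out coordinates are 1-based inclusive; BED starts are 0-based
        bed.append(
            "\t".join(
                [chrom, str(begin - 1), str(end),
                 f"{repeat_name}#{repeat_class}", "0", strand]
            )
        )
    return bed
```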
run_scan.sh and run_scan.
Convert data from the input file to an object of class haplohh.
Compute iHH, iES and inES over a whole chromosome
run_ihs.sh and run_ihs.R
Compute iHS (standardized ratio of iHH values of two alleles). Create manhattan plot with results
Compute XP-EHH (standardized ratio of iES of two populations). Create manhattan plot with results
Plot SVs larger than minsize from an SV VCF summary table
Visual Studio Code snippet to create snakemake rule
Get SV summary table from a VCF with Sniffles results
Print a summary of the structural variants (from SVIM) found in a given VCF to stdout
Convert a VCF file to EIGENSTRAT format. Removes multiallelic and indel sites