There is no official publication for MagicLamp. If it was useful for your work, please cite as follows:
Garber, AI., Ramirez, GA., Merino, N., Pavia MJ., McAllister, SM. (2020) MagicLamp: toolkit for annotation of genomic data using discrete and curated HMM sets. 2023: MagicLamp, GitHub repository: https://github.com/Arkadiy-Garber/MagicLamp.
Special thanks to AstrobioMike and his bit software package for enabling easy access to NCBI's RefSeq and GenBank assemblies.
git clone https://github.com/Arkadiy-Garber/MagicLamp.git
cd MagicLamp
conda create -n magiclamp -c bioconda -c conda-forge -c defaults -c astrobiomike hmmer bit --yes
conda activate magiclamp
YfGenie.py --hmm -d HMMs_dir -a GCF_023585845.1 -o GCF_023585845.1 -t 16
- In the above command, GCF_023585845.1 is the RefSeq assembly accession.
- HMMs_dir is the folder containing raw HMM files. The subfolders inside the hmms directory show what these look like.
- You can also provide a metadata file via the -m argument, with gene and pathway names for each provided HMM (formatted like the hmm_summary.csv file in this repo).
YfGenie.py --gff -y genes.tsv -a GCF_023585845.1 -o GCF_023585845.1 -t 16
- genes.tsv is a single-column file listing the gene names of interest (an example file with the same name can be found in this repo).
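As a minimal sketch of the single-column layout described above, the snippet below writes a small genes.tsv by hand. The three gene names are placeholders chosen only for illustration; the example file in this repo remains the reference for how names should be written.

```shell
# Write one gene name per line; recA, gyrB and rpoB are illustrative placeholders.
printf '%s\n' recA gyrB rpoB > genes.tsv

# Quick check: every line should hold exactly one tab-free, space-free name.
awk 'NF != 1 { bad++ } END { exit bad > 0 }' genes.tsv && echo "genes.tsv looks single-column"
```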
YfGenie.py --gc -a GCF_023585845.1 -o GCF_023585845.1 -t 16
- This will generate a single-line TSV file that lists usage frequencies for each amino acid residue.
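To make "usage frequency" concrete, here is a rough sketch that counts residues in a protein FASTA with awk. It is not MagicLamp's implementation, the file names example.faa and aa_freq.tsv are hypothetical, and it prints one residue per line rather than the single-line layout described above.

```shell
# Hypothetical two-protein FASTA, written here only for illustration.
printf '>p1\nMKKA\n>p2\nAAKM\n' > example.faa

# Skip header lines, count every residue, then report each residue's
# share of all residues seen (count / total), sorted by residue letter.
awk '!/^>/ {
       seq = toupper($0)
       for (i = 1; i <= length(seq); i++) {
           count[substr(seq, i, 1)]++
           total++
       }
     }
     END {
       for (aa in count) printf "%s\t%.3f\n", aa, count[aa] / total
     }' example.faa | sort > aa_freq.tsv

cat aa_freq.tsv
```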
YfGenie.py --hmm --gff --gc -d HMMs_dir -y genes.tsv -a GCF_023585845.1 -o GCF_023585845.1 -t 16
YfGenie.py --hmm --gff --gc -d HMMs_dir -y genes.tsv -c genome.fa -g genome.gff -p genome.faa -o genome_out -t 16
while read -r i; do
    YfGenie.py -a "$i" -o "$i" -t 16 --hmm -d HMMs_dir --gc
done < genomes.txt
- The above command is a "while loop" that runs YfGenie.py once for each line of genomes.txt, a single-column text file listing RefSeq or GenBank assembly accessions (see the example file in this repo).
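Before launching a long batch, it can help to check the accession list and preview the commands. The sketch below assumes accessions follow the usual NCBI pattern (GCF_ or GCA_, nine digits, a dot, and a version number) and only prints the commands to a hypothetical commands.txt instead of running them. The sample list it writes is illustrative and would overwrite an existing genomes.txt.

```shell
# Illustrative list: two well-formed accessions, one malformed entry, one blank line.
printf '%s\n' GCF_023585845.1 GCA_000005845.2 not_an_accession '' > genomes.txt

# Keep only lines shaped like an assembly accession, then print (not run)
# the command that the loop would execute for each one.
grep -E '^GC[AF]_[0-9]{9}\.[0-9]+$' genomes.txt | while read -r acc; do
    echo "YfGenie.py -a $acc -o $acc -t 16 --hmm -d HMMs_dir --gc"
done > commands.txt

cat commands.txt
```

Once the preview looks right, the filtered list can be fed to the real loop in place of the echo.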