Python pipeline for mapping alternative genome markers onto a phylogenetic tree.
- Requirements.
- conda users: install dependencies with
conda env create --file /path/to/algemapy.yaml
or, if you do not have write access to the default environment location (e.g. conda is installed system-wide), create the environment at a path of your own:
conda env create --file /path/to/algemapy.yaml -p /your/path/to/env/
This environment includes: mafft, iqtree
- non-conda users: install dependencies listed in algemapy.yaml by any other means.
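Whichever way the dependencies were installed, it can help to confirm that the external programs are reachable before running the pipeline. The following is a minimal sketch, not part of algemapy itself: the helper name `find_missing_tools` is hypothetical, and the executable names are assumed to match the package names listed above (some IQ-TREE builds install the binary as `iqtree2`, for example).

```python
import shutil

# Assumed executable names; adjust them if your installation differs
# (e.g. some IQ-TREE releases install the binary as "iqtree2").
REQUIRED_TOOLS = ("mafft", "iqtree")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the system PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

Run it inside the activated environment; an empty result means both programs were found.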
- External scripts/programs.
- How to install.
- Use a Python package manager to download and install dependencies.
- Add the scripts to the system PATH.
- Reference tree:
* Format: phylip.
* Download source: NCBI Taxonomy:
- Search for the taxonomic group of interest using the [Organism] field.
- Display the results as a common tree. This will redirect you to the NCBI Taxonomy Browser.
- Check the group of interest and click Choose.
- Save the file as phylip.
* Sanitize the reference tree using:
agmdbf.py downloaded_tree.phy --output formatted_tree.phy --sanitize-ref-tree
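What `--sanitize-ref-tree` does internally is not described here. As an illustration only, the sketch below shows one kind of clean-up that a tree exported from a taxonomy browser often needs: joining a multi-line tree into a single string and turning quoted labels that contain spaces into underscore-separated names. The function name `sanitize_newick` and the exact rules are assumptions, not the behaviour of `agmdbf.py`.

```python
import re


def sanitize_newick(text: str) -> str:
    """Hypothetical clean-up of a multi-line tree with quoted labels."""
    # Join the multi-line export into one tree string.
    newick = "".join(line.strip() for line in text.splitlines())

    def clean_label(match: re.Match) -> str:
        label = match.group(1)
        # Replace every run of characters other than letters, digits,
        # "." and "-" with a single underscore, then trim stray underscores.
        return re.sub(r"[^A-Za-z0-9.\-]+", "_", label).strip("_")

    # Rewrite quoted labels such as 'Homo sapiens' as Homo_sapiens.
    return re.sub(r"'([^']*)'", clean_label, newick)


if __name__ == "__main__":
    raw = "(\n'Homo sapiens':4,\n'Pan troglodytes':4\n)primates:4;"
    print(sanitize_newick(raw))
```

Unquoted labels are left untouched in this sketch; use the `agmdbf.py` command above for the real reference tree.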
- Reference genes:
* Format: tsv.
* Download source: NCBI Gene:
- Search for the gene of interest within the taxonomic group of interest using the [Gene] and [Organism] fields.
- Send the results to a file in Tabular (text) format.
- Download the genes by their IDs and coordinates using
agmdbf.py tabular_summary.txt --output reference_genes_sequences.fasta --download-from-tab-summary
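To make "by IDs and coordinates" concrete, the sketch below reads a tabular summary and builds one NCBI E-utilities `efetch` URL per gene, without sending any request. It is an illustration under stated assumptions, not the implementation of `agmdbf.py`: the helper name `efetch_urls` is hypothetical, the column names are assumed to match the NCBI Gene tabular export, the `orientation` column is assumed to hold `plus` or `minus`, and the coordinates are passed through unchanged, so check whether your export is 0-based or 1-based before relying on them.

```python
import csv
import io
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Column names assumed from the NCBI Gene tabular (text) export.
ACC = "genomic_nucleotide_accession.version"
START = "start_position_on_the_genomic_accession"
END = "end_position_on_the_genomic_accession"


def efetch_urls(tab_text):
    """Yield (GeneID, URL) pairs for rows that carry genomic coordinates."""
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    for row in reader:
        # Skip genes without a genomic accession or coordinates.
        if not (row.get(ACC) and row.get(START) and row.get(END)):
            continue
        params = {
            "db": "nuccore",
            "id": row[ACC],
            "seq_start": row[START],  # passed through as-is (see note above)
            "seq_stop": row[END],
            # efetch uses strand=1 for plus and strand=2 for minus.
            "strand": 2 if row.get("orientation") == "minus" else 1,
            "rettype": "fasta",
            "retmode": "text",
        }
        yield row.get("GeneID", ""), EFETCH + "?" + urlencode(params)


if __name__ == "__main__":
    with open("tabular_summary.txt", encoding="utf-8") as handle:
        for gene_id, url in efetch_urls(handle.read()):
            print(gene_id, url)
```

Fetching each URL would return one FASTA record per gene; NCBI asks clients to limit their request rate, so a real downloader should pause between requests.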