search_seq_of_interest.md

Steps to search metagenomic assemblies for a sequence of interest

  1. Make a reference BLAST database from the sequence of interest
makeblastdb -in ${ref} -dbtype nucl -out ${ref}.db
  2. BLAST the metagenomic assembly against the reference database
blastn -query Assembly/MEGAHIT/${strain}.contigs.fa -max_target_seqs 1 -evalue 1e-10 -outfmt 6 -db ${ref}.db > blastn_${strain}
  3. Parse the blastn output, keeping the contig ID and the query start and end coordinates (columns 1, 7, and 8)
cut -f 1,7,8 blastn_${strain} > coord.${strain}
  4. Extract the matching contig ranges from the FASTA using SCGid
module load miniconda3

source SCGid/scgidenv/bin/activate 

python3 get_these_contig_ranges.py --fasta Assembly/MEGAHIT/${strain}.contigs.fa --coordinates coord.${strain} > ${strain}_${ref}.contigs
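
To make the parsing step concrete, here is a minimal sketch of what `cut -f 1,7,8` keeps from a default `-outfmt 6` table. The file name `blastn_demo`, the contig name `k141_52`, the subject name `geneX`, and all values in the row are made up for illustration; they are not output from a real run.

```shell
# Hypothetical single hit in the default -outfmt 6 layout (tab-separated):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
printf 'k141_52\tgeneX\t98.50\t400\t6\t0\t1201\t1600\t1\t400\t1e-150\t700\n' > blastn_demo

# Keep the contig ID (column 1) and the match start and end on that contig (columns 7 and 8)
cut -f 1,7,8 blastn_demo > coord.demo

# Show the resulting coordinates file
cat coord.demo
```

Because the contigs are the query, columns 7 and 8 (`qstart`, `qend`) are positions on the contig, so the example row reduces to `k141_52`, `1201`, `1600` separated by tabs.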