
For rGFA-formatted pangenome graphs

When only having assemblies but not a pre-existing pangenome graph

  1. Download and prepare genome assemblies
sh download.testData.sh

sh prepare.asm.sh
  1. Construct an rGFA pangenome graph with minigraph and build the index files for VRPG
python3 ../script/vrpg_preprocess.py --minigraph ../bin/minigraph --asmList build.asm.txt --index --xDep 100 --outDir results
  1. Extract and index node sequences
../module/nodeSeq --gfaFile results/upload/input.ref.gfa --upDir results/upload
  1. Add the gene annotation track for the primary linear reference genome
../module/GraphAnno addRef --inGFF GCF_000146045.2_R64_genomic.gff.gz --upDir results/upload
  1. Overlay gene annotation for all nodes
sh prepare.gff.sh

../module/GraphAnno nodeGene --gffList build.gff.txt --upDir results/upload
  1. Add additional annotation tracks from BED files
../module/GraphAnno addBed --inBed test.track.bed --upDir results/upload
  1. Move the prepared data to VRPG's upload directory for rendering
# This graph will be regarded as the default graph to display
mv results/upload/* ../upload
  1. Start the Django development server
python3 manage.py runserver
  1. Access the visualized pangenome graph in VRPG

Visit the following address in the web browser: http://127.0.0.1:8000/app/vrpg/
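
Steps 2 to 6 above depend on several files and binaries that earlier steps are expected to provide. The sketch below is one way to check for them before starting a long run. The helper name `missing_inputs` is hypothetical and not part of VRPG, and the path list is only what the commands above mention, so adjust it if your layout differs.

```shell
#!/bin/sh
# Print each given path that does not exist, one per line.
# missing_inputs is a hypothetical helper, not part of VRPG.
missing_inputs() {
    for path in "$@"; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}

# Paths named in steps 2 to 6 above.
# build.gff.txt is left out because prepare.gff.sh creates it in step 5.
missing=$(missing_inputs \
    ../bin/minigraph \
    ../script/vrpg_preprocess.py \
    ../module/nodeSeq \
    ../module/GraphAnno \
    build.asm.txt \
    GCF_000146045.2_R64_genomic.gff.gz \
    test.track.bed)

if [ -n "$missing" ]; then
    printf 'Missing inputs:\n%s\n' "$missing" >&2
else
    echo "All inputs found."
fi
```

Run it from the same directory as the commands above, since the paths are relative.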

When already having an rGFA pangenome graph built by minigraph

For this tutorial, we will just use the rGFA graph generated above for demonstration.

  1. Align assemblies to the rGFA graph
# Let's assume that data migration has not been done and input.ref.gfa is still in results/upload.
mkdir mapping
minigraph -cxasm -t 10 --vc results/upload/input.ref.gfa DBVPG6044.genome.fa.gz -o mapping/DBVPG6044.unstable.gaf
minigraph -cxasm -t 10 --vc results/upload/input.ref.gfa DBVPG6765.genome.fa.gz -o mapping/DBVPG6765.unstable.gaf
minigraph -cxasm -t 10 --vc results/upload/input.ref.gfa SK1.genome.fa.gz -o mapping/SK1.unstable.gaf
minigraph -cxasm -t 10 --vc results/upload/input.ref.gfa Y12.genome.fa.gz -o mapping/Y12.unstable.gaf

  1. Create GAF file list
echo "DBVPG6765#HP0 mapping/DBVPG6765.unstable.gaf NA" >> mapping/gaf.list
echo "Y12#HP0 mapping/Y12.unstable.gaf NA" >> mapping/gaf.list
echo "SK1#HP0 mapping/SK1.unstable.gaf NA" >> mapping/gaf.list
echo "DBVPG6044#HP0 mapping/DBVPG6044.unstable.gaf NA" >> mapping/gaf.list
  1. Build index files for VRPG
python3 ../script/vrpg_preprocess.py --gafList mapping/gaf.list --rGFA results/upload/input.ref.gfa --outDir rGFA_upload --thread 5 --index --xDep 100
  1. Extract and index node sequences
../module/nodeSeq --gfaFile results/upload/input.ref.gfa --upDir rGFA_upload/upload
  1. Add the gene annotation track for the primary linear reference genome
../module/GraphAnno addRef --inGFF GCF_000146045.2_R64_genomic.gff.gz --upDir rGFA_upload/upload
  1. Overlay gene annotation for all nodes
# If build.gff.txt has not been created, type
sh prepare.gff.sh

../module/GraphAnno nodeGene --gffList build.gff.txt --upDir rGFA_upload/upload
  1. Add additional annotation tracks from BED files
../module/GraphAnno addBed --inBed test.track.bed --upDir rGFA_upload/upload
  1. Move the prepared data to VRPG's upload directory for rendering
# If a default graph has already been set (see step 7 of "When only having assemblies but not a pre-existing pangenome graph"), this graph will be displayed as an additional graph.
mv rGFA_upload/upload ../upload/rGFA_graph
  1. Start the Django development server
python3 manage.py runserver
  1. Access the visualized pangenome graph in VRPG

Visit the following address in the web browser: http://127.0.0.1:8000/app/vrpg/
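
The GAF file list in step 2 is built with `>>`, which appends, so running those `echo` commands a second time would leave duplicate entries in `mapping/gaf.list`. A minimal sketch of the same step as a loop that starts from an empty file is shown below. The sample names and the three fields per line are copied from the commands above; the variable names are illustrative.

```shell
#!/bin/sh
# Sample names taken from the alignment commands above.
samples="DBVPG6765 Y12 SK1 DBVPG6044"
list=mapping/gaf.list

mkdir -p mapping
: > "$list"   # truncate first so that reruns do not append duplicate entries

for sample in $samples; do
    # Same three fields as the echo commands above: name#HP0, GAF path, NA
    printf '%s#HP0 mapping/%s.unstable.gaf NA\n' "$sample" "$sample" >> "$list"
done
```

The same loop could also wrap the `minigraph` call from step 1, writing to `mapping/$sample.unstable.gaf` for each sample.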

For GFA-formatted pangenome graphs

For GFA-formatted graphs generated by Minigraph-Cactus

  1. Construct a pangenome graph with Minigraph-Cactus

Note: The parameters and options used below are for this test example only and may not work best in all cases. Please refer to this tutorial for more detailed instructions on building pangenome graphs with Minigraph-Cactus.

# Assume that Minigraph-Cactus has been installed
sh mc.genome.sh
cactus-pangenome ./js mc.genome.txt --outDir mc --outName mc --reference SGDref --gfa

  1. Format conversion and index building
../module/gfa2view --GFA mc/mc.gfa.gz --ref SGDref#0 --outDir mc_upload --index --range 2000 --thread 5 --xDep 100

  1. Extract node sequences and add annotation tracks
sh mc.gff.sh

../module/nodeSeq --gfaFile mc/mc.gfa.gz --upDir mc_upload/upload
../module/GraphAnno addRef --inGFF GCF_000146045.2_R64_genomic.gff.gz --upDir mc_upload/upload
../module/GraphAnno nodeGene --gffList mc.gff.txt --upDir mc_upload/upload
../module/GraphAnno addBed --inBed test.track.bed --upDir mc_upload/upload

  1. Move the prepared data to the upload directory of VRPG for rendering
mv mc_upload/upload ../upload/mc_graph
  1. Start the Django development server
python3 manage.py runserver
  1. Access the visualized pangenome graph in VRPG

Visit the following address in the web browser: http://127.0.0.1:8000/app/vrpg/
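
One thing to watch in the "move the prepared data" step: if the target directory (for example `../upload/mc_graph`) already exists, `mv` places the source directory inside it rather than replacing it, so a rerun would produce `../upload/mc_graph/upload`. The sketch below guards against that. `install_graph` is a hypothetical helper, not part of VRPG.

```shell
#!/bin/sh
# Move a prepared upload directory into place, refusing to touch an existing target.
# install_graph is a hypothetical helper, not part of VRPG.
install_graph() {
    src=$1
    dest=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "target already exists: $dest" >&2
        return 1
    fi
    mv "$src" "$dest"
}

# Usage, mirroring the move step above:
# install_graph mc_upload/upload ../upload/mc_graph
```

Remove or rename the old target directory first if you do want to replace a previously installed graph.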

For GFA-formatted graphs generated by PGGB

  1. Construct a pangenome graph with PGGB

Note: The parameters and options used below are for this test example only and may not work best in all cases. Please refer to this tutorial for more detailed instructions on building pangenome graphs with PGGB.

# Assume that fastix and pggb have been installed
sh pggb.genome.sh
pggb -i pggb_genome/all.fastix.fa -t 10 -p 95 -n 5 -k 23 -o pggb

  1. Format conversion and index building
# Please replace 'all.fastix.fa.*.final.gfa' with the actual file name
../module/gfa2view --GFA pggb/all.fastix.fa.*.final.gfa --ref SGDref#1 --outDir pggb_upload --index --range 2000 --thread 5 --xDep 100

  1. Extract node sequences and add annotation tracks
sh pggb.gff.sh

# Please replace 'all.fastix.fa.*.final.gfa' with the actual file name
../module/nodeSeq --gfaFile pggb/all.fastix.fa.*.final.gfa --upDir pggb_upload/upload
../module/GraphAnno addRef --inGFF GCF_000146045.2_R64_genomic.gff.gz --upDir pggb_upload/upload
../module/GraphAnno nodeGene --gffList pggb.gff.txt --upDir pggb_upload/upload
../module/GraphAnno addBed --inBed test.track.bed --upDir pggb_upload/upload
  1. Move the prepared data to the upload directory of VRPG for rendering
mv pggb_upload/upload ../upload/pggb_graph
  1. Start the Django development server
python3 manage.py runserver
  1. Access the visualized pangenome graph in VRPG

Visit the following address in the web browser: http://127.0.0.1:8000/app/vrpg/
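
The `--ref` value passed to `gfa2view` differs between the two GFA workflows (`SGDref#0` for Minigraph-Cactus, `SGDref#1` for PGGB), which appears to follow how each graph labels its haplotypes. To see which labels a given GFA actually contains, a sketch like the one below may help. It assumes that W lines carry the sample and haplotype in fields 2 and 3, as in the GFA specification, and that P-line path names are PanSN-style (`sample#hap#contig`); `list_gfa_haplotypes` is a hypothetical helper, not part of VRPG.

```shell
#!/bin/sh
# Read GFA text on stdin and print the distinct sample#haplotype labels.
# W lines: sample in field 2, haplotype index in field 3.
# P lines: path name in field 2, assumed to be PanSN-style (sample#hap#contig);
#          names without '#' are printed unchanged.
# list_gfa_haplotypes is a hypothetical helper, not part of VRPG.
list_gfa_haplotypes() {
    awk -F '\t' '
        $1 == "W" { print $2 "#" $3 }
        $1 == "P" {
            n = split($2, part, "#")
            if (n >= 2) print part[1] "#" part[2]
            else print $2
        }
    ' | sort -u
}

# Usage (file names as in the steps above):
# gzip -dc mc/mc.gfa.gz | list_gfa_haplotypes
# cat pggb/all.fastix.fa.*.final.gfa | list_gfa_haplotypes
```

The label of the reference genome in that list is a reasonable candidate for `--ref`.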