Data for publications related to genomic epidemiology for the Victorian SARS-CoV-2 response, from The Peter Doherty Institute for Infection and Immunity and the Victorian Department of Health and Human Services in Australia.
Raw genome sequence reads have been deposited at NCBI SRA under BioProject PRJNA613958. Every isolate was sequenced with Illumina paired-end 150 bp reads. Some isolates have additional Nanopore data.
The GISAID ID for each set of reads has been added to the BioSample description.
Early versions of the consensus genomes were deposited in GISAID, but some were rejected due to "frame shifts". NCBI also rejected them for the same reason. Investigation is ongoing to confirm whether these are genuine indels or sequencing artifacts.
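To illustrate what a "frame shift" flag usually means, here is a minimal sketch that scans a pairwise alignment of a reference and a consensus genome and reports internal gap runs whose length is not a multiple of 3. It is not the check that GISAID or NCBI actually run. The function name `frameshift_indels` and the demo sequences are hypothetical, and the sketch ignores whether an indel falls inside a coding region, so its hits are only candidates.

```python
import re


def frameshift_indels(ref_aln, qry_aln):
    """Return (position, length, kind) for candidate frame-shifting indels.

    Both arguments are rows of the same alignment, with '-' for gaps.
    Columns that are gaps in both rows are dropped first, so positions are
    0-based offsets in the reduced pairwise alignment. Terminal gaps are
    ignored because they usually reflect missing coverage at the ends.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    # Drop columns where both rows are gaps (common in multi-sequence alignments).
    cols = [(r, q) for r, q in zip(ref_aln, qry_aln) if (r, q) != ("-", "-")]
    ref = "".join(r for r, _ in cols)
    qry = "".join(q for _, q in cols)

    hits = []
    # A gap in the query is a deletion; a gap in the reference is an insertion.
    for kind, seq in (("deletion", qry), ("insertion", ref)):
        first = len(seq) - len(seq.lstrip("-"))  # end of leading gap run
        last = len(seq.rstrip("-"))              # start of trailing gap run
        for m in re.finditer(r"-+", seq[first:last]):
            length = m.end() - m.start()
            if length % 3 != 0:
                hits.append((first + m.start(), length, kind))
    return sorted(hits)


# Made-up toy sequences, not real SARS-CoV-2 data.
print(frameshift_indels("ATGAAACCCGGG", "ATGA-ACCCGGG"))  # 1 bp deletion: flagged
print(frameshift_indels("ATGAAACCCGGG", "ATG---CCCGGG"))  # 3 bp deletion: in frame
```

A flagged indel still needs checking against the raw reads, since homopolymer errors and low-coverage regions can produce the same pattern.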
Download the multi-FASTA file VIC.ffn of all genomes.
Download the aligned FASTA file VIC.afa.
Download the Newick tree file VIC.nwk.
Download the CSV metadata file VIC.csv.
Columns include:
VIC_ID - the identifier used in FASTA, AFA and NWK
NCBI_ID - this BioSample ID will link to FASTQ and FASTA
GISAID_ID - consensus genome accession
Date_coll - date sample was collected
Patient_age - patient age in years
Patient_sex - patient gender
Seq_protocol - ARTIC V1 or V3
Sequencing_technology - instrument sequencing was done on
PCR_Ct_value - sample Ct, higher is worse
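As a minimal sketch of working with the metadata, the following reads a VIC.csv-style file with Python's standard `csv` module and lists the samples at or below a chosen Ct value. The helper names `load_metadata` and `low_ct`, the cutoff of 30, and the demo rows are all hypothetical. The sketch assumes only that the header row uses the column names listed above.

```python
import csv
import io


def load_metadata(handle):
    """Read VIC.csv-style rows into a dict keyed by VIC_ID."""
    return {row["VIC_ID"]: row for row in csv.DictReader(handle)}


def low_ct(records, max_ct=30.0):
    """Return VIC_IDs whose PCR_Ct_value is a number <= max_ct.

    Rows with a blank or non-numeric Ct are skipped, not treated as zero.
    The default cutoff is an arbitrary illustration, not a recommendation.
    """
    keep = []
    for vic_id, row in records.items():
        try:
            ct = float(row.get("PCR_Ct_value") or "")
        except ValueError:
            continue
        if ct <= max_ct:
            keep.append(vic_id)
    return keep


# Made-up demo rows; the real file would be opened with
# open("VIC.csv", newline="") and passed to load_metadata().
demo = io.StringIO(
    "VIC_ID,Seq_protocol,PCR_Ct_value\n"
    "VIC_demo1,ARTIC V3,22.5\n"
    "VIC_demo2,ARTIC V1,\n"
    "VIC_demo3,ARTIC V3,33.1\n"
)
records = load_metadata(demo)
print(low_ct(records))  # ['VIC_demo1']
```

Because the dict is keyed by VIC_ID, the same lookup can be used to attach metadata to sequence names in the FASTA, AFA and NWK files.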
If you use any of the data in this repository or from PRJNA613958, please cite:
T Seemann, CR Lane, NL Sherry, S Duchene, AG da Silva, L Caly, ML Sait, SA Ballard, K Horan, MB Schultz, T Hoang, M Easton, S Dougall, TP Stinear, J Druce, M Catton, B Sutton, A van Diemen, C Alpren, DA Williamson, BP Howden. Tracking the COVID-19 pandemic in Australia using genomics, medRχiv, 16 May 2020. doi:10.1101/2020.05.12.20099929