# SPRQ benchmark comparisons

SPRQ Application brief

This README provides the information required to reproduce the results. Please contact support@pacb.com with any questions.

## PacBio HiFi

- HG002 was sequenced on the Revio system with SPRQ chemistry, yielding 146 Gbp. The reads were aligned with pbmm2 v1.13.1 and downsampled to aligned depths ranging from 8-fold to 40-fold coverage for variant calling and benchmarking (see the sketch after this list).
- Small variants were called with DeepVariant 1.6.1.
- Structural variants were called with Sawfish 0.12.4.
- Link to root directory
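
A minimal sketch of the alignment, downsampling, and small-variant calling steps described above, assuming placeholder file names, thread counts, and a hypothetical subsampling fraction (the exact commands and parameters used for these results are not listed here):

```bash
# Minimal sketch, not the exact production commands; the reference, input BAM,
# output names, thread count, and the 0.5 subsampling fraction are placeholders.

# Align SPRQ HiFi reads to GRCh38 with pbmm2 and sort the output.
pbmm2 align --preset HIFI --sort -j 16 \
    GRCh38.fasta hg002.sprq.hifi_reads.bam hg002.GRCh38.aligned.bam

# Downsample the aligned BAM to a target depth with samtools
# (e.g. keep ~50% of reads to reach roughly half the starting coverage).
samtools view -b -s 0.5 -o hg002.GRCh38.20x.bam hg002.GRCh38.aligned.bam
samtools index hg002.GRCh38.20x.bam

# Call small variants with DeepVariant 1.6.1 (PacBio model) via Docker.
docker run -v "$(pwd)":/data google/deepvariant:1.6.1 \
    /opt/deepvariant/bin/run_deepvariant \
    --model_type=PACBIO \
    --ref=/data/GRCh38.fasta \
    --reads=/data/hg002.GRCh38.20x.bam \
    --output_vcf=/data/hg002.20x.deepvariant.vcf.gz \
    --num_shards=16
```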

### PacBio HiFi small variant benchmarking results (DeepVariant 1.6.1)

| Depth | Type | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 20-fold | SNP | 3365127 | 3356805 | 8322 | 4143790 | 2255 | 779837 | 1225 | 0.997527 | 0.99933 | 0.188194 | 0.998428 |
| 30-fold | SNP | 3365127 | 3362495 | 2632 | 4177463 | 1152 | 808868 | 492 | 0.999218 | 0.999658 | 0.193627 | 0.999438 |
| 20-fold | INDEL | 525469 | 513484 | 11985 | 956105 | 9346 | 414296 | 5286 | 0.977192 | 0.98275 | 0.433316 | 0.979963 |
| 30-fold | INDEL | 525469 | 520114 | 5355 | 971125 | 4625 | 426619 | 2506 | 0.989809 | 0.991506 | 0.439304 | 0.990657 |
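
The columns above follow the hap.py summary format. A minimal sketch of such a comparison against the GIAB 4.2.1 truth set, with placeholder truth, confident-region, reference, and query file names:

```bash
# Minimal sketch; all file names are placeholders, not the exact inputs
# used to produce the table above.
hap.py \
    HG002_GRCh38_v4.2.1_benchmark.vcf.gz \
    hg002.20x.deepvariant.vcf.gz \
    -f HG002_GRCh38_v4.2.1_benchmark.bed \
    -r GRCh38.fasta \
    -o hg002_20x_happy \
    --engine vcfeval
```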

### PacBio HiFi structural variant benchmarking results for titration (Sawfish 0.12.4)

| Depth (fold) | Recall | Precision | F1-score |
| --- | --- | --- | --- |
| 9.72 | 0.8804 | 0.9900 | 0.9320 |
| 11.67 | 0.9072 | 0.9895 | 0.9466 |
| 13.61 | 0.9229 | 0.9894 | 0.9550 |
| 15.56 | 0.9356 | 0.9894 | 0.9618 |
| 17.50 | 0.9411 | 0.9891 | 0.9645 |
| 19.45 | 0.9463 | 0.9891 | 0.9672 |
| 24.31 | 0.9525 | 0.9882 | 0.9701 |
| 29.17 | 0.9560 | 0.9885 | 0.9720 |
| 34.04 | 0.9586 | 0.9883 | 0.9733 |
| 38.90 | 0.9608 | 0.9882 | 0.9743 |

### HiFi structural variant benchmarking results (20-fold, Sawfish 0.12.4)

| TP_base | TP_comp | FP | FN | Recall | Precision | F1-score |
| --- | --- | --- | --- | --- | --- | --- |
| 22557 | 21062 | 232 | 1281 | 0.9463 | 0.9891 | 0.9672 |
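
The structural-variant tables report truvari-style counts (TP_base, TP_comp, FP, FN). A minimal sketch of benchmarking a Sawfish call set against the GIAB T2TQ100 V1.0 truthset, with placeholder file names and default truvari parameters:

```bash
# Minimal sketch; truth VCF, confident-region BED, and query VCF names are
# placeholders, not the exact inputs or truvari parameters used here.
truvari bench \
    -b GIAB_T2TQ100_V1.0_SV_truthset.vcf.gz \
    -c hg002.20x.sawfish.vcf.gz \
    --includebed GIAB_T2TQ100_V1.0_confident_regions.bed \
    --passonly \
    -o truvari_hg002_20x_sawfish
```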

## Illumina

### Illumina small variant benchmarking results (DRAGEN)

| DRAGEN version | Depth | Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3.7.5 | 30-fold | SNP | PASS | 3365127 | 3353531 | 11596 | 4042134 | 14752 | 672953 | 3869 | 0.996554 | 0.995621 | 0.166485 | 0.996088 |
| 4.2.1 | 35-fold | SNP | PASS | 3365127 | 3357852 | 7275 | 3849974 | 1860 | 489362 | 985 | 0.997838 | 0.999447 | 0.127108 | 0.998642 |
| 3.7.5 | 30-fold | INDEL | PASS | 525469 | 521874 | 3595 | 995996 | 3500 | 448346 | 1869 | 0.993158 | 0.993609 | 0.450148 | 0.993384 |
| 4.2.1 | 35-fold | INDEL | PASS | 525469 | 524141 | 1328 | 980875 | 721 | 433435 | 474 | 0.997473 | 0.998683 | 0.441886 | 0.998077 |

### Illumina structural variant benchmarking results (35-fold, DRAGEN 4.2.4)

| TP_base | TP_comp | FP | FN | Recall | Precision | F1-score |
| --- | --- | --- | --- | --- | --- | --- |
| 9243 | 8454 | 247 | 13549 | 0.4055 | 0.9716 | 0.5722 |

## ONT

- HG002 variant call sets from 60-fold aligned sup-basecall reads were downloaded from the s3 bucket associated with this EPI2ME post (see the download sketch after this list).
- Small variants were called by Clair3 1.0.0: hg002.wf_snp.vcf.gz (s3://ont-open-data/giab_2023.05/analysis/variant_calling/hg002_sup_60x/hg002.wf_snp.vcf.gz, md5sum fa2111cdeb4959e1ed1cfe402d128c39)
- Structural variants were called by Sniffles2 2.0.7: hg002.wf_sv.vcf.gz (s3://ont-open-data/giab_2023.05/analysis/variant_calling/hg002_sup_60x/hg002.wf_sv.vcf.gz, md5sum cd185fb011345e702f7eb2ba7a19213b)
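
A minimal sketch for retrieving and checksumming the two call sets, assuming anonymous public access to the ont-open-data bucket:

```bash
# Minimal sketch; the paths are the ones listed above, and --no-sign-request
# assumes the bucket allows anonymous access.
aws s3 cp --no-sign-request \
    s3://ont-open-data/giab_2023.05/analysis/variant_calling/hg002_sup_60x/hg002.wf_snp.vcf.gz .
aws s3 cp --no-sign-request \
    s3://ont-open-data/giab_2023.05/analysis/variant_calling/hg002_sup_60x/hg002.wf_sv.vcf.gz .

# Compare against the md5sums listed above.
md5sum hg002.wf_snp.vcf.gz hg002.wf_sv.vcf.gz
```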

### ONT small variant benchmarking results (sup-basecall, 60-fold, Clair3 1.0.0)

| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SNP | PASS | 3365127 | 3357580 | 7547 | 4418637 | 4925 | 1054410 | 1166 | 0.997757 | 0.998536 | 0.238628 | 0.998147 |
| INDEL | PASS | 525469 | 453173 | 72296 | 776968 | 24911 | 283861 | 8400 | 0.862416 | 0.949482 | 0.365345 | 0.903857 |

### ONT structural variant benchmarking results (sup-basecall, 60-fold, Sniffles2 2.0.7)

| TP_base | TP_comp | FP | FN | Recall | Precision | F1-score |
| --- | --- | --- | --- | --- | --- | --- |
| 21087 | 18649 | 243 | 1743 | 0.9237 | 0.9871 | 0.9543 |

- GIAB 4.2.1 small variant truthset
- GIAB T2TQ100 V1.0 structural variant truthset