SPRQ benchmark comparisons

SPRQ Application brief

This readme provides the information required to reproduce the results. Please contact support@pacb.com with any questions.

PacBio HiFi

HG002 was sequenced on the Revio system with SPRQ chemistry, yielding 146 Gbp. The reads were aligned with pbmm2 v1.13.1 and downsampled to aligned depths ranging from 8-fold to 40-fold for variant calling and benchmarking.
Small variants were called with DeepVariant 1.6.1.
- 20-fold
- 30-fold
Structural variants were called with Sawfish 0.12.4.
- 20-fold
- 30-fold
Link to root directory

PacBio HiFi small variant benchmarking results (DeepVariant 1.6.1)

Depth	Type	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score
20-fold	SNP	3365127	3356805	8322	4143790	2255	779837	1225	0.997527	0.99933	0.188194	0.998428
30-fold	SNP	3365127	3362495	2632	4177463	1152	808868	492	0.999218	0.999658	0.193627	0.999438
20-fold	INDEL	525469	513484	11985	956105	9346	414296	5286	0.977192	0.98275	0.433316	0.979963
30-fold	INDEL	525469	520114	5355	971125	4625	426619	2506	0.989809	0.991506	0.439304	0.990657
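
The metric columns in the table above can be recomputed from the count columns as a sanity check. The following is a minimal sketch, assuming the usual hap.py definitions (recall from the TRUTH columns, precision and Frac_NA from the QUERY columns). The function name `happy_metrics` is hypothetical, and QUERY.TP is not listed in the table, so it is derived here as QUERY.TOTAL - QUERY.UNK - QUERY.FP.

```python
def happy_metrics(truth_tp, truth_fn, query_total, query_fp, query_unk):
    """Recompute hap.py-style summary metrics from raw counts."""
    recall = truth_tp / (truth_tp + truth_fn)
    # Assumption: QUERY.TP is not in the table, so derive it from the
    # query calls that fall inside the benchmark regions and are not FP.
    query_tp = query_total - query_unk - query_fp
    precision = query_tp / (query_tp + query_fp)
    # Fraction of query calls that could not be assessed (outside truth regions).
    frac_na = query_unk / query_total
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "frac_na": frac_na, "f1": f1}


# 20-fold SNP row from the table above.
m = happy_metrics(
    truth_tp=3356805,
    truth_fn=8322,
    query_total=4143790,
    query_fp=2255,
    query_unk=779837,
)
print({name: round(value, 6) for name, value in m.items()})
```

The recomputed values should agree with the 20-fold SNP row to within rounding. The same function applies to the other small variant rows in this readme.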

PacBio HiFi structural variant benchmarking results for titration (Sawfish 0.12.4)

Depth (fold)	Recall	Precision	F1-score
9.72	0.8804	0.9900	0.9320
11.67	0.9072	0.9895	0.9466
13.61	0.9229	0.9894	0.9550
15.56	0.9356	0.9894	0.9618
17.50	0.9411	0.9891	0.9645
19.45	0.9463	0.9891	0.9672
24.31	0.9525	0.9882	0.9701
29.17	0.9560	0.9885	0.9720
34.04	0.9586	0.9883	0.9733
38.90	0.9608	0.9882	0.9743

HiFi structural variant benchmarking results (20-fold, Sawfish 0.12.4)

TP_base	TP_comp	FP	FN	Recall	Precision	F1-score
22557	21062	232	1281	0.9463	0.9891	0.9672
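
The recall, precision, and F1-score in the row above follow from the four count columns. A minimal sketch, assuming the standard Truvari definitions (recall from TP_base and FN, precision from TP_comp and FP); the function name `truvari_metrics` is hypothetical.

```python
def truvari_metrics(tp_base, tp_comp, fp, fn):
    """Recompute Truvari-style summary metrics from raw counts."""
    # Recall is measured against the truth (base) calls.
    recall = tp_base / (tp_base + fn)
    # Precision is measured against the query (comparison) calls.
    precision = tp_comp / (tp_comp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1


# 20-fold Sawfish row from the table above.
recall, precision, f1 = truvari_metrics(tp_base=22557, tp_comp=21062, fp=232, fn=1281)
print(f"recall={recall:.4f} precision={precision:.4f} f1={f1:.4f}")
```

The same function can be applied to the Illumina and ONT structural variant rows in this readme.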

Illumina

HG002 DRAGEN variant call sets were obtained from DOI 10.5281/zenodo.8350255 (DRAGEN 4.2.1/4.2.4) and from the S3 bucket s3://human-pangenomics/publications/PANGENOME_2022/DeepVariant/samples/HG002 (DRAGEN 3.7.5).
Small variant calls:
- DRAGEN 4.2.1, 35-fold depth: HG002_35x.hard-filtered.vcf.gz (md5sum 388f58faa52a8811fe19b06533d2c3d5)
- DRAGEN 3.7.5, 30-fold depth: HG002.30x_novaseq_pcrfree.dragen.vcf.gz (md5sum cf2c302a99b96e1e4806cb644524357c)
Structural variant calls:
- DRAGEN 4.2.4: HG002_35x.sv.vcf.gz (md5sum 760b9c5c295fc82b045f83ed15e524a9)
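
Before benchmarking, the downloaded call sets can be checked against the md5sums listed above. A minimal sketch using only the Python standard library; the helper names `md5_of` and `verify_downloads` are hypothetical, and the files are assumed to sit in a single local directory under the filenames given above.

```python
import hashlib
from pathlib import Path

# Filenames and checksums copied from the list above.
EXPECTED_MD5 = {
    "HG002_35x.hard-filtered.vcf.gz": "388f58faa52a8811fe19b06533d2c3d5",
    "HG002.30x_novaseq_pcrfree.dragen.vcf.gz": "cf2c302a99b96e1e4806cb644524357c",
    "HG002_35x.sv.vcf.gz": "760b9c5c295fc82b045f83ed15e524a9",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected filename to True (match), False (mismatch), or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.exists() else None
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        label = {True: "OK", False: "MISMATCH", None: "not found"}[status]
        print(f"{name}: {label}")
```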

Illumina small variant benchmarking results (DRAGEN)

DRAGEN version	Depth	Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score
3.7.5	30-fold	SNP	PASS	3365127	3353531	11596	4042134	14752	672953	3869	0.996554	0.995621	0.166485	0.996088
4.2.1	35-fold	SNP	PASS	3365127	3357852	7275	3849974	1860	489362	985	0.997838	0.999447	0.127108	0.998642
3.7.5	30-fold	INDEL	PASS	525469	521874	3595	995996	3500	448346	1869	0.993158	0.993609	0.450148	0.993384
4.2.1	35-fold	INDEL	PASS	525469	524141	1328	980875	721	433435	474	0.997473	0.998683	0.441886	0.998077

Illumina structural variant benchmarking results (35-fold, DRAGEN 4.2.4)

TP_base	TP_comp	FP	FN	Recall	Precision	F1-score
9243	8454	247	13549	0.4055	0.9716	0.5722

ONT

HG002 variant call sets from 60-fold aligned sup-basecall reads were downloaded from the s3 bucket associated with this EPI2ME post.
Small variants were called by Clair3 1.0.0: hg002.wf_snp.vcf.gz (s3://ont-open-data/giab_2023.05/analysis/variant_calling/hg002_sup_60x/hg002.wf_snp.vcf.gz, md5sum fa2111cdeb4959e1ed1cfe402d128c39)
Structural variants were called by Sniffles2 2.0.7: hg002.wf_sv.vcf.gz (s3://ont-open-data/giab_2023.05/analysis/variant_calling/hg002_sup_60x/hg002.wf_sv.vcf.gz, md5sum cd185fb011345e702f7eb2ba7a19213b)

ONT small variant benchmarking results (sup-basecall, 60-fold, Clair3 1.0.0)

Type	Filter	TRUTH.TOTAL	TRUTH.TP	TRUTH.FN	QUERY.TOTAL	QUERY.FP	QUERY.UNK	FP.gt	METRIC.Recall	METRIC.Precision	METRIC.Frac_NA	METRIC.F1_Score
SNP	PASS	3365127	3357580	7547	4418637	4925	1054410	1166	0.997757	0.998536	0.238628	0.998147
INDEL	PASS	525469	453173	72296	776968	24911	283861	8400	0.862416	0.949482	0.365345	0.903857

ONT structural variant benchmarking results (sup-basecall, 60-fold, Sniffles2 2.0.7)

TP_base	TP_comp	FP	FN	Recall	Precision	F1-score
21087	18649	243	1743	0.9237	0.9871	0.9543

GIAB 4.2.1 small variant truthset

See the small variants readme.

GIAB T2TQ100 V1.0 structural variant truthset

See the SV readme.
For benchmarking steps using Truvari, please also see: Saunders, et al. bioRxiv, 2024.
