This is a collection of tools and resources for analysis and processing of linked-reads.
Name | Category | Description | Last commit |
---|---|---|---|
Aldy | structural variants, variant calling | Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes | |
Ambigram | structural variants | Detection of complex breakage-fusion-bridge genome rearrangements that supports linked-reads | |
Aquila | assembly, pipeline | Diploid personal genome assembly and comprehensive variant detection based on linked-reads | |
Aquila_stLFR | assembly, pipeline | Human haplotype-resolved assembly and variant detection for stLFR, hybrid assembly for linked-reads | |
AquilaDeepFilter | structural variants | Deep learning based filtering of genome-wide false positive large deletions | |
AquilaSV | structural variants, variant calling | Structural variant detection from region-based phased diploid assemblies for 10X and stLFR linked-reads | |
ARBitR | scaffolding | ARBitR is an overlap aware genome assembly scaffolder for linked sequencing reads. | |
Ariadne | assembly, metagenomics | de Bruijn graph-based program for barcoded read deconvolution | |
arcs | assembly | Scaffold genome sequence assemblies. | |
Athena | assembly, metagenomics | Read cloud assembler for metagenomes | |
BarCrawler | qc | QC package for 10X genomics barcoded reads. | |
bcctools | toolkit | Correcting barcodes in 10X linked-read sequencing data | |
bcmap | mapping, toolkit | Fast tool to map approximate genome locations for barcoded molecules | |
BLR | pipeline | An end-to-end Snakemake workflow for whole genome haplotyping and structural variant calling from FASTQs from multiple linked-read technologies. | |
bxtools | toolkit | Tools for analyzing mapped 10x data | |
cloudSPAdes | assembly | Assembly of synthetic long reads using de Bruijn graphs | |
ChromeQC | qc | Summarize sequencing library quality of 10x Genomics Chromium linked reads | |
Cue | structural variants | Deep learning framework for SV calling and genotyping | |
DrLink | structural variants | Detecting recombination breakpoints using Linked read sequencing | |
EMerAld (EMA) | mapping | Performs barcode-aware alignment of linked reads. Also does preprocessing of 10x Genomics data. | |
Gemtools | toolkit | Tools for working with linked-read sequencing (10X Genomics) data | |
grocsvs | structural variants | Genome-wide reconstruction of complex structural variants | |
HapCUT2 | phasing | Phasing of barcode linked reads | |
HapTree-X | phasing | Haplotype phaser for next-generation sequencing data | |
HARPY | pipeline | Process raw haplotagging data, from raw sequences to phased haplotypes, batteries included. | |
HAST | assembly | Haplotype-Resolved Assembly for Synthetic Long Reads Using A Trio-Binning Strategy | |
Lancet | variant calling | Microassembly based somatic variant caller for linked-read data | |
Lariat | mapping | Linked-Read Alignment Tool | |
LEVIATHAN | structural variants | Linked-reads based structural variant caller with barcode indexing | |
Link_STR | toolkit | Analysis scripts developed for genotyping STRs in linked-read data | |
LinkedSV | structural variants | Structural variant caller for linked-read sequencing data | |
Linker | toolkit | Tools for analyzing long and linked read sequencing | |
LongRanger | pipeline | Pipeline for alignment, variant calling, phasing, and structural variant calling | |
LRez | toolkit | Standalone tool and library allowing to work with barcoded linked-reads | |
LRTK | pipeline, toolkit | A unified and versatile toolkit for analyzing Linked-Read sequencing data | |
LRTK-SIM | simulation | A program to simulate linked reads sequencing from 10X Chromium System | |
LRSIM | simulation | A simulator for linked reads | |
MetaTrass | assembly | Taxonomic Reads Assembly For a Single Species to Metagenomics | |
Minerva | assembly | Sort Linked Read DNA Into Fragment Specific Clusters | |
mLinker (alt) | phasing, toolkit | Tools for Determining Haplotype Phase from Long/Linked Read Sequencing | |
MTG-Link | assembly | Novel gap-filling tool for draft genome assemblies, dedicated to linked read data | |
NAIBR (original), NAIBR (fork) | structural variants | Identifies novel adjacencies created by structural variation events such as deletions, duplications, inversions, and complex rearrangements | |
Novel-X | structural variants | Novel insertion detection with 10X reads | |
NPGREAT | assembly | A hybrid assembly method that utilizes Nanopore and Linked-Reads datasets for the assembly of the human subtelomere regions. | |
Pangaea | assembly, metagenomics | A metagenome assembler for the linked-reads with high-barcode specificity | |
proc10xG | toolkit | Collection of scripts for processing 10x genomics reads | |
Pseudoseq | simulation | Fake genomes, fake sequencing, real insights. | |
Pyslr | assembly | Construct a Physical Map from Linked Reads | |
QuickDeconvolution | assembly | Quick and scalable software to deconvolve read clouds from linked-reads experiments without a reference genome | |
Samovar | variant calling | Somatic (mosaic) SNV caller for 10X Genomics data using random forest classification and feature-based filters | |
samplot | structural variants | Plot structural variant signals from many BAMs and CRAMs | |
Scaff10x (v5), Scaff10x (≤v4.1) | assembly | Pipeline for scaffolding and breaking a genome assembly | |
SpecHLA | phasing | Reconstructs entire diploid sequences of HLA genes and infers LOH events | |
SpLitteR (alt) | assembly | Repeat resolution in assembly graph using synthetic long reads | |
stLFRdenovo | assembly | De novo assembly pipeline to deal with barcoded reads. It is based on Supernova, with a FASTQ parsing and sorting module customized for stLFR data. | |
stLFRsv | structural variants | Structural variation (SV) pipeline for stLFR co-barcode reads | |
SuperNova | assembly | 10x Genomics Linked-Read Diploid De Novo Assembler | |
SVenX | structural variants | Pipeline for SV detection using 10X genomics data | |
tenx_utils | toolkit | Utility functions for 10x data | |
Tigmint | assembly | Correct misassemblies using Linked Reads | |
TitanCNA_10x | pipeline,structural variants,cancer | Snakemake workflow for 10X Genomics WGS analysis using TitanCNA | |
Topsorter | structural variants, qc | Graphical assessment of structural variants | |
VISOR | simulation | VarIant SimulatOR for short, long and linked reads | |
Valor | structural variants | Variation discovery using long range information in linked-reads | |
WhatsHap | phasing,qc,toolkit | Read-based phasing of genomic variants, also called haplotype assembly. Implements several tools which work with linked reads | |
Wrath | structural variants,qc | Visualisation and identification of candidate structural variants (SVs) from linked read data | |
xTea | structural variants | Comprehensive TE insertion identification | |
ZoomX | structural variants | Single Molecule Based Rearrangement Analysis with Linked Read Sequencing | |
10x Genomics linked-read technology comes in two versions: the older GemCode (v1) and the more recent Chromium Genome (v2). Long DNA fragments are combined in droplets with barcode-containing gel beads to create GEMs (Gel Bead-In EMulsions). The fragments are amplified and barcoded using a combination of free random hexamers and barcode-linked random hexamers from the gel beads. Following this, the barcoded fragments are recovered and fragmented before ligation of the 3' sequencing adaptor. Libraries are sequenced using Illumina sequencing. The commercial version of the technology is currently discontinued.
TELL-seq is based on the technology from Chen et al. 2020 and is commercially available from the company Universal Sequencing. The method uses clonally barcoded beads with attached tagmentases to cut and barcode individual long DNA fragments in solution. A second tagmentation is also performed in solution to introduce a second adaptor. The library is sequenced using Illumina sequencing with a special setup to sequence the barcode as index 1.
stLFR (single-tube long fragment read) is based on the technology described in Wang et al. 2019 and is commercially available from MGI. The technology uses tagmentation to individually cut-and-hold long DNA fragments in solution. Each tagmentase-DNA complex is then hybridized to, and individually wrapped around, a barcoded bead through the adaptor introduced by the tagmentation. The barcode is then ligated to each subfragment before recovery and final library preparation. Sequencing is performed on the DNBSEQ platforms.
Droplet Barcode Sequencing (DBS) is based on the technology described in Redin et al. 2019. Long DNA fragments are subjected to tagmentation using Tn5-covered beads to cut, tag and wrap the fragment around the beads. The DNA-wrapped beads are then used in emulsion PCR along with barcoded oligos. Within each emulsion droplet the barcode and tagged fragments are amplified independently and then linked using overlap-extension. Barcode-linked fragments are recovered and indexed for Illumina sequencing.
Contiguity-preserving transposition sequencing (CPT-seq) is based on the technology described in Amini et al. 2014, with the follow-up CPTv2-seq from Zhang et al. 2017. These technologies were developed by Illumina but are not commercially available.
Haplotagging is based on the technology presented in Meier et al. 2021. The technology uses barcoded beads covered with Tn5 tagmentase to cut and barcode individual long DNA fragments in solution. The beads are coated in a combination of two barcodes, AB and CB, that become inserted at the 5' and 3' ends of each cut fragment. Barcodes are combinatorially generated, with about 85 million possible combinations in total.
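
As a rough illustration of where a figure like "about 85 million" can come from, the sketch below counts the barcodes produced by a combinatorial design. It assumes four barcode segments with 96 possible sequences each, which is our reading of the haplotagging design and is not stated in this README. The function names and the generic segment numbering are hypothetical and chosen for illustration only.

```python
from itertools import product

# Assumptions for illustration only (not stated in this README):
# a bead barcode is built from 4 independent segments,
# and each segment is drawn from a pool of 96 sequences.
N_SEGMENTS = 4
PER_SEGMENT = 96


def total_combinations(n_segments=N_SEGMENTS, per_segment=PER_SEGMENT):
    """Number of distinct barcodes when every segment is chosen independently."""
    return per_segment ** n_segments


def enumerate_combinations(n_segments, per_segment):
    """Explicitly list every combination as a tuple of 1-based segment indices.

    Only practical for small designs; used here to sanity-check the formula.
    """
    return list(product(range(1, per_segment + 1), repeat=n_segments))


if __name__ == "__main__":
    # Small design: 2 segments x 3 sequences, checked by brute force.
    small = enumerate_combinations(2, 3)
    assert len(small) == total_combinations(2, 3)

    # Full design under the stated assumptions.
    print(f"{total_combinations():,} possible barcodes")
```

Under these assumptions the count is 96 to the power of 4, which is consistent with the "about 85 million" quoted above. The number of barcodes actually observed in a library is typically far lower than this theoretical maximum.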
Is some linked-read related tool missing from this resource? Either create a new issue with information about the tool you want to add or submit a pull request with the addition directly.
Inspired by the collection in Awesome-10x-genomics.