dedup with `paired` option #581

YichaoOU · 2023-03-01T22:05:06Z

Hello,

Since the `paired` option is slow, I'm wondering what happens if I don't use it.

If reads A and B have the same UMI, but:

A and B have different tlen
A's R1 and B's R1 are different, but A's R2 and B's R2 map to exactly the same location.

Will A and B be collapsed into one read?

Thanks,
Yichao


IanSudbery · 2023-03-02T10:07:55Z

In single-end mode, R2s are always discarded, as single-end BAM files should not have R2s. If the R1s from A and B are different, they will not be collapsed.

However, in recent releases, paired mode should not be substantially slower than single-end mode for the majority of datasets. Or, at least, it is not the pairing per se that makes it slower; paired mode might be slower because it leads to more reads being considered independent of each other, and therefore gives a more complex network to deconvolve.
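To make the difference concrete, here is a toy sketch of how the grouping might differ between the two modes. It is not the UMI-tools implementation: the `Read` tuple, `group_key` and `collapse` are hypothetical names, and it assumes that single-end mode groups on R1 position plus UMI while paired mode additionally uses tlen. Strand, splicing and UMI error correction are ignored.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified read record (not a pysam object)."""
    name: str
    umi: str
    r1_pos: int  # mapping position of R1
    tlen: int    # template length of the pair


def group_key(read: Read, paired: bool) -> tuple:
    # Assumption: R2 is ignored in single-end mode, so only R1 matters.
    key = (read.r1_pos, read.umi)
    if paired:
        # Assumption: paired mode also separates reads by template length.
        key += (read.tlen,)
    return key


def collapse(reads: list[Read], paired: bool) -> list[list[str]]:
    """Return the names of reads that would fall into the same group."""
    groups = defaultdict(list)
    for read in reads:
        groups[group_key(read, paired)].append(read.name)
    return sorted(groups.values())


# Same UMI and same R1 position, different tlen.
same_r1 = [Read("A", "ACGT", 100, 250), Read("B", "ACGT", 100, 300)]
# Same UMI, different R1 positions (R2 positions play no role here).
diff_r1 = [Read("A", "ACGT", 100, 250), Read("B", "ACGT", 120, 230)]

print(collapse(same_r1, paired=False))  # A and B share a group
print(collapse(same_r1, paired=True))   # A and B are kept apart
print(collapse(diff_r1, paired=False))  # A and B are kept apart
```

Under these assumptions, paired mode produces more, smaller groups from the same input, which is in line with the point above about more reads being treated as independent.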

YichaoOU · 2023-03-02T14:55:20Z

Thank you! It seems this was fixed recently, in 1.1.3 and above? #539

I will upgrade.

Thanks,
Yichao

TomSmithCGAT closed this as completed Mar 2, 2023
