dedup with paired option #581

Closed
YichaoOU opened this issue Mar 1, 2023 · 2 comments

@YichaoOU

YichaoOU commented Mar 1, 2023

Hello,

Since the paired option is slow, I'm wondering what happens if I don't use it.

Suppose reads A and B have the same UMI, but either:

  1. A and B have different tlen, or

  2. A's R1 and B's R1 are different, but A's R2 and B's R2 map to exactly the same location.

Will A and B be collapsed into one read?

Thanks,
Yichao

@IanSudbery
Member

In single-end mode, R2s are always discarded, as single-end BAM files should not have R2s. If the R1s from A and B are different, then they will not be collapsed.

However, in recent releases, paired mode should not be substantially slower than single-end mode for the majority of datasets. Or, at least, it is not the pairing per se that makes it slower; paired mode might be slower because it leads to more reads being considered independent of each other, and therefore gives a more complex network to deconvolve.
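
To make the two cases from the question concrete, here is a toy sketch of how reads might be grouped before UMIs are compared. It is not the UMI-tools implementation: the `Pair` record, `group_key` and `dedup_groups` are hypothetical names, strand is ignored, UMIs are matched exactly rather than through the network, and it assumes that paired mode additionally uses the template length when deciding whether two reads share a position.

```python
from collections import defaultdict
from typing import NamedTuple


class Pair(NamedTuple):
    """Hypothetical, simplified record for one read pair."""
    name: str
    umi: str
    contig: str
    r1_pos: int  # mapping position of R1
    r2_pos: int  # mapping position of R2 (never used for grouping here)
    tlen: int    # template length


def group_key(pair, paired):
    # Single-end mode: only R1's position and the UMI matter.
    key = (pair.contig, pair.r1_pos, pair.umi)
    # Assumption: paired mode also distinguishes reads by template length.
    if paired:
        key += (pair.tlen,)
    return key


def dedup_groups(pairs, paired=False):
    """Return the names of the pairs that would fall into each group."""
    groups = defaultdict(list)
    for pair in pairs:
        groups[group_key(pair, paired)].append(pair.name)
    return sorted(groups.values())


# Case 1: same R1 position, different tlen.
case1 = [
    Pair("A", "ACGT", "chr1", 100, 250, 200),
    Pair("B", "ACGT", "chr1", 100, 300, 250),
]

# Case 2: different R1 positions, identical R2 position.
case2 = [
    Pair("A", "ACGT", "chr1", 100, 300, 250),
    Pair("B", "ACGT", "chr1", 120, 300, 230),
]
```

Under these assumptions, case 1 ends up as a single group in single-end mode and as two groups in paired mode, while case 2 stays as two groups in both modes because R2's position never enters the key.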

@YichaoOU
Author

YichaoOU commented Mar 2, 2023

Thank you! So this seems to be an issue that was fixed in 1.1.3 and above (#539)?

I will upgrade.

Thanks,
Yichao
