Support for running everything in PE mode without merging #64

apeltzer · 2018-10-17T07:29:57Z

We should be able to run datasets without merging reads, i.e. only clipping adapters and so on.

alexhbnr · 2019-02-14T11:40:20Z

Keep in mind that if you want to support running datasets without merging, i.e. keeping PE reads as separate mates, you would also need to re-implement the extraction of the (un-)mapped reads.

Currently, the BWA output is filtered on the SAM flag "is_unmapped". However, BWA regularly aligns one mate of a pair but fails to align the other. Simply filtering on whether a read is unmapped would rip the two mates apart in this case. To avoid that, you would need a different filter to extract mapped reads, e.g. requiring paired reads to be properly paired and single reads to be not paired and not unmapped. This kind of edge case doesn't happen too often, but still regularly enough (approx. 0.01% of the reads, at least in the recent experiment that made me aware of this issue).
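To make the difference between the two filters concrete, here is a minimal sketch in plain Python that works directly on the numeric SAM FLAG field. The bit values come from the SAM specification; the function names `naive_is_mapped` and `keep_as_mapped` are made up for illustration and are not part of the pipeline or of any existing tool.

```python
# SAM FLAG bits as defined in the SAM specification.
FLAG_PAIRED = 0x1         # read is part of a pair
FLAG_PROPER_PAIR = 0x2    # both mates aligned as a proper pair
FLAG_UNMAPPED = 0x4       # this read is unmapped
FLAG_MATE_UNMAPPED = 0x8  # the mate of this read is unmapped


def naive_is_mapped(flag: int) -> bool:
    """Filter on the unmapped bit only (hypothetical name)."""
    return not (flag & FLAG_UNMAPPED)


def keep_as_mapped(flag: int) -> bool:
    """Pair-aware filter (hypothetical name).

    Paired reads are kept only if they are properly paired, so both
    mates always get the same decision. Single reads are kept if they
    are not unmapped.
    """
    if flag & FLAG_PAIRED:
        return bool(flag & FLAG_PROPER_PAIR)
    return not (flag & FLAG_UNMAPPED)


# A pair where BWA aligned mate 1 but not mate 2:
mate1 = 0x1 | 0x8 | 0x40  # paired, mate unmapped, first in pair
mate2 = 0x1 | 0x4 | 0x80  # paired, unmapped, second in pair

# The naive filter puts the two mates into different bins.
print(naive_is_mapped(mate1), naive_is_mapped(mate2))
# The pair-aware filter treats both mates the same way.
print(keep_as_mapped(mate1), keep_as_mapped(mate2))
```

With the pair-aware version, both mates of a half-aligned pair end up among the unmapped reads, so the FASTQ files for the two mates stay in sync.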

I personally do the more complex filtering with the powerful tool bam-mangle (https://bitbucket.org/ustenzel/biohazard-tools), which lets you filter reads DSL-style by combining boolean expressions. However, it is written in Haskell and might be a pain to get into a bioconda recipe.

apeltzer · 2019-02-14T13:42:27Z

Thanks @alexhbnr for bringing this up! I think we might not even need to do this, as the use case is quite rare in general, or isn't it?

jfy133 · 2019-02-14T14:06:35Z

The case that @alexhbnr is referring to is when someone is running modern data, i.e. long molecules but short-read sequencing data. This can happen quite often when running modern reference data at the same time (e.g. with UDG data).

jfy133 · 2019-02-14T14:07:08Z

i.e. it's not a priority but a minor feature request for downstream.

apeltzer · 2019-03-05T12:53:30Z

Fixed in #159

apeltzer modified the milestones: V2.1 "Ulm", V2.0 "Kaufbeuren" Oct 17, 2018

jfy133 mentioned this issue Mar 4, 2019

Add optional merging and trimming #142

Merged

8 tasks

apeltzer mentioned this issue Mar 4, 2019

Flexible AdapterRemoval #159

Merged

8 tasks

apeltzer closed this as completed Mar 5, 2019
