Getting splitters BAM from long reads data? #42

cmdcolin · 2019-11-14T00:02:21Z

Hi there,
I was curious whether this tool can extract splitters from a long-read BAM file. I am currently running the steps to try it out, but I was wondering whether it's something I could recommend to other people. (I have a visualization tool that would work best with a BAM file containing only the splitters, with everything else filtered out.)

GregoryFaust · 2019-11-14T21:28:31Z

Yes, samblaster will work to output splitters from single-end reads if you use the --ignoreUnmated option. You may also want to read #37.
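For readers who want to script this, here is a minimal sketch that assembles such a pipeline as a shell string. It assumes samtools is used to stream the BAM as SAM text with its header, that the input is grouped by read ID, and that the main output can be discarded with `-o /dev/null`. The function name and file names are made up for illustration, and long reads may need further options, so check `samblaster --help` before relying on it.

```python
import shlex


def splitter_pipeline(bam_path, splitter_sam):
    """Return a shell pipeline that writes split reads to splitter_sam.

    Hypothetical helper: the file names are placeholders and the main
    samblaster output is assumed to be unwanted, so it goes to /dev/null.
    """
    # samblaster reads SAM text, so stream the BAM with its header (-h).
    view = ["samtools", "view", "-h", bam_path]
    # --ignoreUnmated lets samblaster accept single-end (unpaired) reads;
    # -s names the splitter file and -o the main output.
    blast = ["samblaster", "--ignoreUnmated", "-s", splitter_sam, "-o", "/dev/null"]
    return " | ".join(shlex.join(cmd) for cmd in (view, blast))


if __name__ == "__main__":
    print(splitter_pipeline("reads.bam", "reads.splitters.sam"))
```

The resulting string can be pasted into a shell or passed to `subprocess.run(..., shell=True)`. The splitter file is SAM, so it still needs converting to BAM for tools that expect BAM.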

cmdcolin · 2019-11-14T23:41:55Z

Super, thank you! I figured it'd do the trick.

cmdcolin · 2019-11-15T00:33:14Z

If you get a chance, maybe add a note in the README 👍 I'll close for now.

GregoryFaust · 2020-03-17T02:09:24Z

Release 0.1.25 includes sample scenarios in both the README.md and in the program help text.

cmdcolin · 2020-03-17T02:32:30Z

This is a somewhat weird postmortem, but after asking this question I found that the BAM parser I had written wasn't parsing the SA tag, so I was operating on the assumption that there were split reads lacking an SA tag. Since my parser was at fault, it seems that split reads will generally carry an SA tag. Would it be fair to say that I could rely on the SA tag in most cases, and then filter splitters from a coordinate-sorted BAM by just grepping for the SA tag?
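A minimal sketch of the "grep for the SA tag" idea, operating on SAM text lines (for example the output of `samtools view -h`). The function names are hypothetical. It keeps header lines plus any record whose optional fields include an `SA:Z:` tag, and applies no other filtering, so it returns every chimeric alignment that carries the tag.

```python
def has_sa_tag(sam_line):
    """True if a SAM alignment line carries an SA:Z: optional field."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional TAG:TYPE:VALUE fields follow.
    return any(field.startswith("SA:Z:") for field in fields[11:])


def filter_sa_lines(lines):
    """Yield header lines and alignment lines that have an SA tag."""
    for line in lines:
        if line.startswith("@") or has_sa_tag(line):
            yield line


if __name__ == "__main__":
    import sys

    # Example: samtools view -h in.bam | python this_script.py > sa_only.sam
    sys.stdout.writelines(filter_sa_lines(sys.stdin))
```

Checking only columns 12 onward avoids false matches on text in the read name or sequence columns, which a plain `grep SA:Z:` over the whole line would not rule out.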

GregoryFaust · 2020-03-17T18:54:18Z

In our experience, you rarely want to look at all chimeric alignments. That is why samblaster has no fewer than four parameters that control which split reads are output in the splitter file: --maxSplitCount, --maxUnmappedBases, --minIndelSize, and --minNonOverlap. These parameters and their default values were carefully selected to report the split reads most likely to be relevant for detecting structural variants, without a lot of false positives or false negatives. We developed these ideas in Ira Hall's Lab at UVA (now at Wash. U. St. Louis) while doing research that led to several tools/pipelines for SV detection such as SpeedSeq, Lumpy, Hydra, YAHA, SVsim and others.
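If the defaults need overriding, a small sketch like the following can build the extra command-line arguments. The helper name and its keyword arguments are hypothetical; only the four flag names come from the comment above. No default values are assumed here: any threshold left as `None` is simply omitted so that samblaster applies its own default.

```python
def splitter_threshold_args(max_split_count=None, max_unmapped_bases=None,
                            min_indel_size=None, min_non_overlap=None):
    """Build samblaster arguments for the splitter-file thresholds.

    Hypothetical helper: thresholds left as None are omitted, so
    samblaster falls back to its built-in defaults for them.
    """
    options = [
        ("--maxSplitCount", max_split_count),
        ("--maxUnmappedBases", max_unmapped_bases),
        ("--minIndelSize", min_indel_size),
        ("--minNonOverlap", min_non_overlap),
    ]
    args = []
    for flag, value in options:
        if value is not None:
            # The thresholds are integer counts, passed as "--flag value".
            args += [flag, str(int(value))]
    return args


if __name__ == "__main__":
    # Illustrative values only, not recommendations.
    print(splitter_threshold_args(max_split_count=3, min_indel_size=100))
```

The returned list can be appended to an existing samblaster argument list before the command is run.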

cmdcolin · 2020-03-18T20:05:55Z

Thank you for the detailed response. This is quite helpful. My angle is that I am developing tools to help visualize split/paired reads for structural variation, and I will definitely look into these tools as sources of the data (I have already used Lumpy).

cmdcolin closed this as completed Nov 15, 2019

cmdcolin mentioned this issue Nov 15, 2019

Segmentation fault on getting split reads from long read file? #43

Closed
