-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support unmapped reads in Spark. #3369
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me, but I think we should a few additional test cases.
return true; | ||
} | ||
if (traversalParameters.traverseUnmappedReads() && record.getReadUnmappedFlag() && record.getAlignmentStart() == SAMRecord.NO_ALIGNMENT_START) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment for future generations explaining this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
{ unmappedBam, null, Arrays.asList(publicTestDir + "org/broadinstitute/hellbender/engine/reads_data_source_test1_unmapped2.intervals"), Arrays.asList("a", "b", "c", "k", "u1", "u2", "u3", "u4", "u5") }, | ||
{ ceuSnippet, null, Arrays.asList("unmapped"), Arrays.asList("g", "h", "h", "i", "i") }, | ||
{ ceuSnippet, null, Arrays.asList("20:10000009-10000011", "unmapped"), Arrays.asList("a", "b", "c", "d", "e", "g", "h", "h", "i", "i") }, | ||
{ ceuSnippet, null, Arrays.asList("20:10000009-10000013", "unmapped"), Arrays.asList("a", "b", "c", "d", "e", "f", "f", "g", "h", "h", "i", "i") }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a test case here that is just an interval and not including unmapped, I'm sure it's covered by existing tests, but it would be good to have one here for completeness.
} | ||
|
||
@Test(dataProvider = "UnmappedReadInclusionTestData") | ||
public void testUnmappedReadInclusion( final File input, final String reference, final List<String> intervalStrings, final List<String> expectedReadNames ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to make a matching test in ReadSparkSourceUnitTest that directly exercises getParallelReads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
05b51f7
to
8e90c46
Compare
Thanks for the review @lbergelson. I've addressed all your comments. (Note the tests will still fail until there's a new Hadoop-BAM release.) |
d63466d
to
f2a5981
Compare
@lbergelson, @droazen are you OK with adding a SNAPSHOT dependency for Hadoop-BAM so we can commit this (and also the GVCF PR)? |
Codecov Report
@@ Coverage Diff @@
## master #3369 +/- ##
==============================================
+ Coverage 79.923% 79.94% +0.017%
- Complexity 17884 17897 +13
==============================================
Files 1198 1198
Lines 64966 64980 +14
Branches 10114 10120 +6
==============================================
+ Hits 51923 51945 +22
+ Misses 9010 9002 -8
Partials 4033 4033
|
01816b6
to
bffea93
Compare
bffea93
to
de1a9a1
Compare
Not ready for merge as this relies on HadoopGenomics/Hadoop-BAM#136, hence a new Hadoop-BAM release.
Addresses #2572 and #2571